#pdf

2026-01-22

Many people seem weirdly suspicious about Google's #Brotli compression while being weirdly chill about Facebook's #ZSTD, to the point of replying to posts about Brotli support being added somewhere with comments like, "This is a conspiracy by Google, they clearly should've chosen ZSTD instead". What's up with that? Is 𝘍𝘢𝘤𝘦𝘣𝘰𝘰𝘬 really so much less scary than Google?

(This is a subtweet about a certain HN post about Brotli compression coming to #PDF)

#compression #zstandard #computers

2026-01-22

DocuSeal 2.3.0: Open-source alternative to DocuSign with AI features

DocuSeal, an open-source tool for electronic signatures, receives AI-powered form recognition and improved security in Docker environments.

heise.de/en/news/DocuSeal-2-3-

#Containerisierung #Docker #DSGVO #GitHub #IT #OpenSource #PDF #Security #news

Mme de Faune ☳Schwester_Philomena
2026-01-22

Does anyone here own "Das Graveyard-Buch" by Neil Gaiman as an EPUB or PDF and can tell me how to get it to display properly? I bought it about three years ago and can't use it. Throughout the book, the text doesn't sit where it belongs. It's a mess that I can't get fixed. I'm happy to send the files if someone can then help me.

A page from the graphic novel "Das Graveyard-Buch" by Neil Gaiman.
It shows six panels set in a house at night, a room with a cot, and a man wearing gloves who is clearly up to no good.
The text belonging to the images is completely displaced and not inside the speech bubbles.
2026-01-22

DocuSeal 2.3.0: Open-source alternative to DocuSign with AI features

DocuSeal, an open-source tool for electronic signatures, gains AI-assisted form recognition and improved security in Docker environments.

heise.de/news/DocuSeal-2-3-0-O

#Containerisierung #Docker #DSGVO #GitHub #IT #OpenSource #PDF #Security #news

Terence Eden’s Blog blog@shkspr.mobi
2026-01-22

Removing "/Subtype /Watermark" images from a PDF using Linux

shkspr.mobi/blog/2026/01/remov

Problem: I've received a PDF which has a large "watermark" obscuring every page.

Investigating: Opening the PDF in LibreOffice Draw allowed me to see that the watermark was a separate image floating above the others.

Manual Solution: Hit page down, select image, delete, repeat 500 times. BORING!

Further Investigating: Using pdftk, it's possible to decompress a PDF. That makes it easier to look through manually.

pdftk input.pdf output output.pdf uncompress

Hey presto! A PDF you can open in a text editor! Deep joy!

Searching: On a hunch, I searched for "watermark" and found several lines like this:

<<
/Length 548
>>
stream
/Figure <</MCID 0 >>BDC q 0 0 477 733.464 re W n q /GS0 gs 479.2799893 0 0 735.5999836 -1.0800002 -1.0559941 cm /Im0 Do Q EMC 
/Figure <</MCID 1 >>BDC Q q 28.333 300.661 420.334 126.141 re W n q /GS0 gs 420.3339603 0 0 126.1418879 28.3330078 300.6610601 cm /Im1 Do Q EMC
/Figure <</MCID 2 >>BDC Q q 16.106 0 444.787 215.464 re W n q /GS0 gs 444.7874274 0 0 216.5921386 16.1062775 -1.1281493 cm /Im2 Do Q EMC
/Artifact <</Subtype /Watermark /Type /Pagination >>BDC Q q 0.7361145 0 0 0.7361145 113.3616638 240.8575745 cm /GS1 gs /Fm0 Do Q EMC
endstream
endobj

Those are Marked Content Blocks. In theory you can just chop out the line with /Subtype /Watermark, but each stream has a /Length entry, so you'd also need to adjust that to account for what you've changed. Otherwise the layout goes all screwy.
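To make the /Length bookkeeping concrete, here is a minimal sketch in plain Python. It assumes one uncompressed stream object held in memory as bytes, laid out like the excerpt above with `stream` followed by a bare newline. The helper name `drop_watermark_lines` is hypothetical and not part of pdftk or PyMuPDF. It only covers the /Length adjustment: hand-editing a file also shifts the byte offsets recorded in the cross-reference table, which this sketch does not touch.

```python
import re

WATERMARK = b"/Subtype /Watermark"


def drop_watermark_lines(obj: bytes) -> bytes:
    """Remove watermark lines from one stream object and shrink /Length to match."""
    head, marker, rest = obj.partition(b"stream\n")
    body, end, tail = rest.partition(b"endstream")
    if not marker or not end:
        return obj  # not a stream object in the assumed layout

    # Keep every content line that does not carry the watermark tag.
    kept = b"".join(
        line for line in body.splitlines(keepends=True) if WATERMARK not in line
    )
    removed = len(body) - len(kept)

    # Reduce the declared /Length by exactly the number of bytes removed.
    head = re.sub(
        rb"/Length\s+(\d+)",
        lambda m: b"/Length %d" % (int(m.group(1)) - removed),
        head,
        count=1,
    )
    return head + marker + kept + end + tail


if __name__ == "__main__":
    figure = b"/Figure <</MCID 0 >>BDC q /Im0 Do Q EMC\n"
    mark = b"/Artifact <</Subtype /Watermark >>BDC q /Fm0 Do Q EMC\n"
    demo = (
        b"<<\n/Length %d\n>>\nstream\n" % (len(figure) + len(mark))
        + figure + mark + b"endstream\nendobj\n"
    )
    print(drop_watermark_lines(demo).decode("ascii"))
```

Subtracting the removed byte count from the old value avoids having to re-derive how the end-of-line before `endstream` is counted.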

That led me to PyMuPDF which claimed to solve the problem. But running that code only removed some of the watermarks. It got stuck in an infinite loop on certain pages.

So, now that I had more detailed knowledge, I managed to get an LLM to construct something which mostly seems to work.

Does it work with every PDF? I don't know. Does it contain subtle implementation bugs? Probably. Is there an easier way to do this? Not that I can find.

import re
import pymupdf

# Open the PDF
doc = pymupdf.open("output.pdf")

# Regex of the watermarks
pattern = re.compile(
    rb"/Artifact\s*<<[^>]*?/Subtype\s*/Watermark[^>]*?>>BDC.*?EMC",
    re.DOTALL
)

# Loop through the PDF's pages
for page_num, page in enumerate(doc, start=1):
    print(f"Processing page {page_num}")
    xrefs = page.get_contents()
    for xref in xrefs:
        cont = doc.xref_stream(xref)
        new_cont, n = pattern.subn(b"", cont)
        if n > 0:
            print(f"  Removed {n} watermark block(s)")
            doc.update_stream(xref, new_cont)

doc.save("no-watermarks.pdf")
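As a quick sanity check that needs no PDF file, the same regex can be run against the content stream excerpt quoted earlier in the post. This is only a sketch: the `q_balance` helper is hypothetical and simply counts the `q` (save graphics state) and `Q` (restore graphics state) operators by splitting on whitespace. On this excerpt the counts are equal before the substitution and differ by one afterwards, because the removed block holds one `q` and two `Q`. That may be one candidate for the "subtle implementation bugs" mentioned above, though whether it matters for a given viewer is not something this sketch can tell you.

```python
import re

# Same pattern as in the script above.
pattern = re.compile(
    rb"/Artifact\s*<<[^>]*?/Subtype\s*/Watermark[^>]*?>>BDC.*?EMC",
    re.DOTALL,
)

# The four marked-content lines quoted in the post.
sample = (
    b"/Figure <</MCID 0 >>BDC q 0 0 477 733.464 re W n q /GS0 gs "
    b"479.2799893 0 0 735.5999836 -1.0800002 -1.0559941 cm /Im0 Do Q EMC \n"
    b"/Figure <</MCID 1 >>BDC Q q 28.333 300.661 420.334 126.141 re W n q /GS0 gs "
    b"420.3339603 0 0 126.1418879 28.3330078 300.6610601 cm /Im1 Do Q EMC\n"
    b"/Figure <</MCID 2 >>BDC Q q 16.106 0 444.787 215.464 re W n q /GS0 gs "
    b"444.7874274 0 0 216.5921386 16.1062775 -1.1281493 cm /Im2 Do Q EMC\n"
    b"/Artifact <</Subtype /Watermark /Type /Pagination >>BDC Q q 0.7361145 0 0 "
    b"0.7361145 113.3616638 240.8575745 cm /GS1 gs /Fm0 Do Q EMC\n"
)


def q_balance(stream: bytes) -> tuple[int, int]:
    """Return (number of q operators, number of Q operators) in a content stream."""
    tokens = stream.split()
    return tokens.count(b"q"), tokens.count(b"Q")


cleaned, count = pattern.subn(b"", sample)

print("blocks removed:", count)
print("q/Q before:", q_balance(sample))
print("q/Q after: ", q_balance(cleaned))
```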

One of the (many) problems with Vibe Coding is that trying to get an LLM to spit out something useful depends massively on how well you know the subject area. I'm proud to say I know vanishingly little about the baroque PDF specification - which meant that most of my attempts to use various "AI" tools consisted of me saying "No, that doesn't work" and the accurs'd machine saying back "Golly-gee! You're right! Let me fix that!" and then breaking something else.

I'm not sure this is the future we wanted, but it looks like the future we've got.

#LLM #pdf #python
Terence Eden Edent
2026-01-22

🆕 blog! “Removing "/Subtype /Watermark" images from a PDF using Linux”

Problem: I've received a PDF which has a large "watermark" obscuring every page.

Investigating: Opening the PDF in LibreOffice Draw allowed me to see that the watermark was a separate image floating above the others.

Manual Solution: Hit page down, select image, delete, repeat 500 times. …

👀 Read more: shkspr.mobi/blog/2026/01/remov

Soft & Apps soft_apps
2026-01-22

Looking for a full-featured PDF editor? 🔥

StirlingPDF is open source, has more than 50 features, and can be used online or through its desktop version.

💻 Merge, convert, crop, protect, and more.

➡️ softandapps.info/2026/01/22/st

Mac4Ever mac4ever
2026-01-22

Adobe Acrobat and Express: AI now turns your PDFs into podcasts! (and more besides)
mac4ever.com/194263

2026-01-22

We've just released DEVONthink To Go 4.0.4. This maintenance update adds a new default template for annotations and improves handling of placeholders, WikiLinks, and transclusions. Converting email to PDF uses proper styling. The update also fixes issues related to Spotlight, annotations, versions and downloading documents. Adding tags by using a local AI works properly now, and disabling Spotlight properly clears the Spotlight index. #devonthinktogo #devonthink #ai #spotlight #markdown #pdf

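A summarization service of this quality is free?

SecondB.ai is a generative-AI-based service that provides context-focused summaries of YouTube videos, websites, and PDF documents. The developer built the service to solve a frustration they had experienced firsthand, and it is currently approaching 1,900 users. Key features include summarizing YouTube videos and PDF documents, managing and saving summaries, and Chrome extension support.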

news.hada.io/topic?id=26041

#ai #summarization #youtube #pdf #context

Al-Watan Newspaper (Qatar) alwatanqatar
2026-01-22

Read in today's issue:
His Highness graced with his patronage the graduation ceremony of the eighth class of officer candidates
His Highness honors the top graduates of the Police College
al-watan.com
To download the full issue in PDF format, today, Thursday 22 January 2026

al-watan.com/PDF?date=22-01-20

🇶🇦

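Media Japan media@wakoka.com
2026-01-22

wacoca.com/media/557449/ Good news for people who scan their own paper books ("jisui")? The ultimate tool for making scanned data thoroughly readable - Yajiuma no Mori - Mado no Mori #趣味 #book #books #DN_SuperBook_PDF_Converter #pdf #Windows #オフィス・ドキュメント #スキャン #ドキュメント #ライブ #学習・勉強 #書籍 #自炊

Good news for people who scan their own paper books ("jisui")? The ultimate tool for making scanned data thoroughly readable - Yajiuma no Mori - Mado no Mori
Media Japan media@wakoka.com
2026-01-22

wacoca.com/media/557270/ Good news for people who scan their own paper books ("jisui")? The ultimate tool for making scanned data thoroughly readable - Yajiuma no Mori - Mado no Mori #趣味 #book #books #DN_SuperBook_PDF_Converter #pdf #Windows #オフィス・ドキュメント #スキャン #ドキュメント #ライブ #学習・勉強 #書籍 #自炊

Good news for people who scan their own paper books ("jisui")? The ultimate tool for making scanned data thoroughly readable - Yajiuma no Mori - Mado no Mori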
Arlin Schaffelfexd@arlin.org
2026-01-21

Adobe Acrobat uses AI to turn your PDFs into podcasts

image via theverge.com

Adobe has added new generative AI features to Acrobat that aim to help you quickly edit PDFs and summarize them in audio and visual formats. These updates include chat-based editing and the ability to generate personalized podcasts and presentations based on your docs, which are now available in Acrobat Studio — the AI-infused document workspace app that’s distinct from Adobe’s basic PDF reader.

https://www.theverge.com/news/864811/adobe-acrobat-studio-generate-podcast-presentation
#acrobat #adobe #ai #pdf #podcast
2026-01-21

Wow, I had been dreaming of a great online tool like this for signing, ticking boxes, and so on in #pdf files
framapdf.org
Thanks to the @Framasoft team

Aidoo aidoo
2026-01-21

Adobe launches Acrobat Studio: AI that converts PDFs into presentations and podcasts, and lets you edit through natural-language chat. aidoo.news/noticia/rgMyjW
