#DigitalArchiving

2026-02-14

This is no accident. You don't get random spots on your screenshots by chance. The company is deliberately bloating people's storage with its own data.

As for the article itself: nostalgebraist.tumblr.com/post

The Firefox extension I used to archive that web page: addons.mozilla.org/en-US/firef

#digitalarchiving

2026-02-14

Well, maybe my human eyes cannot see something that's there.

Captain #GIMP to the rescue!
A little thresholding... and there it is. The sand in the gears of #PNG. Not-quite-random noise all over the background (try different thresholds to see the rest).

The regularity makes me think it's some actual data inside. A watermark from the chatbot company.
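
A rough sketch of why such a pattern matters for file size, using only the Python standard library. Everything here is an assumption for illustration: the image size, the grey level, and the use of random one-level noise (the real pattern is not quite random and its origin is unknown). PNG compresses pixel data with DEFLATE, so zlib serves as a stand-in for the PNG encoder.

```python
import random
import zlib

WIDTH, HEIGHT = 256, 256
BACKGROUND = 240  # hypothetical grey level of a flat chat-bubble background

random.seed(0)  # fixed seed so the sketch is repeatable

# Clean image: every pixel has exactly the background value.
clean = bytes([BACKGROUND] * (WIDTH * HEIGHT))

# Noisy image: each pixel is lowered by 0 or 1 grey level, invisible to the eye.
noisy = bytes(BACKGROUND - random.randint(0, 1) for _ in range(WIDTH * HEIGHT))


def threshold(pixels: bytes, level: int) -> bytes:
    """Map pixels at or above `level` to white (255) and the rest to black (0)."""
    return bytes(255 if p >= level else 0 for p in pixels)


# Thresholding exactly at the background value exposes the hidden pixels.
mask = threshold(noisy, BACKGROUND)
hidden_pixels = mask.count(0)

# DEFLATE shrinks a flat background to almost nothing, but not a noisy one.
clean_size = len(zlib.compress(clean, 9))
noisy_size = len(zlib.compress(noisy, 9))

print(f"pixels revealed by threshold: {hidden_pixels}")
print(f"compressed clean: {clean_size} bytes, noisy: {noisy_size} bytes")
```

The flat background collapses to a tiny stream, while the barely visible noise keeps most of its size, which would be consistent with a text-only screenshot ballooning to hundreds of KiB.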

What could be inside? Well, I don't care enough to spend my time on this :P

#graphics #LLM #chatbot #digitalarchiving

A picture of text, with a background of a semi-regular pattern.
The text is of two chatbot messages.
2026-02-14

How much does a screenshot of some text weigh?

5 KiB? 50 KiB?

But not this one.

I wanted to store an article for future reference and it was a crazy 500 KiB!
And there's nothing in it!

What's the answer to this riddle? Follow this thread to find out.

#compression #graphics #archiving #digitalarchiving

A screenshot of two chatbot replies. Just text on colored backgrounds.
2026-02-08

Internet Archive Blog: Recording Now Available from “Protect Our Future Memory” Webinar. “Held on January 27, the event brought together legal experts, library leaders, and advocates to talk about Our Future Memory and the global coalition working to secure the protections that memory institutions need in our increasingly digital and networked world.”

https://rbfirehose.com/2026/02/08/internet-archive-blog-recording-now-available-from-protect-our-future-memory-webinar/
2026-02-01

Evolving Web: Designing a digital archive in partnership with an Indigenous community. “At Evolving Web, we recently collaborated with the University of Denver on the Our Stories, Our Medicine Archive (OSOMA), a community-owned digital archive that centres traditional Indigenous knowledge related to health, wellness, culture, and identity. Built in close collaboration with community partners, […]

https://rbfirehose.com/2026/02/01/evolving-web-designing-a-digital-archive-in-partnership-with-an-indigenous-community/
DeftPDF
2026-01-02

What is PDF/A?

Ensure your documents are archival-ready, compliant, and future-proof with PDF/A—the ISO-standard format designed to preserve files exactly as they are, even years from now. 🙌

Convert to PDF/A using DeftPDF’s free online tools for reliable long-term storage!

www.deftpdf.com

2025-12-03

Days since I last regretted not having set up #ArchiveBox to archive specific Mastodon posts for later: 0

#Archiving #DigitalArchiving #Mastodon

2025-11-08

Final workshop for #PARBICA21 - From Start to Finish: A Workflow for Digital Archiving, presented by Jodie Kell, Steven Gagau and Julia Miller, PARADISEC @paradisec_aus

#DigitalArchiving #DigiPres

Miguel Afonso Caetano remixtures@tldr.nettime.org
2025-11-07

"The FBI is attempting to unmask the owner behind archive.today, a popular archiving site that is also regularly used to bypass paywalls on the internet and to avoid sending traffic to the original publishers of web content, according to a subpoena posted by the website. The FBI subpoena says it is part of a criminal investigation, though it does not provide any details about what alleged crime is being investigated. Archive.today is also popularly known by several of its mirrors, including archive.is and archive.ph.

The subpoena, which was posted on X by archive.today on October 30, was sent by the FBI to Tucows, a popular Canadian domain registrar. It demands that Tucows give the FBI the “customer or subscriber name, address of service, and billing address” and other information about the “customer behind archive.today.”"

404media.co/fbi-tries-to-unmas

#DigitalArchiving #Archiving #Paywalls #FBI

2025-10-29

Looking for a self-hosted tool to archive and index web pages and social media accounts for an OSINT project. I've tried ArchiveBox but would like to hear about other options. #Côngnghệ #Mởnguồn #OSINT #SelfHosted #Côngthứclưu #DigitalArchiving #Giữlưuýtıệ #ToolRecomm #Việnđiệnnhân

reddit.com/r/selfhosted/commen

Revisiting bsdiff as a tool for digital preservation


by @beet_keeper

I introduced bsdiff in a blog in 2014. bsdiff compares the differences between two files, e.g. broken_file_a and corrected_file_b and creates a patch that can be applied to broken_file_a to generate a byte-for-byte match for corrected_file_b.

On the face of it, in an archive, we probably only care about corrected_file_b and so why would we care about a technology that patches a broken file?

In all of the use-cases we can imagine, the primary reasons are cost savings and removing redundancy in file storage or transmission of digital information. In one very special case we can record the difference between broken_file_a and corrected_file_b and give users a totally objective method of recreating corrected_file_b from broken_file_a, providing 100% verifiable proof of the migration pathway taken between the two files.
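
The patch-and-verify workflow can be sketched in a few lines of Python. This is not the bsdiff algorithm or its patch format: it swaps in the standard library's difflib.SequenceMatcher as a naive byte-level differ, and the helper names (make_patch, apply_patch, sha256) and the sample byte strings are hypothetical. It only illustrates the idea of storing a difference and proving that applying it reproduces the target byte-for-byte.

```python
import hashlib
from difflib import SequenceMatcher


def make_patch(old: bytes, new: bytes) -> list:
    """Build a list of copy/insert operations that turn `old` into `new`."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Reuse bytes already present in the old file.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Store only the bytes that the old file lacks.
            ops.append(("insert", new[j1:j2]))
        # "delete" needs no entry: those old bytes are simply not copied.
    return ops


def apply_patch(old: bytes, ops: list) -> bytes:
    """Rebuild the new file from the old file plus the recorded operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def sha256(data: bytes) -> str:
    """Checksum used to show the rebuilt file matches the target."""
    return hashlib.sha256(data).hexdigest()


# Made-up stand-ins for the two files discussed above.
broken_file_a = b"RIFF\x00\x00\x00\x00WAVEfmt garbage-here data..."
corrected_file_b = b"RIFF\x24\x08\x00\x00WAVEfmt data..."

patch = make_patch(broken_file_a, corrected_file_b)
rebuilt = apply_patch(broken_file_a, patch)

# Matching checksums are the verifiable proof of the migration pathway.
assert sha256(rebuilt) == sha256(corrected_file_b)
```

Keeping broken_file_a, the patch, and the checksum of corrected_file_b lets anyone repeat the migration and confirm the result independently.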

#ac3 #archives #audio #audiovisual #audit #authenticity #av #bash #bsdiff #checksums #code4lib #corruption #corruptionIndex #digipres #digitalArchiving #digitalForensics #digitalLiteracy #digitalPreservation #digitalStorage #diplomatics #fileFormats #glitch #glitchAudio #glitchart #integrity #preservationAnalysis #preservationMetadata #provenance #sensitivityIndex #storage

Image shows two layered waveforms, one a corrupt waveform and the other a good original. The corrupt form is in red and the uncorrupt one is green.
Image shows one corrupted file side-by-side with its non-corrupted partner through the lens of a diff tool. The differences are highlighted on the command line in red and green.
Image shows a hexdump with non-null bytes colorized making it easier to see differences, and ultimately how sparse the data is in the file.
2025-09-23

Indicator: The Indicator Guide to tools for capturing webpages and social media content. “We tested 11 tools ranging from full-featured continuous capture apps to one-off screenshot extensions for grabbing long webpages.”

https://rbfirehose.com/2025/09/23/indicator-the-indicator-guide-to-tools-for-capturing-webpages-and-social-media-content/

Debby ⁂📎🐧:disability_flag: debby@hear-me.social
2025-08-18

⚡️Linkwarden: The Self-Hosted Bookmark Manager That Solved a Problem I Didn’t Know I Had

Thank you, Linux Unplugged and Jupiter Broadcasting @ironicbadger, for introducing me to Linkwarden—a FOSS gem that will change how I save, share, and preserve the web.

Like many of you, I’ve been using browser bookmarks for years. I’d save articles, tutorials, and interesting links, only to find them gone when I finally got around to reading them. Link rot is real, and it’s frustrating. But until I heard about Linkwarden linkwarden.app/ on Linux Unplugged jupiterbroadcasting.com/, I didn’t realize how much I needed a better solution.

I used to think, “Browser bookmarks are fine,” and honestly, backing them up manually from time to time isn’t a real trouble—just a slight inconvenience. My problem is that I experience massive link rot when looking into two-year-old links, often with interesting subjects on small sites—they are often just gone when I want to recall them. The problem is that saving the link isn’t saving any of the information.

But Linkwarden @linkwarden isn’t just another bookmark manager—it’s a preservation powerhouse, a collaborative hub, and a self-hosted dream. And thanks to the folks at Jupiter Broadcasting, I now understand why it’s a game-changer.

I haven’t started hosting it yet, but I definitely will, and I hope some of you out there will find it useful too.
Thanks to @daniel31x13 for making an awesome tool :heart_cyber: ⚡️.
---
• Linkwarden github.com/linkwarden/linkwarden —  Self-hosted collaborative bookmark manager to collect, read, annotate, and fully preserve what matters, all in one place.
• Announcing Linkwarden 2.11 blog.linkwarden.app/releases/2.11
• Linkwarden Browser Extension github.com/linkwarden/browser-extension

@selfhosted@a.gup.pe @selfhosting @selfhosted@lemmy.world @selfhost #OpenSourceSoftware #TechForGood #Linkwarden #SelfHosted #FOSS #OpenSource #WebPreservation #Fediverse #LinuxUnplugged #SaveTheWeb #NoMore404 #TechCommunity #DigitalArchiving #LinkRot #PrivacyFirst #BookmarkManager #Bookmark

The image features a promotional graphic for "Linkwarden," an open-source collaborative bookmark manager. The background is dark blue with a subtle grid pattern, and the logo is a blue lightning bolt icon. The main text reads "Linkwarden" in large, white letters, followed by "BOOKMARKS & COLLABORATION MADE EASY!" in smaller white text. Below this, a description states, "An open-source collaborative bookmark manager to collect, organize, and preserve webpages." The image also includes a screenshot of the Linkwarden interface, showing a dashboard with sections labeled "Dashboard," "Recent," and "Pinned." The dashboard displays various bookmarks with titles, dates, and categories, such as "Health and Wellness," "Personal Finance," and "Self Improvement." The interface is designed with a dark theme, and the bookmarks are organized into different categories, with a sidebar on the left listing various categories and tags.

Client-side file format identification and reporting pipeline with Siegfried and Demystify Lite


by @beet_keeper

With thanks to the sponsorship of Archives New Zealand and Richard Lehane for his great coding expertise and his collaboration; Demystify Lite has a new feature — Siegfried!!

Richard recently posted about this work on LinkedIn, but let's look at this effort in more detail below.

#Archives #Coding #digipres #DigitalArchiving #DigitalPreservation #DROID #FileFormat #Golang #siegfried #SoftwareDevelopment

Published: PREMIS Events Through an Event-sourced Lens


by @beet_keeper

Not long after my first Code4Lib article I had another idea to run by the team there, and elected to see if my paper looking at events in the PREMIS metadata standard would be of interest to them and the readership.

My paper PREMIS Events Through an Event-sourced Lens was published in April this year.

I take a look at the content of this paper below and plug a few gaps that I have been thinking about since its publication.

#archives #code4lib #designPatterns #digipres #digitalArchiving #digitalPreservation #eventSourcing #premis #publications #softwareArchitecture #softwareDevelopment

Digital Preservation as a Thought Experiment


by @beet_keeper

Back in 2017, I had an abstract accepted for a chapter in the ALCTS Monograph: Digital Preservation in Libraries: Preparing for a Sustainable Future. With my author’s copy now available, I take a look at the background and its genesis below. The complete monograph is a fascinating read with some great contributors. You can find it online at the ALA Store.

#Archives #community #ComputerScience #digipres #DigitalArchiving #digitalLiteracy #DigitalPreservation #glam #learning #outreach #Publications #ThoughtExperiment #training #writing

Looking after your URLs: tikalinkextract eight years on


by @beet_keeper

We might not have a second life, but what if I told you there was a second internet? Not the deep web, but another web that we engage with nearly every day?

Think about it, that QR code you scanned for more information? That payment link you followed on your electricity bill? The website you’re told to visit at the end of a television ad?

The antipodes of the internet are these terminal endpoints, material and not necessarily material objects that represent the end of the freely navigable web — the QR code on a concert poster is the web printed onto the physical world. There is every chance it will be scanned and followed by someone from a mobile device, but it’s a transient object, something that will exist for a short amount of time, and then disappear into the palimpsest of the poster board or wall it was pasted on until it eventually disappears.

This is part of the materiality of the internet that has long fascinated me. Perhaps it comes from being a student of material culture, but if we look around, we see the Internet everywhere!

#Archives #digipres #DigitalArchiving #digitalContinuity #DigitalPreservation #httpreserve #Memento #outreach #RobustLinks #RobustWebLinks #WebArchives #webArchiving

Crayola's 1997 Techno Brite crayon set with color names created to market the Crayola website, including names featured here such as World Wide Web Yellow, Point and Click Green, and Cyber Space Orange.
A poster describing the HTTPreserve tooling I was creating for Archives New Zealand back in 2017.
Miguel Afonso Caetano remixtures@tldr.nettime.org
2025-05-02

"On Thursday Reuters published a photograph of Waltz checking his mobile phone during a cabinet meeting held by Donald Trump. The screen appears to show messages from various top level government officials, including JD Vance, Tulsi Gabbard, and Marco Rubio.

At the bottom of Waltz’s phone’s screen is a message that looks like Signal’s regular PIN verification message. This sometimes appears to encourage users to remember their PIN, which can stop people from taking over their account.

But the message is slightly different: it asks Waltz to verify his “TM SGNL PIN.” This is not the message that is displayed on an official version of Signal.

Instead TM SGNL appears to refer to a piece of software from a company called TeleMessage which makes clones of popular messaging apps but adds an archiving capability to each of them. A page on TeleMessage’s website tells users how to install “TM SGNL.” On that page, it describes how the tool can “capture” Signal messages on iOS, Android, and desktop."

404media.co/mike-waltz-acciden

#USA #Trump #Signal #Messaging #Privacy #DigitalArchiving #TeleMessage
