#FileFormatIdentification

✨Bringing light to #FileFormats

Tough on the outside with food for thought inside!
File formats reveal a lot about the different meanings of an object.
CPP-008 and CPP-010 help you uncover them through #FileFormatIdentification and #FileFormatValidation:
tiny.cc/cpp-desc

#EOSCEDEN #CPPs #digitalpreservation #FAIRdata

#Digital ⚓️ #Vagabond 🦈beet_keeper@digipres.club
2025-08-18

JSONL now supported by JSONID and the first two JSONL rulesets making 80 strong in the registry. JSON, YAML, TOML, fully enabled. Check it out here:

github.com/ffdev-info/jsonid

ffdev-info.github.io/jsonid/re

#JSONID #FileFormatIdentification #FileFormats #JSONL #digipres

Using a custom Wikibase with Siegfried


by @beet_keeper

In March I was invited by the LD4 Wikidata Affinity Group to talk about my experiences using Wikibase with Siegfried, the file format identification tool. I don’t think I’ve talked about that work on here before but you can find links to my iPRES talk on my ORCID page.

Let’s look at the abstract and the content of the talk below.

Continue reading “Using a custom Wikibase with Siegfried”

#digipres #DigitalPreservation #DROID #FileFormatIdentification #FileFormats #OpenData #OpenSource #outreach #siegfried #talks #wikibase #wikidata

What information is in a file format identification report?


by @beet_keeper

In early 2022, I was finally able to get around to writing a paper that I had been thinking about for the better part of a decade. The paper, “Fractal in Detail: What Information Is in a File Format Identification Report?”, was published in Issue 53 of the Code4Lib Journal.

The paper takes a deep dive into the fractal contents of file format identification reports exported from tools like Siegfried and DROID.

Let’s take a brief look at the article and its contents below.

Continue reading “What information is in a file format identification report?”

#code4lib #code4libJournal #digipres #digitalPreservation #droid #fileFormatAnalysis #fileFormatIdentification #fileFormats #filedriller #formatIdentification #freud #linting #metadata #preservationMetadata #pronom #puid #puids #siegfried #staticAnalysis #technicalMetadata

Abstract from Fractal in Detail: What information is in a file format identification report from the Code4Lib Journal.

File formats as Emoji: 0xffae


by @beet_keeper

tldr: https://emoji.exponentialdecay.co.uk

File Formats As Emoji (0xFFAE or 0xffae) might be my most random file format hack yet. Indeed, it is a random page generator! But it generates random pages of file formats represented as Emoji.

The idea came in 2016 with radare releasing a new version that supported an emoji hexdump. I wondered whether I could do something fun combining file formats and the radare output to create a web-page.

Along came a spare moment one weekend, some pyscript, and a bit of sqlite, et voilà. File Formats as Emoji (0xFFAE) was made a reality.
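
For anyone curious what an emoji hexdump looks like in principle, here is a minimal sketch. It is not radare's table or the 0xffae code: the function name `emoji_hexdump` and the choice of `EMOJI_BASE` (a block of 256 consecutive pictograph code points) are assumptions made purely for illustration. Each byte value is mapped to one emoji, printed alongside the usual offset and hex columns.

```python
# Illustrative only: this mapping is an assumption, not the table used by
# radare or by 0xffae. Byte value b maps to the code point EMOJI_BASE + b.
EMOJI_BASE = 0x1F400


def emoji_hexdump(data: bytes, width: int = 8) -> list[str]:
    """Return hexdump lines with an emoji column in place of the ASCII column."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        emoji_part = "".join(chr(EMOJI_BASE + b) for b in chunk)
        # Pad the hex column so the emoji column lines up on a short final row.
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {emoji_part}")
    return lines


if __name__ == "__main__":
    # The eight-byte PNG signature, as a familiar example of magic bytes.
    for line in emoji_hexdump(b"\x89PNG\r\n\x1a\n"):
        print(line)
```

Because the mapping is one emoji per byte value, a format's magic bytes always produce the same run of emoji, which is what makes the output recognisable at a glance.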

Continue reading “File formats as Emoji: 0xffae”

#0xffae #Code #Coding #digipres #digitalLiteracy #DigitalPreservation #emoji #FileFormat #FileFormatIdentification #FileFormats #learning #PRONOM #pyscript #Python #SkeletonTestCorpus #teaching

A screenshot of a file format (fmt/983) in 0xffae. The title 0xffae sits over the top of the original image.

An image of the skeleton suite in Linux. The file extensions trigger display of the correct application icon for each of the files and so it looks like you have a lot of different file types around.
