#FileFormats

M3Imaginationmvsiii71
2025-04-21

🌟 Understanding image file formats is crucial for quality and compatibility! 📸 From JPEG for everyday photos to PNG for transparency and TIFF for high-quality printing, each format serves a unique purpose. Choose wisely to make your visuals shine!

File formats as Emoji: 0xffae


by @beet_keeper

tldr: https://emoji.exponentialdecay.co.uk

File Formats As Emoji (0xFFAE or 0xffae) might be my most random file format hack yet. Indeed, it is a random page generator! But it generates random pages of file formats represented as Emoji.

The idea came in 2016 with radare releasing a new version that supported an emoji hexdump. I wondered whether I could do something fun combining file formats and the radare output to create a web-page.

Along came a spare moment one weekend, some pyscript, and a bit of sqlite, et voilà. File Formats as Emoji (0xFFAE) was made a reality.
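As a rough illustration of the idea (not the code behind 0xFFAE, and not radare's actual emoji mapping), the sketch below picks a random row from an in-memory sqlite table and renders its bytes as emoji. The `formats` table, its three sample rows, the `EMOJI_BASE` offset, and both function names are assumptions made up for this example.

```python
import sqlite3

# Assumption: map byte value b to the code point EMOJI_BASE + b.
# This 256-code-point window is arbitrary and is not radare's mapping.
EMOJI_BASE = 0x1F400


def to_emoji(data: bytes) -> str:
    """Render each byte as one emoji code point."""
    return "".join(chr(EMOJI_BASE + b) for b in data)


def random_format(conn: sqlite3.Connection) -> tuple:
    """Let sqlite pick one row at random."""
    query = "SELECT name, magic FROM formats ORDER BY RANDOM() LIMIT 1"
    return conn.execute(query).fetchone()


# Hypothetical sample data: a name and the leading bytes of a file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE formats (name TEXT, magic BLOB)")
conn.executemany(
    "INSERT INTO formats VALUES (?, ?)",
    [("PNG", b"\x89PNG\r\n\x1a\n"), ("GIF", b"GIF89a"), ("PDF", b"%PDF-")],
)

name, magic = random_format(conn)
print(name, to_emoji(magic))
```

Each run prints a different format, which is all a "random page generator" needs at its core.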

Continue reading “File formats as Emoji: 0xffae”…

#0xffae #Code #Coding #digipres #digitalLiteracy #DigitalPreservation #emoji #FileFormat #FileFormatIdentification #FileFormats #learning #PRONOM #pyscript #Python #SkeletonTestCorpus #teaching

A screenshot of a file format (fmt/983) in 0xffae. The title 0xffae sits over the top of the original image.
An image of the skeleton suite in Linux. The file extensions trigger display of the correct application icon for each of the files and so it looks like you have a lot of different file types around.
2025-04-05

@wyatt @kawa Until archive format overhead becomes the limiting factor for size.

At one point in early 2002, I briefly considered using the UNIX tape archive (tar) format to bundle assets in a Game Boy Advance game. Many of these assets were 2048 bytes or smaller. I looked up the spec for a POSIX tar file, and it involved rounding up each file's size to a multiple of the 512-byte block size and adding a 512-byte block header. That kind of overhead adds up. On top of that, searching for a particular file in a tarball takes linear time, not constant or even logarithmic time.
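The arithmetic described above can be sketched as follows. This is a minimal illustration that counts only the per-file cost mentioned in the post (one 512-byte header plus padding to a 512-byte multiple); the function names and the sample asset sizes are made up for the example.

```python
BLOCK = 512  # tar block size in bytes


def tar_member_size(payload: int) -> int:
    """Bytes one file occupies: a header block plus data rounded up to a block."""
    padded = -(-payload // BLOCK) * BLOCK  # ceiling to a multiple of BLOCK
    return BLOCK + padded


def overhead_ratio(sizes: list[int]) -> float:
    """Fraction of the archive that is header and padding rather than data."""
    archive = sum(tar_member_size(s) for s in sizes)
    return (archive - sum(sizes)) / archive


# Hypothetical small assets, all 2048 bytes or smaller.
assets = [100, 700, 1500, 2048]
for size in assets:
    print(size, "->", tar_member_size(size))
print(f"overhead: {overhead_ratio(assets):.0%}")
```

For assets this small, a large share of the archive ends up being headers and padding rather than data.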

That led me to devise and document a simpler, more fit-for-purpose archive format called GBFS. Other specialized archive formats may benefit from packing files so as to avoid crossing 16K, 32K, 64K, or 128K block boundaries in the medium.

#FileFormats #gbadev #GameBoyAdvance #FileFormat #archives

File format building blocks: primitives in digital preservation


by @beet_keeper

A primitive in software development can be described as:

a fundamental data type or code that can be used to build more complex software programs or interfaces.

– via https://www.capterra.com/glossary/primitive/ (also Wiki: language primitives)

Like bricks and mortar in the building industry, or oil and acrylic for a painter, a primitive helps a software developer to create increasingly more complex software, from your shell scripts, to entire digital preservation systems.

Primitives also help us to create file formats. As we’ve seen with the Eyeglass example I have presented previously, a file format is, at its most fundamental level, a representation of a data structure as a binary stream: one that can be written out of the data structure onto disk and, likewise, read from disk back into a data structure in code.

For the file format developer we have at our disposal all of the primitives that the software developer has, and like them, we also have “file formats” (as we tend to understand them in digital preservation terms) that serve as our primitives as well.

Continue reading “File format building blocks: primitives in digital preservation”…

#Archives #digipres #DigitalPreservation #DigitalPreservationEssentialism #diplomatics #eyeglass #eygl #FileFormats #InformationRecordsManagement #IRM #JSON #OpenData #OpenSource #RDM #ResearchData #ResearchDataManagement #XML

Image of the foundations of a new building being erected in Wellington New Zealand, circa 2017.
Image shows a diagram of a simplistic view of the technology stack that goes into the creation of XML as a document (record).
Image shows four panels, each representing a stage in the making of a cake related back to data versus records. The data layer being the eggs and flour, the information layer the freshly made cake, the presentation layer the cake with icing on top, and the knowledge layer the cake has been eaten and can be fully understood and appreciated by its consumer.
#Digital ⚓️ #Vagabond 🦈 beet_keeper@digipres.club
2025-02-26

Huzzah! A small personal 15-year-old goal achieved -- finally got a service to update automatically from the PRONOM release notes we first minted at TNA back in 2010 with signature file V31. (Rather than something going wrong and my needing to turn the handle myself. We only trigger a download and update on a new release and it finally worked!)

Today's release V120 reflected live on the ffdev.info PRONOM dashboard and API.

ffdev-info.github.io/pronom-pa

#Digipres #PRONOM #DROID #Fido #Siegfried #FileFormats

Screenshot of pronom.ffdev.info with the version number correctly updated after yesterday's release.
2025-02-25

The latest version of #PRONOM, v120 has been released! Identification for 30 new PUIDs, 32 new signatures and 32 updates. #digipres #fileformats #digitalpreservation nationalarchives.gov.uk/abouta

The sensitivity index: Corrupting Y2K


by @beet_keeper

In December I asked “What will you bitflip today?” Not long after, Johan’s (@bitsgalore) Digital Dark Age Crew released its long lost hidden single Y2K. Well, I couldn’t resist corrupting it.

Fixity is an interesting property enabled by digital technologies. Checksums allow us to demonstrate mathematically that a file has not been changed. An often cited definition of fixity is:

Fixity, in the preservation sense, means the assurance that a digital file has remained unchanged, i.e. fixed — Bailey (2014)
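A minimal sketch of a fixity check, assuming SHA-256 as the checksum (the post does not say which algorithm is used). The function names and sample bytes are made up for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Checksum recorded when the object is first received."""
    return hashlib.sha256(data).hexdigest()


def fixity_ok(data: bytes, recorded: str) -> bool:
    """True only if the bytes still produce the recorded checksum."""
    return sha256_hex(data) == recorded


# Hypothetical stand-in for a file's bytes.
original = b"Y2K " * 1000
recorded = sha256_hex(original)

# Flip a single bit in one byte to simulate corruption.
damaged = bytearray(original)
damaged[100] ^= 0b00000001

print(fixity_ok(original, recorded))
print(fixity_ok(bytes(damaged), recorded))
```

The checksum says whether anything changed; it says nothing about where, or how much the change matters to a listener or viewer.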

It’s very much linked to the concept of integrity. A UNESCO definition of integrity is:

The state of being whole, uncorrupted and free of unauthorized and undocumented changes.

Integrity is massively important at this time in history. It gives us the guarantees we need that digital objects we work with aren’t harboring their own sinister secrets in the form of malware and other potentially damaging payloads.

These values are contingent on bit-level preservation. The field of digital preservation largely assumes this: that we will be able to look after our content without losing information. Feasible as this may be these days, what happens if we do lose some information? Where does authenticity come into play?

Through corrupting Y2K, I took time to reflect on integrity versus authenticity, as well as create some interesting glitched outputs. I also uncovered what may be the first audio that reveals what the Millennium Bug itself may have sounded like! Keen to hear it? Read on to find out more.

Continue reading “The sensitivity index: Corrupting Y2K”…

#ac3 #Archives #audio #audiovisual #authenticity #av #Bash #checksums #Code4Lib #corruption #corruptionIndex #digipres #DigitalArchiving #digitalLiteracy #DigitalPreservation #diplomatics #FileFormats #flac #glitch #GlitchArt #glitchaudio #integrity #mp3 #sensitivityIndex #wav

Image showing a hugely glitched file in Audacity. The waveforms should largely be the same in both stereo channels but they are not.
A snippet of audio as shown in Audacity. The image shows the audio's waveform and spectrogram.
#Digital ⚓️ #Vagabond 🦈 beet_keeper@digipres.club
2025-02-08
Photo of a wall on a street in Leipzig. There is a poster of a show and above it graffiti of the letters GIF
#Digital ⚓️ #Vagabond 🦈 beet_keeper@digipres.club
2025-02-06

The things that get you excited when you’ve spent too long looking at hex editors… receiving mojibaked invitations for the building’s Whatsapp group! (Although the text is strangely coherent when translated… so it could also be some random crossed wire from the provider)

#FileFormats #Encodings #Mojibake

Screenshot of a text message received from a German SMS service from a local resident. The message should be an invitation to Whatsapp, and it probably is, but it’s likely a mojibaked multi-byte character encoding (like UTF-16) and presents as Chinese logographs, which can be common when bytes are shifted incorrectly or a byte-order marker is missing.
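One way this kind of mojibake can arise is sketched below, assuming the text was sent as UTF-16 without a byte-order marker and then read with the wrong byte order. The sample word is made up; nothing here is recovered from the actual message.

```python
# Hypothetical sample text; any Latin-letter word behaves the same way.
text = "Einladung"

# Sender writes little-endian UTF-16 with no byte-order marker...
wire = text.encode("utf-16-le")

# ...and the receiver guesses big-endian, so each byte pair is swapped.
garbled = wire.decode("utf-16-be")
print(garbled)

# Reversing the wrong guess recovers the original text.
repaired = garbled.encode("utf-16-be").decode("utf-16-le")
print(repaired)
```

Latin letters have a zero high byte in UTF-16, so swapping the pair moves them into the CJK ideograph ranges, which is why the result looks like Chinese text.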
2025-01-28

Who's got two thumbs and her first ever #Pronom submission? This gal! For Logic Pro Project Files (LOGICX/LOGIC) new fdd640. Comments always welcome loc.gov/preservation/digital/f #fileformats #digipres

2025-01-27

thought it might be nice to sign #sphinx releases with #minisign and #ssh #eddsa keys, straight outta sphinx. minisign #privkeys are okish (they do need 40 B of entropy, 8 extra for a "keyid"). but did you know, that in ssh the public key is stored 3x in the ed25519 private #key? one time i can understand (could be 0 though), but 3 times? what have they been drinking? #fileformats

ComradeVlastcomradevlast
2025-01-22

Looking for some more advanced techies to help me out here. I was browsing the files of old abandonware (as one does) and came across the .zym file format in a game called Gubble 2. Does anyone have any idea what this file format is? Is it something proprietary by the gubble devs? Something that just isn't used anymore? The only thing google brought up regarding .zym was some mods for quake.

A list of gubble 2 files ending in the format .ZYM
2025-01-22

SUPER excited to announce a new fdd, FIRST ONE ever from Liz Caringola, on PAR (Parity Volume Set File Format Family - fdd634). We/Liz even submitted for the #PUID in #PRONOM and we'll do this as part of our workflow from now on. This format was a popular discussion topic with lots of community input. Comments always welcome but mostly if they say Go Liz! Great job! Woot! loc.gov/preservation/digital/f #digipres #fileformats

2025-01-15

@jgivoni I am not so sure about powerful. With the way I currently envision it, it would have some limitations as to what values it can represent.

For example, leading and trailing whitespace in a line in a value would not be possible, nor would be keys containing colons or keys and values containing pipes. My thinking on this as of now is that it isn't a big deal, since this is for readable configuration files, not general data storage, and the file can be designed around these limitations.
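To make those limitations concrete, here is a rough sketch of a parser for a flat subset of such a format. It is not the author's design: `parse_flat` is a made-up name, and the rules (whole lines starting with `|` are comments, a key ends at the first colon, indented lines continue a multi-line value) are assumptions inferred from the sample file. Lists and enums with fields are ignored.

```python
def parse_flat(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines and indented multi-line values (assumed rules)."""
    collected: dict[str, list[str]] = {}
    current = None  # key whose multi-line value is being collected

    for raw in text.splitlines():
        stripped = raw.strip()
        if stripped.startswith("|"):
            continue  # whole-line comment
        indented = raw[:1] in (" ", "\t")
        if current is not None and (indented or not stripped):
            collected[current].append(stripped)  # continuation or blank line
            continue
        if not stripped:
            continue
        # The key ends at the first colon, so a key cannot contain one.
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"expected 'Key: value', got {raw!r}")
        key, value = key.strip(), value.strip()
        collected[key] = [value] if value else []
        current = None if value else key

    # Stripping is why leading and trailing whitespace cannot survive.
    return {k: "\n".join(v).strip() for k, v in collected.items()}


sample = """| a comment
Handle: maypop_neocities
Bio:
  AuDHD

  I draw sometimes
Website: https://maypop.neocities.org
"""
print(parse_flat(sample))
```

Colons inside a value, such as the one in the URL, are harmless under these rules; only the first colon on a line is treated as the separator.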

Readable and writable without any knowledge, though, is pretty much the main goal, to the point that the primary file extension for this format would probably be TXT. I want most people (who know the language the labels are written in) to be able to click on the file in their file manager, read it, understand what it means, and possibly edit it if they want to.

I am open to suggestions, though.

#fileformats #programming

2025-01-14

I am trying to make a tool (#DJJerry) that helps you manage your local music collection. I've been stuck on trying to pick a format for the configuration/manifest files.

I have even thought about inventing my own file format designed to be very clean and readable to people who don't know code (shown in image).

What file format would you personally prefer?

#seriousquestion #programming #softwaredevelopment #DJJerry #freesoftware #foss #fileformats #anticapitalism #praxis

| This file format would be intended for config files,
| not reliable data serialization.

| a single-line value
Handle: maypop_neocities
| a multi-line value
Bio:
  AuDHD 🐾 20 years old

  I draw sometimes, and I also say
  stupid things sometimes,
  and I also write code sometimes.
| a list
Custom Fields:
  | a multi-line value within a list
  ...
    | it's a tuple struct so is itself a list
    Favorite Food
    apples
  ...
    Favorite Crime
    copyright infringement
  ...
    Website
    https://maypop.neocities.org
| an enum with fields
Furosona Species: Hybrid
  Primary: Dragon
  Secondary: Llama
  Tertiary: Glow Squid
#Digital ⚓️ #Vagabond 🦈 beet_keeper@digipres.club
2025-01-11

Anyone else seeing the Just Solve It Wiki error?

fileformats.archiveteam.org/wi

#digipres #FileFormats

2025-01-04

Generative AI is a parasitic cancer.

Freya Holmer on AI and the nonexistent quality of the content it generates. In this case, about the binary format #gltb (the binary representation of the #gltf format).

It's pure comedy. I don't know how #FreyaHolmer had the patience to make the video after reading the first article from the search.

Generative AI is a Parasitic Cancer
youtube.com/watch?v=-opBifFfsM

#khronos #blender #3D #fileformats #IA
#enshittification #enmierdamiento

What will you bitflip today?


by @beet_keeper

I want to let you into a secret: I enjoy corruption. Corrupting digital objects leads to undefined behavior (C++’s definition is fun). And flipping bits in objects can tell us something both about the fragility, and robustness of our digital files and the applications that work with them.

I had a pull-request for bitflip accepted the other day. Bitflip is by Antoine Grondin and is a simple utility for flipping bits in digital files. I wrote in my COPTR entry for it that it reminds me of shotGun by Manfred Thaller. The utility is exceptionally easy to use (and, being written in Golang, of course easy to update and maintain) and has some nice features for flipping individual bits or a uniform percentage of bits across a digital file.
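The two operations described, flipping a single bit and flipping a percentage of bits, can be sketched in a few lines of Python. This illustrates the technique only; it is not bitflip's code or interface, and the function names, the seed parameter, and the choice to number bits from the least significant end of each byte are assumptions.

```python
import random


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Flip one bit; bit 0 is the least significant bit of byte 0 (assumed)."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def flip_percentage(data: bytes, percent: float, seed=None) -> bytes:
    """Flip the given percentage of bits, chosen uniformly without repeats."""
    rng = random.Random(seed)
    total_bits = len(data) * 8
    count = round(total_bits * percent / 100)
    out = bytearray(data)
    for bit_index in rng.sample(range(total_bits), count):
        out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


# Hypothetical input: 100 zero bytes, so every flipped bit shows up as a 1.
blank = bytes(100)
corrupted = flip_percentage(blank, 1.0, seed=2024)
print(sum(bin(b).count("1") for b in corrupted), "bits flipped")
```

Flipping the same bit twice restores the original bytes, which makes single-bit experiments easy to undo.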

My pull-request was a simple one updating Goreleaser and its GitHub workflow to provide binaries for Windows and FreeBSD. I only needed to use Windows for a short amount of time thankfully, but it’s an environment I believe is prevalent for a lot of digital preservationists in corporate IT environments.

Bitflip is a useful utility to improve your testing of digital preservation systems, or simply for outreach, but let’s have a quick look at it in action.

Continue reading “What will you bitflip today?”…

#Archive30 #Archives #Art #Binary #bitflip #bitrot #Code #Coding #digipres #digital #DigitalArchiving #digitalLiteracy #DigitalPreservation #FileFormats #GenerativeArt #GlitchArt #outreach #Ravensburger #SomethingFun #Vagabond

Cat's Meow from the Offner Dynograph EEG.
An example of bitrot using the Ravensburger mobile games logo, the image is split into four quadrants and demonstrates the different visual artefacts and color changes that result from the degradation of a bytestream.
Corrupt JPEG in Image Magick's display utility.

A year in file formats 2024


by @beet_keeper

A great write up from Francesca at TNA about the past year for PRONOM via Georgia at the OPF.

It’s great to see the continuing work including vital translation of guides into other languages. Francesca includes a couple of shout outs to some pieces I have contributed in my spare time this year; including a collaborative workshop with Francesca, David, and Tyler at iPRES2024.

Continue reading “A year in file formats 2024”…

#Archives #Conferences #digipres #DigitalPreservation #DROID #FileFormat #FileFormats #ipres2024 #outreach #PRONOM

Tyler's Halloween Matryoshka Dolls represent the internal complexities of container file formats. The dolls here have formats attached to them representing different ways they might be nested, with ZIP and OLE2 being the primary containers that can be handled in DROID and Siegfried at present.
A QR code that can be followed to links from the What's in the Box workshop at iPRES2024.
