Yeah! Of course, this is still a block-sorting compression algorithm*, so you won't get much advantage over zstd or xz when dealing with datasets with more inherent entropy like binary files or whatnot, but it does miracles for text.
* Of course I know what that means. Tell you what, you tell me what you think it means, and I'll tell you if you're right. 🤣
Here's an example with non-text data, where you see that #bzip3 isn't as strong:
Pictures$ for x in cat "gzip -9" "bzip2 -9" "bzip3" "zstd --ultra -22" "xz -9e"; do $x < Hobbes.jpg |wc -c |tr "\n" "\t"; echo "$x"; done |sort -rn
3445659 cat
3444164 xz -9e
3441839 zstd --ultra -22
3439158 gzip -9
3384450 bzip2 -9
3274433 bzip3
WAIT.
WHAT.
Let's try something else...
Videos$ f="Federated Timeline.webm"; for x in cat "gzip -9" "bzip2 -9" "bzip3" "zstd --ultra -22" "xz -9e"; do $x < "$f" |wc -c |tr "\n" "\t"; echo "$x"; done |sort -rn
1231940 bzip2 -9
1231269 bzip3
1227060 xz -9e
1226931 cat
1226421 zstd --ultra -22
1226241 gzip -9
WHAT?!? THE WORLD IS BROKEN!!!
TrYiNg AgAiNnNn...
Documents$ f="Thinkpad x200 hardware maintenance manual.pdf"; for x in cat "gzip -9" "bzip2 -9" "bzip3" "zstd --ultra -22" "xz -9e"; do $x < "$f" |wc -c |tr "\n" "\t"; echo "$x"; done |sort -rn
8942833 cat
8657277 bzip2 -9
8617801 gzip -9
8592319 bzip3
8568484 xz -9e
8535244 zstd --ultra -22
Ok, that makes sense. That's what I was expecting.
YOU SAW NOTHING ELSE. DON'T ASK ME ANY MORE QUESTIONS. 🤣
P.S., here's another interesting one:
138240138 cat (large BMP file)
3768642 gzip -9
3143455 PNG format
1987020 zstd --ultra -22
1592854 bzip2 -9
1512291 bzip3
1501540 xz -9e
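
If you want to repeat this on your own files, the one-liner above can be wrapped up as a small function. This is just a sketch: the name `sizes` is made up, it assumes every compressor you list is installed and reads stdin/writes stdout, and it relies on sh/bash word splitting (so it won't work unmodified in zsh).

```shell
# Hypothetical helper (not part of any tool): print the compressed size of one
# file for each given command, largest first.
# Usage: sizes FILE CMD [CMD...]
sizes() {
    f=$1; shift
    for x in "$@"; do
        # $x is deliberately unquoted so "gzip -9" splits into command + flag.
        # tr -d ' ' strips the padding some wc implementations add.
        printf '%s\t%s\n' "$($x < "$f" | wc -c | tr -d ' ')" "$x"
    done | sort -rn
}
```

For example: `sizes Hobbes.jpg cat "gzip -9" "bzip2 -9" bzip3 "zstd --ultra -22" "xz -9e"`. Keeping `cat` in the list gives you the uncompressed size as a baseline in the same sorted output.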
