SSD prices, and especially RAM prices, are skyrocketing: https://arstechnica.com/gadgets/2025/11/spiking-memory-prices-mean-that-it-is-once-again-a-horrible-time-to-build-a-pc
This is an excellent time to use compression to reduce your storage pressure.
Interact with your vast remote datasets right on your phone! 📱
I've built a demo Jupyter notebook that connects to a Cat2Cloud server from an Android phone and slices into an 8 TB dataset, downloading a 1 MB chunk in under 100 milliseconds. ⚡
The 8 TB dataset is from the Gaia DR3 catalogue. As it turns out, there are only ~1000 stars in a cube of 100 light-years in our vicinity; space is mostly empty.
Try this out by visiting: https://cat2.cloud/demo/roots/@public/large/slice-gaia-3d.ipynb
Our @EuroSciPy 2025 tutorial on modern Blosc2 features is now online!
For the first time, we present our holistic view on how compression can revolutionize data handling, sharing, and computing.
Learn how Blosc2 boosts performance for large datasets, how to serve data online with Caterva2, and compute directly in the cloud.
Watch now! https://www.youtube.com/watch?v=BdpTtzX2cuk
#Blosc2 #EuroSciPy #Python #DataScience #BigData #HPC #OpenSource
IronPill 2
In the second of our series of short videos ("ironPills") showcasing ironArray's work, we see how Blosc2 can power heavy-duty linear algebra workflows (100 GB!):
⚡ 1.5-2x faster than PyTorch + h5py!
🧱 automated chunking optimised for your machine's cache hierarchy
simple one-line syntax: `blosc2.matmul(A, B, urlpath='out.b2nd')`
See blog here: https://ironarray.io/blog/la-blosc
IronPill 1
In the first of a series of short videos ("ironPills") showcasing ironArray's work, we see how Blosc2 can be used to calculate Fourier approximations:
⚡ 5x faster than NumPy
👣 fraction of the memory footprint
pythonic one-line syntax: `sum(a * sin(x) + b * cos(x), axis=1)`
See full notebook here: https://github.com/Blosc/python-blosc2/blob/main/examples/ndarray/ironpill1.ipynb
(inspired by this blog post: https://towardsdatascience.com/numexpr-the-faster-than-numpy-library-that-no-ones-heard-of/)
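To give a feel for this kind of one-line reduction, here is a minimal sketch of a truncated Fourier series. The square-wave target, the function name `square_wave_partial_sum`, and the coefficient arrays are assumptions made for this example; they are not taken from the notebook. NumPy stands in for Blosc2 so the snippet runs without extra dependencies.

```python
import numpy as np


def square_wave_partial_sum(x, n_terms):
    """Truncated Fourier series of a unit square wave, evaluated at points x."""
    k = np.arange(1, 2 * n_terms, 2)      # odd harmonics 1, 3, 5, ...
    a = 4.0 / (np.pi * k)                 # sine coefficients of the square wave
    b = np.zeros_like(a)                  # cosine coefficients vanish (odd function)
    kx = x[:, None] * k[None, :]          # shape (len(x), n_terms)
    # One-line reduction over the harmonics axis
    return np.sum(a * np.sin(kx) + b * np.cos(kx), axis=1)


x = np.linspace(0.0, 2.0 * np.pi, 9)
approx = square_wave_partial_sum(x, n_terms=500)
```

The intermediate `kx` array grows as points times harmonics, which is where memory pressure comes from in the plain NumPy version.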
🗣️ Announcing Python-Blosc2 3.8.0
A step closer to compliance with the array API standard (data-apis.org/array-api)!
This is an effort across all array-based libraries so that your code works with any of them (e.g. both blosc2 and NumPy) by simply changing the import statement!
Highlights:
✅ C-Blosc2 updated to latest 2.21.2
✅ Incorporate isnan, isfinite, isinf
✅ Better indexing coverage
✅ linspace and arange functions more numerically stable
✅ Improved array-api compliance
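A minimal sketch of the import-swap idea, under stated assumptions: the snippet runs with NumPy as the `xp` namespace, and the comment marks where `blosc2` would be swapped in. The helper `finite_mean` is hypothetical. With Blosc2 some of these calls may return lazy expressions that need to be materialized, which is not shown here.

```python
import numpy as xp  # assumption: swapping in `import blosc2 as xp` is the intended change


def finite_mean(values):
    """Mean of the finite entries, written only with array-API-style calls."""
    mask = xp.isfinite(values)
    total = xp.sum(xp.where(mask, values, 0.0))
    count = xp.sum(xp.where(mask, 1.0, 0.0))
    return total / count


grid = xp.linspace(0.0, 1.0, 5)  # five evenly spaced points from 0 to 1
noisy = xp.asarray([1.0, float("nan"), 3.0, float("inf")])
grid_mean = float(finite_mean(grid))
noisy_mean = float(finite_mean(noisy))
```

Because the function body only touches `xp`, the namespace is the single place that has to change.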
Struggling to get performant code from LLMs? They can't do the empirical, target-specific optimization needed for modern CPUs.
We can help! 💪 We've spent countless hours profiling and micro-benchmarking Blosc2 for you. Use our C/Python implementations as building blocks for your high-performance apps.
Ready to dive in?
🔹 EuroSciPy Talk: https://www.blosc.org/docs/2025-EuroSciPy-Blosc2.pdf
🔹 Tutorial: https://github.com/Blosc/EuroSciPy2025-CCC-Tutorial
📢 Great to see the community building powerful tools on Blosc2!
Check out compressed-image: a new C++/Python library for working with compressed images directly in memory.
It allows you to keep lots of images in RAM while minimizing I/O and memory footprint.
Kudos to Emil Dohne for this fantastic work!
Project here: https://github.com/EmilDohne/compressed-image
#Blosc2 #FOSS #ImageProcessing #DataCompression #Cpp #Python
We are thrilled to announce *TreeStore*, a new class in Python-Blosc2! It lets you give your datasets a hierarchical structure while keeping the speed and efficiency of `NDArray` instances. ⚡️
We've blogged all about it here:
https://www.blosc.org/posts/new-treestore-blosc2/
It's in beta, and you can start using it now in the latest Python-Blosc2 v3.7.2. Enjoy!
Caterva2 2025.8.7 is out!
This release features a major refactoring for a simpler, more robust system. Best of all, client APIs are unaffected. ✅
Highlights:
* The new cat2agent: a CLI client that watches a directory and auto-syncs it with a Caterva2 server.
* New stack and concat commands in the web UI for easier data manipulation.
Full release notes: https://github.com/ironArray/Caterva2/releases
Learn more about Caterva2: https://ironarray.io/caterva2
#Caterva2 #Blosc2 #HDF5 #DataStorage #OpenSource #DataScience
🗣️ Announcing Python-Blosc2 3.6.1
Unlock new levels of data manipulation with Blosc2!
We've introduced a major improvement: powerful fancy indexing and orthogonal indexing for Blosc2 arrays.
We've tamed the complexity of fancy indexing to make it intuitive, efficient, and consistent with NumPy's behavior. 💪
Read all about it on our blog! https://www.blosc.org/posts/blosc2-fancy-indexing/
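For readers new to the two indexing modes, here is a small sketch of the difference using plain NumPy semantics, which the post says Blosc2 follows. The array and index lists are made up for illustration; the Blosc2-specific syntax is covered in the blog and is not reproduced here.

```python
import numpy as np

a = np.arange(16).reshape(4, 4)
rows = [0, 2]
cols = [1, 3]

# Fancy (pointwise) indexing: index lists are paired up -> elements (0, 1) and (2, 3)
fancy = a[rows, cols]

# Orthogonal (outer) indexing: every row is combined with every column -> 2x2 block
orthogonal = a[np.ix_(rows, cols)]
```

Fancy indexing returns one element per index pair, while orthogonal indexing returns the full grid of selected rows and columns.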
Compress Better, Compute Bigger!
#Blosc2 #Python #DataScience #BigData #NumPy #Performance #HPC
Thanks to the advanced double partitioning techniques in #Blosc2, our #Caterva2 package can serve small slices of big datasets (3.8 GB) over the internet in less than the blink of an eye.
See how you can do that with the help of #JupyterLite in #Cat2Cloud using two different techniques:
1) Plain Python-Blosc2 library for quick and dirty access
2) Caterva2 Python client for a more heavy-duty and flexible operation
Try it out! https://cat2.cloud/demo/roots/@public/examples/large-dataset-indexing.ipynb?roots=%40public
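Why does double partitioning make small slices cheap? Below is a conceptual sketch, not Blosc2's actual code: a 1-D array is split into chunks, each chunk into blocks, and a slice only needs the blocks it overlaps. The function name `blocks_touched` and the sizes are hypothetical, and the sketch assumes blocks divide chunks evenly.

```python
def blocks_touched(start, stop, chunk_len, block_len):
    """List the (chunk, block) pairs that overlap the half-open range [start, stop)."""
    if chunk_len <= 0 or block_len <= 0 or chunk_len % block_len != 0:
        raise ValueError("chunk_len must be a positive multiple of block_len")
    if stop <= start:
        return []
    blocks_per_chunk = chunk_len // block_len
    first = start // block_len            # first global block index
    last = (stop - 1) // block_len        # last global block index (inclusive)
    # Split each global block index into (chunk index, block index within chunk)
    return [divmod(g, blocks_per_chunk) for g in range(first, last + 1)]


# A 100-element slice that straddles a chunk boundary
touched = blocks_touched(1000, 1100, chunk_len=1024, block_len=64)
```

Here only three 64-element blocks are needed, rather than two full 1024-element chunks.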
#Blosc2 now runs directly in your browser! Leveraging the power of #WASM, #Pyodide, and #JupyterLite, you can harness efficient, adaptable compression through the web's universal interface. Experience the future of large-scale data processing without leaving your browser window.
Compress Better, Compute Bigger, Share Faster
📢 We are pleased to announce the integration of a new stack feature in #Blosc2, which allows for stacking large arrays along a new axis.
Performance benchmarks show that while aligned chunks yield the best results, #Blosc2 with unaligned chunks can still outperform #NumPy, a welcome discovery!
Many thanks to Luke Shaw for his excellent work on this new functionality.
We've updated our recent blog post to cover it.
Check it out! https://www.blosc.org/posts/blosc2-new-concatenate/#stacking-arrays
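In case the difference between stacking and concatenating is unclear, here is a tiny sketch of the semantics. NumPy is used as a stand-in so the snippet runs anywhere; the Blosc2 calls themselves are described in the linked post and are not reproduced here.

```python
import numpy as np

a = np.zeros((3, 4))
b = np.ones((3, 4))

# Stacking adds a new axis -> shape (2, 3, 4)
stacked = np.stack([a, b], axis=0)

# Concatenating grows an existing axis -> shape (6, 4)
joined = np.concatenate([a, b], axis=0)
```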
📢 Blosc2 just launched a super-efficient array concatenation feature!
Combine massive arrays quickly and with minimal memory. If your array chunks are aligned, it's even faster: no need to decompress first! Perfect for big data tasks.
Check out our blog post: https://www.blosc.org/posts/blosc2-new-concatenate/
Compress Better, Compute Bigger
#DataScience #Blosc2 #DataStorage #Performance #MachineLearning
C-Blosc2 2.18.0 is out now!
✨ What's new:
* Introducing b2nd_concatenate() - now you can easily join b2nd arrays together!
* Fixed mmap files to flush modified pages only in write mode (thanks Jan Sellner!)
Get the full details: https://github.com/Blosc/c-blosc2/blob/main/RELEASE_NOTES.md
Excited to share more about Caterva2, your ultimate gateway to Blosc2/HDF5 repositories!
Caterva2 is designed to redefine how you interact with large datasets.
Want to see it in action? We've just released a new introductory video showcasing Caterva2's main functionalities! 🎬
https://ironarray.io/caterva2
#Caterva2 #Blosc2 #HDF5 #BigData #DataManagement #FreeSoftware #Python #DataScience #Tech
#Python-Blosc2 is hitting 1 million weekly downloads on PyPI! https://pypacktrends.com/?packages=blosc2&packages=blosc&time_range=2years
Users are rapidly adopting #Blosc2, which now accounts for over 95% of downloads compared to Blosc1. This success is thanks to our amazing users and community contributors. We're dedicated to making Python-Blosc2 even better.
Our motto: Compress Better, Compute Bigger! 💪
Now it's @FrancescAlted's turn to introduce the #Blosc2 #compression algorithm for reducing #HDF5 file size.
💡 Did you know you can supercharge your #HDF5 datasets with #Blosc2?
Leverage hdf5plugin (https://hdf5plugin.readthedocs.io) to integrate Blosc2 as a filter within HDF5. Create, write, and read data using popular Python wrappers like h5py or PyTables, while achieving excellent performance! 💨
More speed?
* h5py users: b2h5py offers optimized reads for n-dim slices.
* PyTables users: Optimized support is already built-in.
Learn more: https://www.blosc.org/posts/pytables-b2nd-slicing/
Compress Better, Compute Bigger :-)