@umphy :gitannex: #gitAnnex and :datalad: #dataLad organize the data and sync it to our own :forgejo: #forgejoAneksajo instance. You can instantly see if everything is there and worked. A great workflow!
After three days of the #DataLad workshop in Aachen (thanks again to @lukascbossert and the @dkz2r team for the great organization), today a training on #Moodle... Unfortunately, I now have to take care of the catering myself again... Luckily it has cooled down a bit (at least here where I am, it is a summery "normal" 19°C with clouds. Lovely!)
@AvSchroeder @lukascbossert @mih @dkz2r @adswa @doktorpanik @jsheunis @abcdj @nfdi4objects @nfdi4ing @WiNoDa @NFDI
Sounds like it was a fantastic workshop – thanks for sharing your impressions! Kudos to the #DataLad team! 👏
@lukascbossert @mih @dkz2r @adswa @doktorpanik @jsheunis @abcdj @nfdi4objects @nfdi4ing @WiNoDa @NFDI it was a great workshop and if our heads smoked it was only due to the heat - the #DataLad Team did an excellent job of explaining, repairing botched attempts, and even bugfixing while remaining calm, patient and friendly the whole time -- I enjoyed it a lot.
Lots to process now...
@mih @dkz2r @adswa @doktorpanik @jsheunis @abcdj @nfdi4objects @nfdi4ing @WiNoDa Last day of our 3-day workshop at the IT Center of the #RWTH Aachen. Today everyone is diving deeper into the realm of #datalad and applying it to their individual use cases. BIG THANK YOU to the whole #datalad team for making this possible and supporting us. Voting for #datalad as @NFDI - Base service: #RDM ❤️ #datalad.
@mih @dkz2r @adswa @doktorpanik @WiNoDa @nfdi4objects @abcdj
Since there is so much potential: who has started working on the integration of #DataLad into #Emacs? A nice addition could be something like the #casual suite / a #transient menu.
@mih @dkz2r @adswa @doktorpanik @WiNoDa @nfdi4ing @nfdi4objects @abcdj
We are continuing on day 2 and learning about #data #annotation with #datalad. The sky is the limit – so much potential! Great fun learning about it.
There is now a #gitAnnex package on #PyPi: https://pypi.org/project/git-annex/
This should make it simpler to deploy git-annex in Python virtual environments, also as versioned dependencies for software like #Datalad
Packages are built for Linux, Windows, and Mac via GitHub actions: https://github.com/psychoinformatics-de/git-annex-wheel/
Contributions to cover more platforms are most welcome!
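A minimal sketch of how one might check that git-annex is reachable from inside a virtual environment after `pip install git-annex`. It assumes the wheel places a `git-annex` executable on the environment's PATH; the helper names `find_git_annex` and `git_annex_version` are made up for illustration and are not part of any package.

```python
import shutil
import subprocess


def find_git_annex(path=None):
    """Return the full path of the git-annex executable, or None if not found.

    `path` is an optional search path string; None means the current PATH.
    """
    return shutil.which("git-annex", path=path)


def git_annex_version(executable):
    """Return the first line printed by `git-annex version` (or an empty string)."""
    result = subprocess.run(
        [executable, "version"],
        capture_output=True,
        text=True,
        check=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    exe = find_git_annex()
    if exe is None:
        # Assumption: installing the PyPI package into this environment fixes this.
        print("git-annex not found; try `pip install git-annex` in this environment")
    else:
        print(exe, "->", git_annex_version(exe))
```

Run inside an activated virtual environment, this should report the executable that lives in that environment rather than a system-wide one.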
I want a build system that:
- is as powerful and flexible as #SCons
- as readable and concise as #SnakeMake
- has a fricking progress bar+ETA
- is :datalad: #datalad / :gitannex: #gitannex agnostic (knows that files can be fetched from elsewhere)
- remembers how long building things takes
- balances that to decide if rebuilding locally instead of fetching gigabytes via slow internet is favorable
- integrates well with :nixos: #nix for reproducibility
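The "rebuild locally vs. fetch" item from this wish list could be sketched roughly like this. It is a toy illustration only, not an existing build system: the `Artifact` class, the function names, and the simple "average of past build times vs. size divided by bandwidth" heuristic are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A build output that could also be fetched from a remote (hypothetical)."""
    name: str
    size_bytes: int  # size of the file if it has to be downloaded
    past_build_seconds: list = field(default_factory=list)  # remembered build durations


def estimated_build_seconds(artifact):
    """Average of the remembered build times, or None if it was never built."""
    history = artifact.past_build_seconds
    if not history:
        return None
    return sum(history) / len(history)


def estimated_fetch_seconds(artifact, bandwidth_bytes_per_s):
    """Naive download-time estimate: size divided by bandwidth."""
    return artifact.size_bytes / bandwidth_bytes_per_s


def should_rebuild(artifact, bandwidth_bytes_per_s):
    """True if rebuilding locally is expected to be faster than fetching."""
    build = estimated_build_seconds(artifact)
    if build is None:
        # No build history yet: fall back to fetching.
        return False
    return build < estimated_fetch_seconds(artifact, bandwidth_bytes_per_s)


if __name__ == "__main__":
    big = Artifact("results.tar", 5_000_000_000, [60.0, 120.0])
    print(should_rebuild(big, bandwidth_bytes_per_s=1_000_000))      # slow link
    print(should_rebuild(big, bandwidth_bytes_per_s=1_000_000_000))  # fast link
```

A real tool would also need to measure the bandwidth and ask git-annex which remotes actually hold the file; both are left out here.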
In the latest DataLad blog post I try out two changes which were introduced in git-annex within the last year: git-remote-annex Git remote helper (this is the big one!) and a small change to enabling WebDAV special remotes. They work brilliantly, and combined they enable read-only data publishing on Nextcloud instances.
✨ Join the next upcoming Mannheim Open Science Meetup! ✨
🗞️ Topic: Reproducible Research Data Management with @datalad
🗣️ Speaker: @lnnrtwttkhn
📅 Date: Wed, Feb 26, 2025
⏰ Time: 2:00 PM
📍 Location: Online, sign up here: https://uni-mannheim.zoom-x.de/meeting/register/u5wpc-ygqDIpH9Z8JRmpRDnkMg1Si9uXnx7h
Why Attend?
✔️ Learn cutting-edge tools like Git, Docker & DataLad
✔️ Boost transparency & reproducibility in research
#OpenScience #ResearchDataManagement #DataLad #Reproducibility
Just set up a new Synology NAS box and installed forgejo-aneksajo (a git web UI with built-in git-annex support) on it: https://effigies.gitlab.io/posts/forgejo-aneksajo-synology/
Just a quick post that highlights what needed to be adapted from this earlier post on the #DataLad blog: https://mas.to/@mih/112880585950408351
@khinsen In the MOOC #ReproducibleResearch II, do you talk about #DataLad for managing datasets?
Well, git-annex is very nice, but it is somehow the plumbing, so which porcelain? ;-)
https://www.fun-mooc.fr/en/courses/reproducible-research-ii-practices-and-tools-for-managing-comput
https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain
@lnnrtwttkhn Look e.g. at the :datalad: @datalad blog:
https://blog.datalad.org/posts/forgejo-aneksajo-podman-deployment/
My :forgejo: #forgejo :nixos: #nixos config is located here:
@nobodyinperson The `datalad-fuse` extension allows you to use `datalad fsspec-head` to achieve this. I believe it uses git-annex to find a remote URL and then Python's `fsspec` to do the actual fetch. #DataLad
In a new article, I take a look at #Forgejo for hosting laaarge #Datalad datasets. I am talking about datasets with millions of files. Or rather millions of #gitAnnex file pointers.
...and...
It works really nicely, right out of the box! Millions of files in thousands of datasets. Not even a reason to switch away from a #SQLite database. A dual-core VM with 2-3GB of RAM should be good enough.
https://blog.datalad.org/posts/forgejo-aneksajo-large-datasets/
A new post on the DataLad blog: sharing my experiences from implementing a DataLad workflow, inspired by the existing "FAIRly big" paper, to cut and publish conference videos on a cluster.
Looking back, everything seems streamlined and logical, but getting there involved discovering the fine details of DataLad, git-annex, HTCondor, bash (and also Matroska metadata and video codecs). Hope it's a useful take.
https://blog.datalad.org/posts/fairly-big-video/
#datalad #gitAnnex #HTCondor #metadata #workflow #distribits
Here is another blog post on #Forgejo. This time looking into a user-space deployment with #podman and #systemd.
This combination really rocks! It feels like managing any other non-containerized service. The integration with podman v4.4+ (quadlets) is even better.
https://blog.datalad.org/posts/forgejo-aneksajo-podman-deployment/