#datasets

2025-05-25

Berkeley Lab: Computational Chemistry Unlocked: A Record-Breaking Dataset to Train AI Models has Launched. “Today, Open Molecules 2025, an unprecedented dataset of molecular simulations, was released to the scientific community, paving the way for development of machine learning tools that can accurately model chemical reactions of real-world complexity for the first time. This vast resource, […]

https://rbfirehose.com/2025/05/25/computational-chemistry-unlocked-a-record-breaking-dataset-to-train-ai-models-has-launched-berkeley-lab/

💧🌏 Greg Cocks @GregCocks@techhub.social
2025-05-13

Call For Manuscript Submissions - Real-Time GIS For Disaster Management
--
nature.com/collections/bjdhbfi <-- shared link to submission details
--
[note that I have NO affiliation with this journal, the guest editors, etc]
[I wonder if anybody from FEMA has compiled use case / effectiveness / robustness on/of the #WaffleHouseIndex in the southern USA, especially related to hurricanes?]
#GIS #paper #mapping #spatial #manuscripts #callforpapers #callformanuscripts #submissions #callforsubmissions #realtime #disaster #management #mitigation #prevention #preparedness #response #recovery #risk #hazard #naturalhazard #naturalhazard #emergency #remotesensing #earthobservation #satellite #drone #sensor #socialmedia #WaffleHouseIndex #datasets #AI #InternetOfThings #research #monitoring #evacuation #planning #resourceallocation #hazardmapping #realworld #global

2025-05-10

Just like weekends restore balance and give us time to tidy up, rest, and plan, data stewards bring structure and sustainability to the data ecosystem. They ensure data is well-documented, properly formatted, ethically managed, and ready for reuse, so that when the next “Monday” (or dataset) arrives, everything is in place.

Without weekends, burnout creeps in. Without data stewards, data chaos takes over. ✌️

2025-05-07

The National Oceanic and Atmospheric Administration (#NOAA) has decommissioned its "snow and ice data products [#datasets, #databases] from the Coasts, Oceans, and Geophysics Science Division (#COGS)."
nsidc.org/data/user-resources/

#Climate #DefendResearch #OpenData #Takedowns #Trump #TrumpVResearch #USPol #USPolitics

Audio Developer Conference @audiodevcon
2025-05-05

Scalable, Efficient Processing and Analysis of Large Audio Datasets – Pawel Cyrta – ADCx Gather 2024
youtube.com/watch?v=lHME1l9cEPk

OpenAIRE @OpenAIRE
2025-04-28

Ready to supercharge your ORCID profile?

With OpenAIRE EXPLORE + @ORCID_Org, you can seamlessly complete your record with all your research outputs, from papers & datasets to tools.

Backed by the @OpenAIREGraph, EXPLORE identifies and matches your work, including:

Journal articles
Research data
Software & more

Read the article to learn more openaire.eu/openaire-explore-a

Visit explore.openaire.eu to make your contributions count publicly and properly.

OpenAIRE EXPLORE & ORCID Integration: Complete your Open Science ORCID profile

Audio Developer Conference @audiodevcon
2025-04-26

Scalable, Efficient Processing and Analysis of Large Audio Datasets – Pawel Cyrta – ADCx Gather 2024
youtube.com/watch?v=lHME1l9cEPk

SidheWolf @sidhewolf
2025-04-26

Many living cultures have little or no machine-readable representation. Current datasets are massively skewed toward the Global North, English-speaking, wealthy, often historically oppressive cultures. Can’t train on the nonexistent, and much of oral, experiential, and non-digitized culture has never been formally recorded, much less structured for training. There is no way to build a truly representative “universal” from existing global datasets. The ground truth is missing.

OpenAIRE @OpenAIRE
2025-04-24

Ready to supercharge your ORCID profile?

With OpenAIRE EXPLORE + @ORCID_Org, you can seamlessly complete your record with all your research outputs, from papers & datasets to tools.

Backed by the @OpenAIREGraph, EXPLORE identifies and matches your work, including:

- Journal articles
- Research data
- Software & more

Log in with your ORCID → check what’s missing → sync it to your profile in just a few clicks.

Read the article: explore.openaire.eu

OpenAIRE EXPLORE & ORCID Integration: Complete your Open Science ORCID profile

2025-04-24

BBC: Inside the desperate rush to save decades of US scientific data from deletion. “No one knows when the next alert or request to save a chunk of US government-held climate data will come in. Such data, long available online, keeps getting taken down by US President Donald Trump’s administration. For the last six months or so, Cathy Richards has been entrenched in the response. She works […]

https://rbfirehose.com/2025/04/24/bbc-inside-the-desperate-rush-to-save-decades-of-us-scientific-data-from-deletion/

2025-04-21

Organizing datasets with ClearML

How do you version datasets and track the history of their transformations? How do you store metadata? How do you build plots and statistics over the data? How do you do all of this elegantly with the ClearML platform?

habr.com/ru/articles/902824/

#clearml #mlops #data_science #dataset #datasets #ml #ai #artificial_intelligence #artificial_neural_network
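
The versioning and lineage questions raised in this post can be illustrated without any platform. What follows is a minimal sketch, assuming plain files on disk and using only the Python standard library. It is not ClearML's API; the `snapshot`, `diff`, and `demo` helpers are hypothetical names introduced purely for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def snapshot(data_dir, parent=None, note=""):
    """Build a version record: per-file SHA-256 hashes plus lineage metadata."""
    files = {
        str(p.relative_to(data_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }
    # The id depends only on file names and contents, so identical data
    # always maps to the same version id.
    digest = hashlib.sha256(json.dumps(files, sort_keys=True).encode()).hexdigest()
    return {"id": digest[:12], "parent": parent, "note": note, "files": files}


def diff(old, new):
    """List files added, removed, or modified between two snapshots."""
    a, b = old["files"], new["files"]
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "modified": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


def demo():
    """Create two versions of a toy dataset and return both plus their diff."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "train.csv").write_text("x,y\n1,2\n")
        v1 = snapshot(root, note="initial import")
        (root / "train.csv").write_text("x,y\n1,2\n3,4\n")
        (root / "labels.csv").write_text("id,label\n")
        v2 = snapshot(root, parent=v1["id"], note="added rows and labels")
        return v1, v2, diff(v1, v2)
```

Each record stores its parent's id, so the chain of snapshots doubles as a transformation history; the `note` field stands in for richer metadata.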

Audio Developer Conference @audiodevcon
2025-04-19

Scalable, Efficient Processing and Analysis of Large Audio Datasets – Pawel Cyrta – ADCx Gather 2024
youtube.com/watch?v=lHME1l9cEPk

2025-04-17

Now available on Kaggle: Wikipedia Structured Contents. It’s in early beta. “This dataset contains all articles of the English and French language editions of Wikipedia, pre-parsed and outputted as structured JSON files with a consistent schema. Each JSON line holds the content of one full Wikipedia article stripped of extra markdown and non-prose sections (references, etc.).”

https://rbfirehose.com/2025/04/17/now-on-kaggle-wikipedia-structured-contents/
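
Since the description says each JSON line holds one full article, the files can be streamed one record at a time rather than loaded whole. A minimal sketch, assuming a JSON Lines file on disk: the field names `name` and `abstract` and the sample records are made up for illustration and may not match the dataset's actual schema.

```python
import json
import tempfile
from pathlib import Path


def iter_articles(path):
    """Yield one parsed article (a dict) per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def demo():
    """Write a tiny sample file and read the article names back from it."""
    # Placeholder records; "name" and "abstract" are assumed field names.
    sample = [
        {"name": "Example A", "abstract": "First placeholder article."},
        {"name": "Example B", "abstract": "Second placeholder article."},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.jsonl"
        text = "\n".join(json.dumps(record) for record in sample) + "\n"
        path.write_text(text, encoding="utf-8")
        return [article["name"] for article in iter_articles(path)]
```

Because `iter_articles` is a generator, memory use stays roughly constant no matter how large the file is.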

Marek Pavliš 🇨🇿 🇪🇺 @MarekPavlis@mastodonczech.cz
2025-04-15

"...there is no #AI without #energy; at the same time, AI has the potential to transform the energy sector."

📊 This "Energy and AI" #report from the International Energy Agency (#IEA) is based on new global and regional modelling and #datasets, as well as extensive consultation with governments and regulators, the #tech sector, the energy industry and international experts.

👉 iea.org/reports/energy-and-ai

#IT #data #electricity #artificiallife

2025-04-15

National Press Foundation: Data Helps Tell Education Stories, and Journalists Need to Find the Best Sources. “Rachel Rush-Marlowe, the founder and executive director of the education policy think tank ResearchEd and a former Education Department employee, spoke with NPF’s Widening the Pipeline fellows about how to access education data and accurately represent what the statistics […]

https://rbfirehose.com/2025/04/15/national-press-foundation-data-helps-tell-education-stories-and-journalists-need-to-find-the-best-sources/

The OpenAIRE Graph @OpenAIREGraph
2025-04-15

Unlock insights with the new OpenAIRE Graph API!

Easily discover , , & across infrastructures.
- Search with precision using linked
- Find versions & related datasets
- Trace research back to funders & institutions

Start exploring today: graph.openaire.eu/docs/apis/gr

The New OpenAIRE Graph API Has Been Released

Audio Developer Conference @audiodevcon
2025-04-14

Scalable, Efficient Processing and Analysis of Large Audio Datasets – Pawel Cyrta – ADCx Gather 2024
youtube.com/watch?v=lHME1l9cEPk

2025-04-13

From the Data Rescue Project: the Data Rescue Tracker. “The Data Rescue Tracker is a collaborative tool built to catalog existing public data rescue efforts so that we can coordinate better across initiatives. At this stage, you can use the tool to help reduce duplication of rescue efforts. The Data Rescue Tracker aims to provide a consolidated overview of who is backing up which dataset from […]

https://rbfirehose.com/2025/04/13/the-data-rescue-tracker/
