
Discover how data governance shapes business success. Real stories, best practices, and debate on data quality and security. #DataGovernance #DataQuality #DataSecurity #DataStewardship #DataManagementBestPractices #DataCompliance #DataOwnership #DataLineage #DataCulture #DataDrivenDecisions #DataAudit #DataTrust #DataProtection #DataStewardshipTeam #DataDictionary
https://medium.com/@sanjay.mohindroo66/data-governance-best-practices-ensuring-data-quality-and-security-62cc1aae0f1f

Data lineage increases trust in government data

Governments often use data to make policy, improve services, and tackle societal issues. But how do you know whether that data is reliable? According to a new report from the Wetenschappelijk Onderzoek- en Documentatiecentrum (WODC), data lineage can help.

What is data lineage?

Data lineage literally means ‘the ancestry of data’. It is about mapping the full journey that data takes: from the moment it is collected (for example via a form), through processing and transformation, to its eventual use in, for example, dashboards or reports. With data lineage you can trace:

where the data comes from;
which operations or transformations have been applied;
which systems or reports the data ultimately ends up in.
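
The three questions above can be sketched as lookups on a small lineage graph. This is only an illustrative sketch and is not taken from the WODC report: the dataset names, the transformation labels, and the helper functions `build_index` and `trace` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lineage edges: (upstream dataset, transformation, downstream dataset).
EDGES = [
    ("web_form", "validate", "raw_applications"),
    ("raw_applications", "deduplicate", "clean_applications"),
    ("clean_applications", "aggregate_by_month", "monthly_dashboard"),
    ("clean_applications", "export", "annual_report"),
]


def build_index(edges):
    """Index the edges in both directions so lineage can be walked either way."""
    upstream = defaultdict(list)
    downstream = defaultdict(list)
    for src, step, dst in edges:
        upstream[dst].append((src, step))
        downstream[src].append((dst, step))
    return upstream, downstream


def trace(node, index):
    """Return every dataset reachable from `node` via `index`, breadth-first."""
    seen, queue = [], [node]
    while queue:
        current = queue.pop(0)
        for neighbour, _step in index.get(current, []):
            if neighbour not in seen:
                seen.append(neighbour)
                queue.append(neighbour)
    return seen


upstream, downstream = build_index(EDGES)

# Where does the dashboard's data come from?
origins = trace("monthly_dashboard", upstream)
# Which transformations were applied directly to produce the dashboard?
dashboard_steps = [step for _src, step in upstream["monthly_dashboard"]]
# Where does the raw data ultimately end up?
consumers = trace("raw_applications", downstream)

print(origins, dashboard_steps, consumers)
```

Storing each edge with its transformation label keeps the origin, the processing step, and the destination answerable from the same structure.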

Why does this matter for government?

Data lineage helps to detect errors early, map risks, and increase trust in policy information, both inside and outside the organisation. The WODC stresses that data lineage is not only a technical tool but also a step towards professionalising data management within government.

Read the WODC's news item on their website and see the English-language report.

This is an automatically posted message. Questions or comments can be sent to @DigitaleOverheid@social.overheid.nl

#BetrouwbareData #DataLineage #nieuwsbrief62025 #WODC

🗺️ 𝐖𝐡𝐚𝐭 𝐘𝐨𝐮 𝐒𝐡𝐨𝐮𝐥𝐝 𝐊𝐧𝐨𝐰 𝐁𝐞𝐟𝐨𝐫𝐞 𝐈𝐦𝐩𝐥𝐞𝐦𝐞𝐧𝐭𝐢𝐧𝐠 𝐀 𝐃𝐚𝐭𝐚 𝐂𝐚𝐭𝐚𝐥𝐨𝐠. Implementing a data catalog is a necessity if you want to leverage your data. While the allure of cutting-edge technology is strong, the success hinges on a solid foundation of non-technical considerations.

👉 Read our guide & explore what you need to know to avoid common pitfalls and ensure success.
https://www.datalumen.eu/should_know_before_implementing_datacatalog/

#DataCatalog #DataGovernance #DataManagement #DataLineage #MetaDataManagement #DataAgenda #DataStrategy


"AI is all about data. Reams and reams of data are needed to train algorithms to do what we want, and what goes into the AI models determines what comes out. But here’s the problem: AI developers and researchers don’t really know much about the sources of the data they are using. AI’s data collection practices are immature compared with the sophistication of AI model development. Massive data sets often lack clear information about what is in them and where it came from.

The Data Provenance Initiative, a group of over 50 researchers from both academia and industry, wanted to fix that. They wanted to know, very simply: Where does the data to build AI come from? They audited nearly 4,000 public data sets spanning over 600 languages, 67 countries, and three decades. The data came from 800 unique sources and nearly 700 organizations.

Their findings, shared exclusively with MIT Technology Review, show a worrying trend: AI's data practices risk concentrating power overwhelmingly in the hands of a few dominant technology companies."

https://www.technologyreview.com/2024/12/18/1108796/this-is-where-the-data-to-build-ai-comes-from/

#AI #GenerativeAI #AITraining #DataLineage

From Chaos to Clarity? 🔍 Find out how you can make data lineage simple. Data moving through complex architectures doesn’t have to be a mystery. 🔆 Check out our latest blog to learn how OpenLineage brings order to your data stack!

👉 Read more to be informed:
https://www.datalumen.eu/openlineage/

#OpenLineage #DataLineage #DataCompliancy #DataGovernance #DataPipelineMonitoring #MetadataManagement

#ModelExplainability, #DataLineage, and editing the #TrainingData set are topics that will be in the news next year…assuming we make it.
https://social.lol/@rom/112543674749743641

𝐔𝐧𝐝𝐞𝐫𝐬𝐭𝐚𝐧𝐝𝐢𝐧𝐠 𝐭𝐡𝐞 𝐒𝐩𝐞𝐜𝐭𝐫𝐮𝐦 𝐨𝐟 𝐃𝐚𝐭𝐚 𝐋𝐢𝐧𝐞𝐚𝐠𝐞 𝐀𝐧𝐚𝐥𝐲𝐬𝐢𝐬

#Datalineage analysis is the backbone of #datagovernance: it traces the journey of data from origin to consumption. It not only ensures #dataintegrity & #compliance but also aids decision-making & strengthens data-driven strategies. Within data lineage analysis, various methodologies & approaches exist, each tailored to specific needs & objectives: https://www.foxconsulting.co/post/understanding-the-spectrum-of-data-lineage-analysis

#dataflow #dataquality

"[#DataAnalysts]..should know how the data was born, with all details of measurement... Few things have more devastating consequences ... than someone in the audience pointing out...measurement issues the analyst didn't consider." Békés and Kézdi, 2021: Data Analysis for Business, Economics, and Policy

If you're having trouble helping your org understand the value of #datalineage and #metadata, share this with them and ask if they know how all the data they're using was gathered and measured.

I wrote about the Lineage Diff for dbt projects feature of PipeRider:

You can compare the lineage DAG from before and after making code changes in dbt. It's really useful for debugging issues, seeing the impact of a change, etc.:

https://medium.com/inthepipeline/dbt-data-lineage-diff-impact-analysis-visualized-bec9927b0c4e
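
The idea of comparing a lineage DAG from before and after a code change can be sketched in plain Python. This is not PipeRider's implementation, only a minimal sketch under assumptions: each DAG is a hypothetical dict mapping a model name to the set of its upstream models, and the model names and the helpers `edge_set`, `lineage_diff`, and `impacted` are invented for illustration.

```python
# Hypothetical dbt-style DAGs: model name -> set of upstream model names.
BEFORE = {
    "stg_orders": set(),
    "stg_customers": set(),
    "orders_enriched": {"stg_orders", "stg_customers"},
    "revenue_report": {"orders_enriched"},
}
AFTER = {
    "stg_orders": set(),
    "stg_customers": set(),
    "stg_payments": set(),
    "orders_enriched": {"stg_orders", "stg_payments"},
    "revenue_report": {"orders_enriched"},
}


def edge_set(dag):
    """Flatten a DAG into a set of (parent, child) edges."""
    return {(parent, child) for child, parents in dag.items() for parent in parents}


def lineage_diff(before, after):
    """Report nodes and edges that were added or removed between two DAGs."""
    return {
        "added_nodes": set(after) - set(before),
        "removed_nodes": set(before) - set(after),
        "added_edges": edge_set(after) - edge_set(before),
        "removed_edges": edge_set(before) - edge_set(after),
    }


def impacted(dag, changed):
    """Return all models downstream of any model in `changed`."""
    hit = set()
    grew = True
    while grew:
        grew = False
        for child, parents in dag.items():
            if child not in hit and parents & (changed | hit):
                hit.add(child)
                grew = True
    return hit


diff = lineage_diff(BEFORE, AFTER)
# Models whose direct inputs changed are the starting point for impact analysis.
changed_models = {child for _parent, child in diff["added_edges"] | diff["removed_edges"]}
downstream = impacted(AFTER, changed_models)

print(diff)
print(changed_models, downstream)
```

Diffing edges as well as nodes matters here: in this example no model is removed, yet `orders_enriched` swaps one input for another, and that is what flags `revenue_report` as impacted.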

#DataOps #DataLineage #DataViz #DataQuality #DataTesting #DataEngineering

Looking for options to track #datalineage on #AWS for data processed via MWAA DAGs. Other than Airflow's own lineage feature and solutions like #openlineage, what else does the community use?

#DataLineage
