#DataPipelines

2025-04-19

Troubleshooting Airflow SQL Server Errors: A Case Study of System Task Initialization Failure
Troubleshoot frustrating Airflow SQL Server errors! This post helps diagnose & fix system task initialization failures and other connection issues. Learn to build robust data pipelines.
tech-champion.com/database/sql...

2025-04-18

Shifting Left isn’t just a buzzword - it’s the foundation for efficiency in your organization!

By making clean, reliable, and accessible data available across your organization, you reduce complexity and unlock time to focus on higher-value work.

💡 Data products are the foundation of this #ShiftLeft, enabling healthy, scalable data communication.

📖 Dive into the details in the #InfoQ article: bit.ly/3WHjxsf

#SoftwareArchitecture #DataMesh #DataLake #DataPipelines #ETL

2025-02-05

✅ Just finished Module 2 of the Zoomcamp!
I’ve built workflow automation with Kestra and learned how to schedule and monitor jobs.
Loving the simplicity of its declarative approach!

2025-02-05

💡 Exploring Kestra in the Zoomcamp!
Its YAML-based workflows make automation intuitive and scalable.
I just scheduled my first data pipeline—this tool is promising!

2025-02-05

🚀 I’ve just started Module 2 - Workflow Orchestration in the Zoomcamp!
This week is all about Kestra, a modern workflow orchestrator.
Looking forward to automating data pipelines with it! 🔥

2025-01-31

Atlassian introduced Lithium - an in-house #ETL platform designed to meet the requirements of dynamic data movement.

Lithium simplifies cloud migrations, scheduled backups, and in-flight data validations with ephemeral pipelines and tenant-level isolation - ensuring efficiency, scalability & cost savings.

📢 InfoQ spoke with Niraj Mishra, Principal Engineer at Atlassian, about Lithium’s implementation and future.

🔗 Read more here: bit.ly/415RPYZ

#DataPipelines #KafkaStreams #ApacheKafka #ApacheFlink #SoftwareArchitecture

#InfoQ

2025-01-31

A #ShiftLeft approach to #DataProcessing relies on data products, which form the basis of data communication across the business.

This addresses many flaws in traditional data processing and makes data more relevant, complete, and trustworthy.

#InfoQ article: bit.ly/3WHjxsf

#SoftwareArchitecture #DataMesh #DataLake #DataPipelines #ETL

2024-11-25

HIRING: Data Engineer / Remote (anywhere in the U.S.)
💰 USD 100K+

👉 aijobs.net/J761696/

#Airflow #AWS #CICD #ComputerScience #Dagster #Datapipelines #dbt #Docker #ELT #Engineering

Python Job Support (pythonjobsupport)
2024-10-02

Common Pitfalls in Building Data Pipelines: Avoid These Mistakes!

In this video, we'll discuss some common pitfalls that many people encounter when building data pipelines. Avoiding these ...

quadexcel.com/wp/common-pitfal

2024-07-02

In the pipeline: July 2024 edition! 🔶

This month we review the latest releases across the Kedro ecosystem, celebrate the PyData London workshop delivered by two TSC members, and much more.

Will you be at EuroPython in Prague next week? Don't miss Juan Luis' workshop about MLOps! (And follow @europython)

Full blog post:

kedro.org/blog/in-the-pipeline

#kedro #python #pydata #datascience #mlops #kedroviz #datapipelines #europython #EuroPython2024

2024-05-07

In the pipeline: May 2024 edition! 🔶

This month we published several releases across the Kedro ecosystem, got featured in O'Reilly's new book "Software Engineering for Data Scientists", published more videos on our YouTube channel, and got accepted at several Python conferences.

Are you at @pycon? Don't miss Juliana Ferreira's talk about Kedro!

Full blog post:

kedro.org/blog/in-the-pipeline

#kedro #python #pydata #datascience #mlops #kedroviz #datapipelines

2024-04-26

#CaseStudy - Discover how #Yelp reworked its data streaming architecture with #ApacheBeam & #ApacheFlink!

The company replaced a fragmented set of data pipelines for streaming transactional data into its analytical systems, such as Amazon Redshift and its in-house data lake, with a unified and flexible solution built on Apache data streaming projects.

Dive into the details: bit.ly/3WgkTL7

#InfoQ #SoftwareArchitecture #EventDrivenArchitecture #DataPipelines #Streaming

2024-04-02

New Ventana Research Analyst Perspective: Data Pipelines Integrate Data Processing and Enable AI

mattaslett.ventanaresearch.com

The development, testing and deployment of #datapipelines is a fundamental accelerator of data-driven strategies

We have added another entry to the #dataengineering glossary:

Distributed systems let data engineers process data at scale but also introduce a raft of new complexities and considerations.

Linearizability ensures that operations maintain a logical order of execution.

Learn more with a Python example here:
dagster.io/glossary/linearizab

#datapipelines

It's been interesting watching many larger enterprises build DSLs (Domain Specific Languages) on top of their orchestration solution. By implementing DSLs, data teams can open their data platform to many more users without compromising on standards.

We wrote this up in a blog post to share the insights and approaches.

dagster.io/blog/scale-and-stan

#data #dataengineering #datapipelines

2024-02-05

Explore #QuixStreams - an #opensource #Python library that makes it easy for engineers to build real-time ML pipelines without having to learn the intricacies of building a streaming application from scratch.

Learn more about the magic behind it: bit.ly/483oT3U

This #InfoQ talk is relevant for data scientists, ML engineers, and software engineers who are looking to adopt new technologies and practices to build real-time ML pipelines and stay current in their field.

#AI #ML #DataPipelines
