#DataTalksClub

2025-05-22

🚕💡 The model is up and running! It predicts ride durations for NY Yellow Taxi trips, and I’m loving the MLOps journey. Now focusing on deploying the model and automating the process.

2025-05-22

📊💻 Just completed the linear regression model to predict ride durations based on data from Jan-Feb 2023. Now on to tuning and integrating the model into a Docker container. Next steps ahead!
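For readers who want to see the idea in code, here is a minimal sketch of fitting a straight line to ride durations. It is not the model from the post: the single feature (trip distance), the toy rows, and the helper names `duration_minutes` and `fit_line` are all assumptions made for illustration, and it uses only the standard library.

```python
from datetime import datetime

def duration_minutes(pickup: str, dropoff: str) -> float:
    """Ride duration in minutes from two 'YYYY-MM-DD HH:MM:SS' timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(dropoff, fmt) - datetime.strptime(pickup, fmt)
    return delta.total_seconds() / 60

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical toy trips: (distance in miles, pickup time, dropoff time).
trips = [
    (1.0, "2023-01-01 10:00:00", "2023-01-01 10:07:00"),
    (2.0, "2023-01-01 11:00:00", "2023-01-01 11:11:00"),
    (3.0, "2023-01-01 12:00:00", "2023-01-01 12:15:00"),
]

distances = [d for d, _, _ in trips]
durations = [duration_minutes(p, q) for _, p, q in trips]
slope, intercept = fit_line(distances, durations)

# Predicted duration (minutes) for a hypothetical 5-mile trip.
predicted = slope * 5.0 + intercept
```

A real pipeline would likely use a library estimator and more features; the point here is only the target construction (dropoff minus pickup) and the least-squares fit.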

2025-05-22

🗽🚖 Starting with the NY Yellow Taxi dataset from Jan-Feb 2023! Preparing to build a regression model to predict ride durations. Time to dive into the data and start exploring!

2025-04-18

📊 Final product: 3 dashboards in Looker Studio with key insights on SF bike usage in 2023–2024.

From messy CSVs to visual stories, loving this data journey 🚴‍♀️

2025-04-18

🔍 Project Goals:
• Avg trip time & distance
• Most common bike type
• Most active user type
• Peak ride hours
• Most popular stations

Happy to say: mission accomplished ✅

2025-04-18

📦 Raw bike trip data and Bay Area counties were loaded into GCS, transformed with dbt, and stored in BigQuery.

Every piece automated with Kestra flows and IaC with Terraform 💪

2025-04-18

⚡ E-bikes are on the rise in SF!

This project revealed fascinating insights about how different users move around the city on bikes.

Infrastructure: Terraform
Orchestration: Kestra
Transformations: dbt
Warehouse: BigQuery

2025-04-18

👀 Curious when people ride shared bikes most often?

I used Bay Wheels data to analyze hourly & weekly usage trends and visualized them with Looker Studio.

GCP + dbt + Kestra = smooth orchestration 💡

2025-04-18

🛠️ From raw CSVs to dashboards! Built a modern data pipeline using GCS, Terraform, Kestra, and dbt to analyze shared bike usage patterns in the SF Bay Area.

Found answers to key questions like trip duration, user types & popular stations.

2025-04-18

🚴‍♂️ Just wrapped up a data engineering project analyzing Bay Wheels bike trips in the San Francisco Bay Area using real data from 2023-2024.

Used Terraform, Kestra, dbt, BigQuery & Looker Studio to build a full batch data pipeline.

2025-03-20

Mission complete! ✅ Just finished the homework: identifying the longest uninterrupted streak of taxi rides in a 5-minute window using and . Feeling proud of the progress so far! 🚖

2025-03-20

Wrapping up Module 6 of and diving into the homework! 📝 The task: find the longest uninterrupted streak of taxi rides in a 5-minute window using and . Challenge accepted! 🚖💨
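To make the task concrete, here is a rough batch sketch in plain Python rather than the streaming tools used in the homework. It assumes that "uninterrupted" means each ride starts no more than 5 minutes after the previous one; that reading, the function name `longest_streak`, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta

def longest_streak(timestamps, max_gap=timedelta(minutes=5)):
    """Return (start, end, count) for the longest run of events in which
    each event follows the previous one by at most max_gap.
    Returns None for an empty input."""
    ordered = sorted(timestamps)
    if not ordered:
        return None
    best = (ordered[0], ordered[0], 1)
    start, count = ordered[0], 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev <= max_gap:
            count += 1               # still inside the current streak
        else:
            start, count = cur, 1    # gap too long: a new streak begins
        if count > best[2]:
            best = (start, cur, count)
    return best

# Hypothetical pickup times for one location.
rides = [
    datetime(2019, 10, 1, 10, 0),
    datetime(2019, 10, 1, 10, 3),
    datetime(2019, 10, 1, 10, 7),
    datetime(2019, 10, 1, 10, 20),  # 13 minutes after the previous ride
    datetime(2019, 10, 1, 10, 22),
]
streak = longest_streak(rides)
```

A stream processor would compute the same grouping incrementally with a session-style window; the batch version above only shows the grouping logic.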

2025-03-20

Halfway through Module 6 of ! 🖥️ Learning how to process real-time data streams with and . I’m now applying what I've learned to the Taxi NY Green dataset. Excited for what comes next! 🚖✨

2025-03-20

Just kicked off Module 6 of the Zoomcamp by @DataTalksClub! 🎉 It's all about with , , and . Can't wait to get hands-on with real-time data streaming! 🖥️🚀

2025-02-28

🎉 Final Step: Successfully Built Analytical Views!
The project is complete! After transforming the data, I’ve created models that serve analytical views for various queries in BigQuery. The combination of dbt and BigQuery makes data engineering a smooth ride. Grateful for all the learning in this module!

2025-02-28

🔧 Optimizing Data with dbt Models
I’ve been creating dbt models for multiple queries across the Green Taxi, Yellow Taxi, and FHV datasets in BigQuery. From source tables to final reports, it's amazing to see how dbt handles dependencies, testing, and version control. Ready to run the first models!

2025-02-28

📊 Exploring dbt for Data Transformation
The journey continues! In this part of the project, I'm learning how dbt models help automate data transformation. I'm building out models in dbt for these taxi datasets to create clean, analysis-ready data in . It’s fascinating to see how everything connects!

2025-02-28

🚀 Started Module 4 of Zoomcamp!
Just kicked off the Analytics Engineering module and I'm diving into transforming the Green Taxi, Yellow Taxi, and FHV NY Taxi datasets loaded in . Excited to see how dbt can help create analytical views for better decision-making!

2025-02-10

This week at the Data Engineering Zoomcamp 2025 by
, we're diving into Data Warehouse and BigQuery.

Special mention to Michael Shoemaker for the insightful lessons, and to
Alexey Grigorev for organizing the sessions.

Let's continue this learning journey together

2025-02-10

🔚 Final Results & Lessons Learned
🏆 4th (Public LB) – RMSE: 12.2324
🏅 5th (Private LB) – RMSE: 9.5624
Key takeaways:
✔ Feature engineering & selection are crucial
✔ Encoding strategies impact model performance
✔ Hyperparameter tuning makes a real difference! 🚀
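For context on the leaderboard metric, RMSE is the square root of the mean squared difference between predictions and true values, so lower is better. A small sketch of that formula follows; the function name `rmse` and the sample numbers are illustrative and unrelated to the competition data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical values: every prediction is off by exactly 3.
score = rmse([10.0, 20.0, 30.0], [13.0, 17.0, 33.0])
```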
