#SparkSQL

2025-03-12

🌟 Just wrapped up the homework for Batch 5 of the Zoomcamp!
I processed and analyzed the yellow_tripdata_2024-10.parquet and taxi_zone_lookup.csv datasets using PySpark and Spark SQL. Feels great to finish a hands-on project! 🏆

2025-03-12

📈 Spark SQL is amazing!
Today I worked on SQL queries within PySpark to analyze and transform large datasets. This is such a powerful tool for data engineering! 🚀

Anita Graser 🇪🇺🇺🇦🇬🇪 underdarkGIS@fosstodon.org
2024-02-04

program to be used, for example, in a #streaming environment.

Other MEOS bindings include #JMEOS for #Java, #MEOS.NET for C#, and one for #SparkSQL.

2023-09-21

I feel like a #sparksql #databricks for the SQL Server professional talk is going to come out of this client engagement. Which will be my first talk in 5? years

Basic premise is much of what you're writing today for SQL Server translates just fine to Databricks.

There are little differences like LIMIT vs TOP, bigger differences (TRUNCATE doesn't reset the identity seed), and "Toto, we're not in Kansas anymore" stuff like GROUP BY ALL (which is hot for lazy typists like me).

BirderScott sfraser
2022-11-30

my cluster
is bigger than
your spark cluster

(hopefully no correlation to my inefficiency)
