
For those a little familiar with Cascading (in #java): it was originally designed to run on #ApacheHadoop, and later #ApacheTez, but it also has a local planner.

This lets developers create non-clustered data applications without the Hadoop/Tez dependencies or runtime.

I've been using the local planner in production for over 5 years now.

Parquet does require Hadoop libraries, but that's OK: there is a shim between the libraries that allows Parquet and S3AFileSystem to be used locally.

