#SmallPond

mariolistens 🎧 @mariolistens@neko.cat
2025-06-20

I am now listening to Be Here by Small Pond #SmallPond
last.fm/music/Small+Pond/_/Be+

Jesus Castagnetto 🇵🇪 @jmcastagnetto
2025-03-05

About smallpond, a lightweight processing framework from DeepSeek that uses DuckDB and 3FS

github.com/deepseek-ai/smallpo

卡拉今天看了什麼 @ai_workspace@social.mikala.one
2025-03-05

DuckDB goes distributed? DeepSeek’s smallpond takes on Big Data

Link


📌 Summary:
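DeepSeek has released smallpond, a distributed computing framework built on DuckDB that can handle more than 100 TB of data. The system uses Ray Core to implement distributed processing at the Python level and optimizes performance through a DAG execution model and lazy evaluation. Compared with other frameworks, smallpond focuses on simplicity: it assigns work in units of whole partitions rather than breaking operations down at the query level, offering one way to scale DuckDB from a single node to a distributed environment.

🎯 Key Points:
- DeepSeek's smallpond can process 110.5 TiB of data in 30 minutes, an average throughput of 3.66 TiB/min
- Its architecture builds a logical plan lazily and runs computation only when it is needed
- It uses Ray Core for distributed processing and supports hash, even, and random partitioning strategies
- Computation is distributed at the Python level, and each partition independently uses its own DuckDB instance
- The benchmark was run on DeepSeek's own 3FS file system, which is optimized for AI training workloads
- Compared with frameworks such as Spark/Daft, smallpond distributes whole partitions rather than individual operations, which keeps the architecture simpler
- DuckDB can scale in several ways: scaling up (one machine with more resources), scaling out (adding nodes), or processing through serverless functions such as AWS Lambda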

🔖 Keywords:
#DuckDB #分散式運算 #smallpond #DeepSeek #Ray
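
The partition-level model summarized above can be illustrated with a small, hedged sketch. It is not smallpond's API: `LazyPlan`, `hash_partition`, and `run_on_partition` are hypothetical names, the standard-library `sqlite3` module stands in for DuckDB so the example runs without extra packages, and partitions are processed in a plain loop where a real system would hand them to Ray workers. It only shows the three ideas from the post: steps are recorded lazily, rows are hash-partitioned by key, and each partition is queried in its own database instance.

```python
import sqlite3
import zlib


def hash_partition(rows, key_index, num_partitions):
    """Route each row to a partition using a stable hash of its key column."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        key = str(row[key_index]).encode("utf-8")
        partitions[zlib.crc32(key) % num_partitions].append(row)
    return partitions


def run_on_partition(rows, query):
    """Run the query on one partition inside its own throwaway database."""
    # Assumption: sqlite3 stands in for a per-partition DuckDB instance.
    conn = sqlite3.connect(":memory:")
    try:
        # The sketch assumes a fixed two-column schema (ticker, price).
        conn.execute("CREATE TABLE t (ticker TEXT, price REAL)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


class LazyPlan:
    """Hypothetical logical plan: records steps, runs nothing until compute()."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.steps = []
        self.executed = False

    def repartition(self, num_partitions, key_index):
        self.steps.append(("repartition", num_partitions, key_index))
        return self

    def partial_sql(self, query):
        self.steps.append(("sql", query))
        return self

    def compute(self):
        """Execute the recorded steps in order and return sorted result rows."""
        partitions = [self.rows]
        for step in self.steps:
            if step[0] == "repartition":
                _, num_partitions, key_index = step
                all_rows = [row for part in partitions for row in part]
                partitions = hash_partition(all_rows, key_index, num_partitions)
            else:
                # Each partition is processed independently of the others.
                partitions = [run_on_partition(part, step[1]) for part in partitions]
        self.executed = True
        return sorted(row for part in partitions for row in part)


rows = [("AAPL", 10.0), ("AAPL", 12.0), ("MSFT", 5.0), ("MSFT", 7.0), ("GOOG", 3.0)]
plan = LazyPlan(rows).repartition(3, key_index=0).partial_sql(
    "SELECT ticker, min(price), max(price) FROM t GROUP BY ticker"
)
print(plan.executed)  # the plan is only a description at this point
print(plan.compute())
```

Because all rows for one ticker hash to the same partition, the per-partition `GROUP BY` already yields the correct global answer, which is why distributing whole partitions can stay simple for this kind of query.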

Hacker News @h4ckernews
2025-03-02

Smallpond – A lightweight data processing framework built on DuckDB and 3FS — github.com/deepseek-ai/smallpo

N-gated Hacker News @ngate
2025-03-02

Wow, "DeepSeek Drops Distributed DuckDB": a saga of tech buzzword salad so dense, it could stop a bullet. 🦆💨 Apparently, if you squint hard enough, DeepSeek and DuckDB are here to revolutionize databases...for the three people who can figure out what they're for. 🚀💀
definite.app/blog/smallpond

12354 @12354
2022-04-06

Hey all! We are back - not only with new shows, but also with a new name and concept. We are super excited!
Starting with European Tour of Varsity Star and Inwards in May presented by Small Pond and us.
Both our website 12354.art and social media will keep you posted.
We will be posting weekly reports and announcements. So, stay tuned!
Bands who want to get in touch: Email us at booking@12354.art
Cheerio

#12354
