TabPFNv2, the supposed few-shot wunderwaffe for tabular data, got pulled apart by Yandex researchers — and what they found was underperformance, inferiority to baselines, and shaky calibration.
Sometimes hype needs a gradient descent check. 📉
A new model, ViaSHAP, powered by Kolmogorov-Arnold Networks, outperforms XGBoost on tabular data.
The paper comes from the KTH research group of Henrik Bostroem (author of the Crepes conformal prediction package).
Make sure your models are calibrated to avoid costly mistakes.
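One rough way to check calibration is a binned expected calibration error (ECE) for a binary classifier. The sketch below is illustrative only: the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are assumptions, not something taken from any post or paper mentioned here.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    n_bins: int = 10,
) -> float:
    """Binned ECE for binary labels (0/1) and predicted P(y = 1).

    Hypothetical helper: equal-width bins over [0, 1]; each bin contributes
    |observed positive rate - mean predicted probability|, weighted by the
    share of samples that fall into it.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one sample")

    # Group (label, probability) pairs by probability bin.
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # A probability of exactly 1.0 goes into the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        bins[idx].append((label, prob))

    n_samples = len(y_true)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue  # empty bins contribute nothing
        mean_prob = sum(prob for _, prob in bucket) / len(bucket)
        positive_rate = sum(label for label, _ in bucket) / len(bucket)
        ece += (len(bucket) / n_samples) * abs(positive_rate - mean_prob)
    return ece
```

A value near 0 suggests predicted probabilities match observed frequencies on that sample; a model that says 0.9 while only half of those cases are positive would score about 0.4.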
🚨 Breaking news: Java developers discover Pandas rip-off with a name that sounds like a cheap gym 💪. #Fahmatrix promises to make tabular data “fahm” easy, but let's be real—it's just another way to make Java developers wish they had chosen Python instead 🐍.
https://github.com/moustafa-nasr/fahmatrix #JavaDevelopers #PandasRipOff #TabularData #PythonVsJava #HackerNews #ngated
⭐️ What you’ll learn ⭐️
✅ Detecting and fixing errors in tables
• Learn to work with #tabulardata
• Don’t get lost when validating your spreadsheet
• Clean up your spreadsheets to gain valuable insights
👉🏾 More: https://buff.ly/9XVJxq4
🧵
If you're still using XGBoost in 2025, you're basically sending faxes in the age of fiber.
Boost smarter. Ditch the fossil.
Design patterns for presenting and manipulating tabular data.
"Datatable Design Patterns"
https://bootcamp.uxdesign.cc/data-table-design-patterns-4e38188a0981
Once you apply SMOTE, the Grim Reaper visits your dataset.
He doesn’t take lives — just precision, recall, and your dignity.
Your model performance? Undead, but not in a good way.
A bold new paper from the winged hussars.
Let's make sure our models are properly calibrated.
Data Scientist: Absolutely. It’s crucial for our success.
When it comes to #tabulardata, #catboost reigns supreme: in a probabilistic forecasting competition, most of the top winning submissions used CatBoost.
In fact, a recent paper once again confirms CatBoost's dominance with tabular data, while XGBoost came in at just … number 10.
“A Comprehensive Benchmark of Machine and Deep Learning Across Diverse Tabular Datasets”
Just because thousands review these papers doesn't mean they know what actually works. Real-life practitioners and experts often have neither the time nor the incentives to serve as unpaid reviewers.