#csv

2025-10-13

[Translation] Automatic receipt parsing with LlamaIndex and Pydantic

The Python for Devs team has prepared a translation of an article on how LlamaIndex and Pydantic can turn receipt scans into structured data. Minimal code, and you have a CSV ready for analysis.

habr.com/ru/articles/953414/

#Python #LlamaIndex #Pydantic #OCR #Kaggle #Receipt #Extraction #CSV #Async #Pipeline

2025-10-12

SheetAtlas: a new cross-platform desktop tool that makes it easy to search and compare data across multiple Excel/CSV files. Developed in just 1.5 months with the help of AI (Claude Code). This is an open-source (MIT) project looking for testers and feedback to improve performance. Try it out and contribute!

#SheetAtlas #Excel #CSV #CrossPlatform #OpenSource #AI #Productivity #PhầnMềm #MãNguồnMở #NăngSuất #CôngCụ

reddit.com/r/SideProject/comme

MathDaTech :fedora: 🤘mathdatech1@hostux.social
2025-10-12

Tablecruncher is a blazing-fast CSV editor built to handle massive files with ease, available for macOS, Windows and Linux. Need to open a 2 GB file with 16 million rows? On a Mac Mini M2, Tablecruncher does it in just 32 seconds.

Originally released in 2017 as a commercial app, Tablecruncher is now fully open source under the GPL v3 license (or later).
tablecruncher.com/

:github: github.com/Tablecruncher/table

#csv #csvediting

2025-10-11

whoa there's a new #homebrew package called:

`config-file-validator`

and it can scan a variety of config files for errors:

#yaml, #editorConfig #plist (apple's flavor included), #CSV, #ini, #json, #xml #toml #env #hcl AHHHHHHHHH

I'm making shell functions for this thing already. (#zsh thanks for asking)

boeing.github.io/config-file-v

צבי הנינג'ה🟣zvinj@tooot.im
2025-10-09

For an art project I needed name/affiliation/date-of-death of the [currently 270] slain #gaza journalists in machine readable form.

The most extensive source I could find was #wikipedia, but as a table with rowspan nuisances.

So I wrote a script that scrapes that page:
nimrodkerrett.opalstacked.com/

There's also the #CSV itself if all you need is the data:
nimrodkerrett.opalstacked.com/

Hope it's useful to others as well.

Cc @aral @joynewacc
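
One way to handle the rowspan nuisance mentioned above, as a minimal sketch and not the author's actual script: copy each spanning cell into every row and column it covers, then write the rectangular grid as CSV. It uses only Python's standard library; `TableGrid`, `table_to_csv`, and the placeholder rows in `SAMPLE` are illustrative names and data, not taken from the linked page.

```python
import csv
import io
from html.parser import HTMLParser


class TableGrid(HTMLParser):
    """Collect every <tr> in the input into a rectangular grid, copying
    each rowspan/colspan cell into all positions it covers."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of strings
        self.pending = {}   # column index -> [rows still to fill, text]
        self.row = None     # row under construction
        self.cell = None    # [text parts, rowspan, colspan] of the open cell

    def _fill_pending(self):
        # Columns still covered by a rowspan from an earlier row come first.
        while len(self.row) in self.pending:
            col = len(self.row)
            left, text = self.pending[col]
            self.row.append(text)
            if left > 1:
                self.pending[col] = [left - 1, text]
            else:
                del self.pending[col]

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag in ("td", "th") and self.row is not None:
            a = dict(attrs)
            self.cell = [[], int(a.get("rowspan", 1)), int(a.get("colspan", 1))]

    def handle_data(self, data):
        if self.cell is not None:
            self.cell[0].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell is not None:
            parts, rowspan, colspan = self.cell
            text = " ".join("".join(parts).split())  # collapse whitespace
            for _ in range(colspan):
                self._fill_pending()
                if rowspan > 1:
                    # Remember this cell for the rows below it.
                    self.pending[len(self.row)] = [rowspan - 1, text]
                self.row.append(text)
            self.cell = None
        elif tag == "tr" and self.row is not None:
            self._fill_pending()
            self.rows.append(self.row)
            self.row = None


def table_to_csv(html: str) -> str:
    parser = TableGrid()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Placeholder data, only to show the rowspan being expanded.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Outlet</th><th>Date</th></tr>
  <tr><td>A. Example</td><td rowspan="2">Example News</td><td>2024-01-01</td></tr>
  <tr><td>B. Example</td><td>2024-01-02</td></tr>
</table>
"""

print(table_to_csv(SAMPLE))
```

Real pages need more care (nested tables, footnote markup, missing end tags), so treat this as a starting point only.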

shevabamshevabam
2025-10-08

🔔 New article! 🔔
🆕 Quickly generate test data in CSV, JSON, SQL and XML

blog.shevarezo.fr/post/2025/10

@cyrilbois

Inautiloinautilo
2025-10-07


Best table format for AI · Is it Markdown, JSON, XML, CSV, or YAML? ilo.im/167fbh

_____

What is the next project?laravista@mastodon.uno
2025-10-06

🔍 / #software / #CSV / #parser/ #performance

I used #ChatGPT to help with technical definitions and finding book references. If anything seems wrong or you spot any AI hallucinations, let me know and I'll fix it. The actual GitHub project and parser code came straight from my discoveries and experiments, no AI involved there.

🐱🔗 laravista.altervista.org/CatLi

#catlink #softwareCSV #softwareCSVparser #softwareCSVparserperformance

2025-10-06

#CSV, but a set of header comments gives #CSS formatting so that small tables can be copy/pasted into Mastodon etc. and render richly without having to write Satan's own table-format, WikiText

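I'm surveying the #Fediverse.
(September 2025)

Using
/api/v1/instance/peers
I map the connections between domains;
for each live domain I query
/.well-known/nodeinfo
to find out how to fetch its nodeinfo, and from the nodeinfo I collect the
domain name, software name, version, account count (number of users), MAU (monthly active users), HYAC (half-year active users), and post count.

Results:
peers: 124,978 (previous month: 119,544)
reachable: 66,155 (previous month: 62,773)
nodeinfo available: 32,408 (previous month: 30,777)

(Plus additional nodeinfo for domains that were present in the previous scan but dropped out of this one.)

The data is at
github.com/ottoto2017/fediverse

uploaded as target20250921.csv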

#prattohome #FediverseScan #Github #csv #データ
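
The scan described in the post can be sketched roughly as follows. This is an assumption-laden illustration, not the code in the linked repository: it scans one seed server's peers a single time, picks the last link advertised in `/.well-known/nodeinfo`, and reads the field names defined by the NodeInfo 2.x schema (`software.name`, `usage.users.activeMonth`, and so on). The function names and the `seed.example` / `a.example` hosts are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_json(url: str, timeout: float = 10.0):
    """Fetch a URL and decode its JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def nodeinfo_url(well_known: dict):
    """Return the last advertised NodeInfo document URL, or None."""
    hrefs = [link["href"] for link in well_known.get("links", []) if "href" in link]
    return hrefs[-1] if hrefs else None


def summarize(domain: str, info: dict) -> dict:
    """Pull the fields named in the post out of a NodeInfo document."""
    software = info.get("software", {})
    usage = info.get("usage", {})
    users = usage.get("users", {})
    return {
        "domain": domain,
        "software": software.get("name"),
        "version": software.get("version"),
        "users": users.get("total"),
        "mau": users.get("activeMonth"),
        "hyac": users.get("activeHalfyear"),
        "posts": usage.get("localPosts"),
    }


def scan(seed: str, fetch=fetch_json) -> list:
    """Scan the seed's peers once; unreachable domains are skipped."""
    results = []
    for domain in fetch(f"https://{seed}/api/v1/instance/peers"):
        try:
            well_known = fetch(f"https://{domain}/.well-known/nodeinfo")
            url = nodeinfo_url(well_known)
            if url:
                results.append(summarize(domain, fetch(url)))
        except (OSError, ValueError):
            continue  # dead host, timeout, or invalid JSON
    return results


if __name__ == "__main__":
    # Offline demo: canned responses stand in for the network.
    canned = {
        "https://seed.example/api/v1/instance/peers": ["a.example"],
        "https://a.example/.well-known/nodeinfo": {
            "links": [{"rel": "http://nodeinfo.diaspora.software/ns/schema/2.0",
                       "href": "https://a.example/nodeinfo/2.0"}]
        },
        "https://a.example/nodeinfo/2.0": {
            "software": {"name": "mastodon", "version": "4.3.0"},
            "usage": {"users": {"total": 10, "activeMonth": 4, "activeHalfyear": 7},
                      "localPosts": 123},
        },
    }
    print(scan("seed.example", fetch=canned.__getitem__))
```

Passing the fetcher as a parameter keeps the parsing logic testable without touching the network; the resulting dictionaries could be written out with `csv.DictWriter`.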

Dr. Juande Santander-Velajuandesant@mathstodon.xyz
2025-10-01

If you’re interested in baseball in particular, or in any other sport with available data sources where you need to follow a variable's central 50% (the interquartile range) over time, this post covers data reading, transformation, and plotting.

#DataManipulation #DataVisualization #CSV #Python #pandas #matplotlib fosstodon.org/@drdrang/1152962
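
As a rough illustration of following the central 50% of a variable over time, here is a standard-library sketch. It is not the linked post's pandas/matplotlib code; the `season` and `value` column names and the numbers in `RAW` are made up for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import quantiles

# Made-up data: one measured value per game, grouped by season.
RAW = """season,value
2023,10
2023,20
2023,30
2023,40
2023,50
2024,12
2024,24
2024,36
2024,48
2024,60
"""


def central_band(csv_text: str, key: str = "season", col: str = "value") -> dict:
    """Return {group: (q1, median, q3)} for each group in the CSV text."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[key]].append(float(row[col]))
    band = {}
    for name, values in sorted(groups.items()):
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        band[name] = (q1, median, q3)
    return band


for season, (q1, median, q3) in central_band(RAW).items():
    print(season, q1, median, q3)
```

The band between the first and third quartile is what a plot would shade around the median line.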

GripNewsGripNews
2025-09-28

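🌗 How I accidentally created the fastest CSV parser ever
➤ From the theory of branchless programming to astonishing CSV parsing performance with SIMD
sanixdk.xyz/blogs/how-i-accide
The author describes how an experimental article on "branchless programming" led him to take on CSV parsing, a problem where branching is especially common. He explains why traditional parsers are slow, citing CPU branch mispredictions and memory access bottlenecks, and shows how techniques such as SIMD and AVX-512 greatly improve performance by processing multiple pieces of data at once. In the end, he used these techniques to build a remarkably fast CSV parser and packaged it as a Node.js package.
+ This article gives an accessible explanation of how CPU architecture affects program performance; the power of SIMD is especially impressive!
+ Great! I always felt CSV parsing was slow; now I finally know why, and I can see a solution.
Architecture Language Parsing .js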

2025-09-24

Perhaps there's no point in writing articles like this anymore. For one thing, just about everyone who works with a computer knows enough English to read the guide on the official website; for another, if you need a quick solution, AI will spit it out for you.

💡 On the other hand, if you don't want to just mindlessly copy something, here's some inspiration: the jq tool, which lets you easily view JSON data on the command line and, for example, convert it to CSV and Excel.

maxiorel.cz/znate-jq-skvele-ud

#json #terminal #prikazovyradek #csv

Bruno Amaralbrunoamaral
2025-09-24

XML To CSV Converter
Convert XML to CSV and XML to Excel Spreadsheet
convertcsv.com/xml-to-csv.htm

2025-09-24

[Translation] The fastest way to load 32,000 rows into PostgreSQL with Python

The Python for Devs team has prepared a translation of an article on finding the fastest way to load data into PostgreSQL with Python. The author compares different methods step by step, from row-by-row inserts to COPY with streamed CSV generation, and shows how to speed up the process by more than 250x with zero memory consumption.

habr.com/ru/articles/948854/

#postgresql #python #psycopg #загрузка_данных #импорт_данных #csv #copy #execute_batch #execute_values #оптимизация_производительности
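
The "COPY with streamed CSV generation" idea can be sketched without a database: produce the CSV text lazily in small chunks so the whole dataset is never held in memory at once. This is an illustration under assumptions, not the article's code; `csv_chunks`, `fake_rows`, and the table and column names in the comment are hypothetical.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence


def csv_chunks(rows: Iterable[Sequence], batch: int = 1000) -> Iterator[str]:
    """Yield CSV text in chunks of up to `batch` rows, so the full
    dataset never has to exist as one big string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
        if count == batch:
            yield buf.getvalue()
            buf.seek(0)
            buf.truncate(0)  # reuse the same small buffer
            count = 0
    if count:
        yield buf.getvalue()


def fake_rows(n: int) -> Iterator[tuple]:
    """Hypothetical row source; stands in for whatever produces the data."""
    for i in range(n):
        yield (i, f"name {i}", i % 2 == 0)


# With psycopg 3 the chunks could be fed to COPY roughly like this
# (not run here; table and column names are placeholders, and the
# article itself may use a different driver API):
#
#   with cur.copy("COPY items (id, name, active) FROM STDIN (FORMAT csv)") as copy:
#       for chunk in csv_chunks(fake_rows(32_000)):
#           copy.write(chunk)

for chunk in csv_chunks(fake_rows(5), batch=2):
    print(repr(chunk))
```

Memory use is bounded by the batch size rather than by the number of rows, which is the point of the streaming approach.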

2025-09-23

For the love of whatever is still holy in this world, please never embed tables in the columns of a #CSV file. Not only does it make it harder to read, but some CSV readers have a maximum field length that will bite you in the ass after it's too late to switch. And then I will be stuck trying to decide if I should break backwards compatibility with your stupid format or if I need to write my own parser. Just use json, you can't read nested CSV in excel anyway

#python cause y'all need to hear it
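
To make the field-length complaint concrete, here is a small sketch using Python's standard `csv` module, whose reader enforces a per-field size limit (131072 characters by default in CPython). The limit is lowered artificially so the failure shows up with a tiny nested table; the data is invented for the example.

```python
import csv
import io

# A CSV row whose second column holds a whole nested table as one quoted field.
nested = "\n".join(f"{i},{i * i}" for i in range(50))  # inner table text
outer = io.StringIO()
csv.writer(outer).writerow(["row-1", nested])
data = outer.getvalue()

old_limit = csv.field_size_limit(100)  # artificially low to trigger the error
try:
    try:
        list(csv.reader(io.StringIO(data)))
        failed = False
    except csv.Error as exc:
        failed = True
        print("reader gave up:", exc)
finally:
    csv.field_size_limit(old_limit)  # restore the previous limit

# Under the restored limit the same data parses, but the nested table
# still needs a second parsing pass of its own.
outer_row = next(csv.reader(io.StringIO(data)))
inner_rows = list(csv.reader(io.StringIO(outer_row[1])))
print(len(inner_rows), "inner rows")
```

Raising the limit with `csv.field_size_limit()` works around the error, but every consumer of the file then has to know to do the same.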

Rod2ik 🇪🇺 🇨🇵 🇪🇸 🇺🇦 🇨🇦 🇩🇰 🇬🇱 ☮🕊️rod2ik.bsky.social@bsky.brid.gy
2025-09-22

#Jimmy - Export all your notes to #Markdown. Dozens of supported applications as well as the main formats: #DOCX, #ODT, #HTML, #EPUB, #CSV and even #Jupyter #Notebooks korben.info/jimmy-conver...


Rod2ik 🇪🇺 🇨🇵 🇪🇸 🇺🇦 🇨🇦 🇩🇰 🇬🇱☮🕊️rod2ik
2025-09-22

#Jimmy - Export all your notes to #Markdown

Dozens of supported applications as well as the main formats: #DOCX, #ODT, #HTML, #EPUB, #CSV and even #Jupyter #Notebooks

korben.info/jimmy-convertisseu
