Are you going to KubeCon EU? Come find me and get one of these awesome (limited in number) Duckling stickers. #duckling #kubeconeu #docling https://twp.ai/E6HlHZ
Contrary to what I'm reading, #Claude is far worse than I remember. It often doesn't fully understand simple questions and needs nudging - and as in this case, can't solve even trivial problems which only require careful reading of the log.
It took #Kimi 10 seconds to solve a problem with an API call to a #docling server crashing when accessed over its API. I used Kimi in the browser. #ClaudeCode Sonnet 4.6 had full access to the actual server and couldn't solve the issue after 30 minutes of repeated attempts.
````
The root cause is clear: **Docling workers are crashing due to missing EasyOCR dependency**, causing the async jobs to fail. OpenWebUI polls for results, gets a 404 because the job died, and reports "Task result not found."
**Fix the Docling installation:**
```bash
# Stop the service
sudo launchctl unload /Library/LaunchDaemons/com.docling-serve.plist
# Install EasyOCR in the correct uv tool environment
uv tool install --upgrade docling-serve --with easyocr
# Or if that doesn't work, run the specific Python interpreter:
/Users/urisharf/.local/share/uv/tools/docling-serve/bin/python -m pip install easyocr
# Reload
sudo launchctl load /Library/LaunchDaemons/com.docling-serve.plist
```
**Alternative (if you don't need OCR):** Configure Docling to use a different OCR engine or disable it. Check Docling's env vars for `DOCLING_OCR_ENGINE` or similar.
**Verify the fix:**
```bash
curl -X POST http://mbp.lan:5001/v1/convert/source \
-H "Content-Type: application/json" \
-d '{"source": "https://arxiv.org/pdf/2206.01062.pdf"}'
```
The duplicate `libavdevice` dylibs warning is harmless noise from Gradio/CV2 conflicts, but the EasyOCR error is fatal.
````
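The diagnosis quoted above comes down to whether `easyocr` is importable by the interpreter that actually runs docling-serve. A minimal sketch of that check, assuming you run it with that environment's own Python (for example the interpreter path shown in the quoted answer); the helper name `has_module` is made up for illustration and is not part of Docling:

```python
import importlib.util
import sys


def has_module(name: str) -> bool:
    """Return True if this interpreter's import system can locate `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Show which interpreter is being checked: a uv tool environment has its
    # own site-packages, separate from the system Python.
    print(sys.executable)
    print("easyocr importable:", has_module("easyocr"))
```

If this prints `False` under the service's interpreter but `True` under your shell's Python, the package was installed into the wrong environment.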
Build Agent-Ready RAG Systems in Java with Quarkus and Docling https://www.the-main-thread.com/p/enterprise-rag-quarkus-docling-pgvector-tutorial
@karstenpe I have now saved two variants of the notebooks from the Remarkable locally: one as a PDF containing bitmaps and one as a PDF with vectors.
Which CLI tool would you recommend for #OCR? #Tesseract?
While I'm at it, I'll also try #Docling with its OCR option, though I don't think it has its own engine.
Can this also be done with #Ollama, directly from a PDF and with a local LLM? Does anyone have ideas?
Accessible for humans and machines: little helpers for converting Office documents into machine-readable formats
The "currency" in which documented knowledge is transferred within organizations is Office file formats such as pptx, docx, xlsx, and PDF. Unfortunately, these formats are about as unfavorable for machine readability as it gets. Often they are also so large (e.g., because of graphics embedded in PowerPoint presentations) that they quickly hit the limits of AI chatbots such as Copilot, ChatGPT, and Claude […]
Tomorrow (Feb 4) at #CfgMgmtCamp in Ghent, Ming and I will run a workshop on #Docling at 14:00 in B.1.031 - https://cfp.cfgmgmtcamp.org/ghent2026/talk/9CV7CY/
Join us! @cfgmgmtcamp
The slides and recording for my presentation "Get your docs in a row with #Docling" are now available - https://fosdem.org/2026/schedule/event/DVRV8S-get_your_docs_in_a_row_with_docling/
Thanks to @fosdem organizers and volunteers for another amazing event. My 11th in-person #FOSDEM (13 including virtual ones).
The docling-testcontainers module provides a ready-to-use Testcontainers integration for running a Docling Serve instance, wrapping the official container image and exposing a simple Java API.
https://testcontainers.com/modules/docling/
How to bring AI into a Java/Kotlin project
The world of enterprise development in Java/Kotlin and the world of neural networks seem like parallel universes. On one side are static typing, multithreading, and Spring containers; on the other, Python scripts, tensor operations, and experiments in Jupyter Notebook. Between them lies a chasm that many teams hesitate to cross. Yet the need to build this bridge arises more and more often. The customer wants "artificial intelligence" in a new feature, analysts dream of a chatbot with all the bells and whistles, and managers have heard that competitors have already automated everything. So how do you combine the reliability and structure of a JVM project with the flexibility and power of AI? In this article we try to work out which tools currently exist for this and how to use them.
https://habr.com/ru/articles/984544/
#AI #ИИ #Java #Kotlin #LLM #State_Graph #Vector_DB #Docling #Embeddings
Just published a new deep-dive on building enterprise-grade RAG in Java.
In this tutorial, we combine:
• Quarkus
• Docling (layout-aware PDF parsing)
• pgvector + PostgreSQL
• Local LLMs via Ollama
• And a simple guardrail layer
This is the most complete RAG pipeline I’ve built so far, and it’s fully open for you to copy, run, and adapt.
Read here:
https://www.the-main-thread.com/p/enterprise-rag-quarkus-docling-pgvector-tutorial
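The retrieval step that a vector store like pgvector performs in a stack like this can be sketched in a few lines. This is not the tutorial's code (which is Java): it is a toy, in-memory illustration of ranking stored chunks by cosine similarity to a query embedding, with hand-written vectors standing in for real embeddings. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding, chunks, k=2):
    """Return the texts of the k chunks most similar to the query.

    `chunks` is a list of (text, embedding) pairs.
    """
    scored = [(cosine_similarity(query_embedding, emb), text) for text, emb in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real ones come from an embedding model.
    store = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
    print(top_k([1.0, 0.0], store, k=2))
```

In a real deployment the database does this ranking with an index rather than a linear scan; the sketch only shows what "nearest chunks" means.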
In this new article, I describe how to build a Retrieval Augmented Generation system in Java using Spring AI and Docling for advanced, privacy-focused document processing. You'll learn how to design an Ingestion Pipeline powered by Docling for loading, converting, and chunking any type of document for your RAG use cases.
Docling is an open-source, privacy-focused solution for advanced document parsing. Using the brand-new Docling Java SDK and Arconia, I'll show you how to integrate Docling into your Spring Boot applications, and prepare documents for RAG and GenAI.
#Java #SpringBoot #AI #Docling
https://www.thomasvitale.com/ai-document-processing-docling-java-arconia-spring-boot/
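To make the "chunking" stage of such an ingestion pipeline concrete, here is a minimal sketch, assuming the document has already been converted to plain text with blank lines between paragraphs. It is not Docling's chunker or the article's Java code; `chunk_paragraphs` and `max_chars` are hypothetical names, and the strategy (greedily packing whole paragraphs up to a character budget) is just one simple option.

```python
def chunk_paragraphs(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most `max_chars`.

    A single paragraph longer than `max_chars` is kept intact as its own chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = ""
    for para in paragraphs:
        candidate = para if not current else current + "\n\n" + para
        if len(candidate) <= max_chars or not current:
            # Still within budget, or this is the first paragraph of a new chunk.
            current = candidate
        else:
            # Adding this paragraph would overflow: close the chunk, start a new one.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs whole avoids cutting sentences in half, at the cost of uneven chunk sizes; layout-aware tools can do better because they know where headings and tables start.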
Docling #Java is the official Java client and tooling for #Docling — a suite that simplifies document processing and parsing across diverse formats (with advanced PDF understanding) and integrates seamlessly with #GenAI frameworks.
https://docling-project.github.io/docling-java/dev/
We've been using #docling a lot at work lately to parse all kinds of documents in various formats. It's handy for converting them into a common JSON document.
Updating stickers on laptops... let's see how many I can tag
@matrix @ansible @InstructLab @thinkpadmuseum @trustyai @github @fedora
#ramalama #docling #vllm #llmd #pytorch #NERC #redhat #ospo #upstream #ansible #thinkpad #womeninfedora #expo2025 #kubeflow #trustyAI #cushingcenter #operationstickybusiness
Wow, #docling added support for Arabic and can handle complex documents with text that goes right to left!
The docling document pipeline:
Context is important! Vegetative electron microscopy does not exist! 😂
Another reason to use Docling for your documents. RAG is another garbage in, garbage out situation.