#TesseracT

Admiral Patrick ptz@dubvee.org
2025-11-25

Admins, I'm considering changing the default behavior. Do you keep Tesseract locked to your instance intentionally or because you didn't configure it otherwise?

dubvee.org/post/4733362

2025-11-21

[Feature Request] Hide hashtags

startrek.website/post/31954156

Michael Roberts vivtek@indieweb.social
2025-11-14

Hey, Fedi, what's the best way under Linux to OCR a scanned PDF and put the resulting text into the PDF? I haven't found any particularly convincing recipes yet. (I mean, Tesseract for the OCR part, I know that much - but what's the best way to get the text into the PDF for searchability and text selection? Ideally without disturbing any annotations I've already made.)

#pdf #linux #ocr #tesseract #document_processing

2025-11-11

How we sped up work with as-built documentation using AI

Hi, Habr! My name is Vsevolod Zaikovsky, and I am a deputy project manager at «Газпром ЦПС». Some routine tasks eat up a lot of time and labor. In the projects the company has worked on, one such task was cataloguing as-built documentation. If you don't know what that is, you have been very lucky in life: as-built documentation is the set of documents confirming that work on a construction site was actually carried out. Dug a trench? Drew up a certificate. Filled in the trench? Drew up a certificate, and so on. By the end of a medium-sized construction project we have tens of thousands of certificates, reports, and diagrams that must be sorted by hand, scanned well, and manually filed into the right folders. And even then, finding the right document still turns into a monotonous quest lasting several hours.

habr.com/ru/companies/gazpromc

#ИИ #OCR #компьютерное_зрение #Tesseract #YOLO #Python #Автоматизация #документооборот #ML #строительство

Peter Vágner pvagner@fedi.ml
2025-11-09

I have just found a nice document scanning app for android that can do automatic edge detection, cropping, multipage scanning, OCR, PDF export and more.
It's called #makeacopy and it's using #tesseract engine to perform the OCR directly on the device with no internet connectivity requirement at all.
The app has almost full #a11y support for screen reader users in the sense that all the controls are clearly labelled and it's easy to navigate.
I couldn't resist and have asked the developer whether it would be doable to add screen-reader-compatible notifications, making the automatic edge detection somehow accessible as well.
Now I'd appreciate comments from low vision screen reader users, mobility trainers, people assisting other blind people or others who might be able to tell if my idea is viable and how much you like it?
Here is the link to the GitHub issue I have started: github.com/egdels/makeacopy/is…

Thanks for looking into it.

Oliver Ammann oa@swiss.social
2025-11-09

Is there a viable #OCR solution for #ancientGreek text in 16th century prints? Ideas? Experiences?

#rarebooks #greek #tesseract #transkribus #16thcentury

Ancient Greek text in a 16th century print
brick.news bricknews
2025-11-05

LEGO tesseract folds emotion into a small, breathtaking cosmos
Fan designer KoalaBrick reached 10,000 supporters and moved the model into LEGO review. The build uses 4,650 pieces to recreate the film's tesseract, minifigures, Endurance, and Gargantua.

Read more in: brick.news/blog/movie-interste

Movie Interstellar Tesseract Reaches 10K Votes for LEGO® Review
Rock and Blog rockandblog
2025-11-03

🔥 Emperor and Delalma join the lineup of Z! Live Rock Fest 2026.
A historic reunion and the most anticipated comeback in Spanish-language metal, June 11 to 13 in Zamora

rockandblog.net/emperor-y-dela

2025-10-31

Learn how to extract text from screenshots and images with spectacle-ocr utility in Linux. Go from image to text in one step!

Full details here: ostechnix.com/extract-text-fro

#Tesseract #OCR #Spectacle-ocr #Spectacle #KDE #Linux #TextExtraction #Opensource

Disorientation - Behind The Scenes,

As usual, posting something about the making of last week's shot.
At the top and bottom, you see two initial trials of the setting (but there were MANY of those done on various days). It was a trial-and-error process to see how it would look. If you recall from last week's post, the figure was floating in the air above the match. In the initial stages I was thinking of placing the match at the center of the figure, as seen here. You might also notice that the colors of the figure are different from the image posted last week; that, of course, was done in the post-processing phase.
So in the center, this is the final setting. Maybe it's not clear here, but there is an acrylic sheet that the figure was placed upon. You can probably see the little canisters; those were the "legs" to keep the sheet up. I have to say, though, the hardest part of this shot is the figure, which, without exaggeration, I spent months trying to put together. In the beginning I was thinking of gluing matchsticks together, and I failed. Then I thought of using larger sticks (~20cm long) and failed again. Then I thought about magnets! Then… I failed. It wasn't until I discovered tiny magnets (remnants from my brother's set) that I was finally able to make this figure, despite the skew caused by the lengths.
You might also notice how the speedlite here is covered with some tape. This was to focus the light (like a barn door modifier). I do have snoot and honeycomb modifiers for this job, but this was the quick solution. With a snoot, the balance between the distance of the speedlite from the set and the power of the speedlite (and the spread of the light) didn't quite work, so blocking the light with tape made it easier for me to control the power and the spread of light together, on low power.

#behindthescenes #set #setting #project #4dcube #tesseract #hypercube #camera #matchstick #goodmorning
2025-10-30

Trying to make #tesseract (that garbage OCR library, at least regarding Japanese text) recognise furigana in a fairly decent scan (images magnified!). And now, everyone together: youtube.com/watch?v=hspNaoxzNbs #Japanese #OCR

When Tesseract is asked to detect (and, as a second API call, remove) the furigana, we see that lines 2, 3, and 4 (of five total lines) are butchered. Several relevant non-furigana characters are missing, and the word in line 4 has disappeared altogether.
An excerpt from a Japanese textbook. Visible are the words 留学します, 優勝します, 飼います, なくします and 考えます, together with the (typical for textbooks) augmentation with furigana.
JBrickelt963 ✊(φ)🚩 JBrickelt963@framapiaf.org
2025-10-24

Still exploring #Linux #LinuxMint, I'm discovering the real power of #OCR #Text with #gImageReader and #Paperwork (#Openpaper):

openpaper.work/fr/ is very easy to get running on Linux without going through #Docker. Both are also available as .exe (relatively recent on W*; little patience otherwise).

I already knew #Tesseract, but with a graphical interface and OCR #Research inside an image it's extremely powerful. It only works with non-handwritten fonts, but who knows, maybe one day ...

تَـوَهـانْ (disorientation)。

In between the continuous clashes with our surroundings for survival, and the self-doubt that hovers heavily in our hearts because of all the turmoil with questions of whether or not we've been wrong all this time about our choices in this life, there comes moments of such discrepancies that I might as well describe as being moments from another dimension where we look at ourselves in the eyes and ask a simple question but without any answer: Where are we?
In fact, this is probably the question at the surface, but the deeper question could be: "Where do I belong?" to begin with. It is kind of a situation where you see yourself here, but also there, but yet you don't quite belong to here because you don't get along with much of what is going on here, and yet you don't feel belonging to there because, again, you don't agree or get along with what "there" represents.

It is such a heavy load that casts its shadow not on the heart alone, but on the whole body; you find yourself sluggish, hardly moving, hardly desiring, hardly achieving, hardly completing the simplest of tasks, and every passing day becomes a story of survival on its own, worrying and wondering how the day will end, instead of living each day for what it is. Meanwhile, the mind strives to find the answer to that question: Where do I belong?
For some, the question seems not so important. All they see in this life is to survive. Yet, for those with dreams and a sense of purpose, the question is quite crucial in deciding where and on what path to set foot. It's not about success as much as it is about putting an agitated soul to rest and not letting it roam the realm of the living like a ghost in purgatory. Until the answer to that question comes, one has no choice but to undergo the torture of that transcendent dimension which doesn't belong anywhere…

#disorientation #lost #identity #project #match #matchstick #4d #hypercube #tesseract #goodmorning
Terence Eden Edent
2025-10-18

I'm using #Tesseract 5 to do the #OCR locally.

It is *mostly* pretty good - because the default font is nice and clear.

But it does have some problems with spaces. Ideally I want this to be fully automated, I don't want to manually correct stuff.

Any recommendations for a *local* OCR which runs on

Nicely spaced text in an image.
In a text editor, some of the words run together.
2025-10-15

Any Lemmy web clients that show notifications from multiple accounts at once?

lemmy.dbzer0.com/post/55549057
