#digiclass

2025-06-18

That's it for my live-tooting of this #digiclass conference: I have to leave now. My take-home from this conf. is that not only computational linguistic methods, but also quantitative/statistical methods based on machine learning (e.g. for paleography and manuscript studies in general) have come so far that they have already changed the landscape dramatically.
#digitalhumanities

[Image: mind-blowing logo]
2025-06-18

#digiclass M. Romanello citing a recent study by his team on this: doi.org/10.18653/v1/2025.latec

2025-06-18

#digiclass M. Romanello: perceived OCR quality varies widely; philologists are very sensitive to even a 2% OCR error rate

2025-06-18

#digiclass M. Romanello: reaching 98% accuracy for OCR now even with mixed (e.g. Engl. / Greek) text
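
(Editorial sketch, not from the talk: the toot does not say how the 98% was measured. Assuming it is character-level accuracy, i.e. 1 minus the character error rate computed with edit distance, it corresponds to roughly one wrong character in fifty. The function names below are hypothetical.)

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hyp` into `ref` (classic dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_accuracy(ref: str, hyp: str) -> float:
    """Character accuracy = 1 - CER, where CER = edit distance / len(ref).
    Assumption: accuracy is measured against a ground-truth transcription."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return 1.0 - levenshtein(ref, hyp) / len(ref)
```

With a 50-character ground-truth line and an OCR output that gets a single character wrong, `char_accuracy` comes out at 0.98, which is the kind of residual error rate the next toot says philologists still notice.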

2025-06-18

#digiclass M. Romanello: workflow/pipeline

[Image: slide on pipeline]
2025-06-18

#digiclass M. Romanello: "Ajax Multi-Commentary" multi.ajmc.ch/ (proof of concept); "The AjMC platform enables the non-linear, comparative reading of commentaries on Sophocles' Ajax, and provides unified access to OCR transcriptions and facsimile images of seven public domain commentaries"

2025-06-18

#digiclass P. Stokes: github repo is htr-united.github.io/

2025-06-18

#digiclass (personal comment: go HTR-United: documenting the stuff is important!)

2025-06-18

#digiclass P. Stokes: HTR-United marketplace.sshopencloud.eu/to "is a catalog of metadata on training datasets (ground truth datasets) available for the creation of transcription or segmentation models"

2025-06-18

#digiclass P. Stokes: there's no such thing as a standard for transcription (transcriptions must follow different conventions for different purposes)

2025-06-18

#digiclass P. Stokes: a direct, universal export from ATR to TEI is not possible; there are XSLTs available to transform from ALTO to TEI; for specific cases with complex line geometry on the page, you have to write your own conversion script (Transkribus exports to TEI with <p> tags)
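
(Editorial sketch, not from the talk: a minimal idea of what a "write your own conversion script" could look like for the simplest case. It assumes ALTO v4 input, maps each TextBlock to a TEI <p> and each TextLine to an <lb/>, and ignores coordinates, reading order and everything else that makes real pages hard. The function name `alto_to_tei_body` and the sample data are hypothetical.)

```python
import xml.etree.ElementTree as ET

ALTO_NS = "http://www.loc.gov/standards/alto/ns-v4#"  # assumption: ALTO v4 files
TEI_NS = "http://www.tei-c.org/ns/1.0"


def alto_to_tei_body(alto_xml: str) -> str:
    """Convert ALTO text blocks into a bare TEI <body>.
    Each TextBlock becomes a <p>; each TextLine becomes <lb/> plus its text."""
    ns = {"a": ALTO_NS}
    root = ET.fromstring(alto_xml)
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
    body = ET.Element(f"{{{TEI_NS}}}body")
    for block in root.iterfind(".//a:TextBlock", ns):
        p = ET.SubElement(body, f"{{{TEI_NS}}}p")
        for line in block.iterfind("a:TextLine", ns):
            lb = ET.SubElement(p, f"{{{TEI_NS}}}lb")
            # ALTO stores each word in the CONTENT attribute of a String element
            words = [s.get("CONTENT", "") for s in line.iterfind("a:String", ns)]
            lb.tail = " ".join(words)
    return ET.tostring(body, encoding="unicode")


# Made-up two-line sample, only to show the expected input shape
SAMPLE_ALTO = f"""<alto xmlns="{ALTO_NS}"><Layout><Page><PrintSpace>
<TextBlock>
  <TextLine><String CONTENT="μῆνιν"/><SP/><String CONTENT="ἄειδε"/></TextLine>
  <TextLine><String CONTENT="θεά"/></TextLine>
</TextBlock>
</PrintSpace></Page></Layout></alto>"""

if __name__ == "__main__":
    print(alto_to_tei_body(SAMPLE_ALTO))
```

Anything beyond this (marginalia, interlinear additions, multi-column layouts) needs project-specific decisions about which TEI elements to use, which is exactly the point of the toot.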

2025-06-18

#digiclass P. Stokes on issues with standards: ALTO XML and PAGE XML have different versions; see Chagué, Clérice, Romary, hal-03398740

2025-06-18

I'll be live-tooting this morning's talks at conf. "Classical Texts in Digital Media II - Digital Methods for Editing and Studying Ancient Texts" (unive.it/data/33113/1/103387). To follow, check the unlisted toots below, in this thread
#digiclass #digitalhumanities

2025-06-17

F. Mambrini: "Annotating Ancient Greek: A 15-Year Journey through Treebanks, Tragedies, and Standards"
('Unlisted' thread below)
#digiclass

2025-06-17

#digiclass C. Palladino on approaches to NER for the ancient world (see thread 'unlisted' below)

2025-06-17

#digiclass C. Palladino on the issues w/ ancient named entities.

[Image: slide]
2025-06-17

Chiara PALLADINO (Furman University): "To Name or Not to Name(d Entity)" // questions about NER in the time of AI: from rule-based methods to neural networks to language models. Starting from the definition of 'name' (and 'named entity': individuals, locations, numbers, dates, events, citations, languages, currency, relations etc.). A 'named entity' can be 'the daughter of Augustus' (so it differs from a 'name')

Conf. detail: unive.it/data/33113/1/103387
#digiclass

2025-06-17

M. Venuti jokingly calls for a "San Servolo Manifesto" (the conf. is in San Servolo, Venice unive.it/data/33113/1/103387): university presses must be involved as long-term partners for #sustainability
#digiclass

2025-06-17

Andreas P. Antonopoulos (FragTrag) and Martina Venuti (MQDQ) say that their projects are not currently funded; discussing funding and long-term sustainability issues

Conf. detail: unive.it/data/33113/1/103387
#digiclass
