#TextAnalysis

Nick Byrd, Ph.D. (ByrdNick@nerdculture.de)
2025-04-16

Wow! #QualiService could be a great resource!

It wasn't obvious to me how to find the transcripts for these doctor-patient interaction data from 4 countries, but if such transcripts are accessible, that's GREAT!

qualiservice.org/en/qsearch.ht

#medicine #openData #cogSci #TextAnalysis

[Image: A Qualiservice search for "diagnosis" returned a list of qualitative datasets from video recordings of doctor-patient interactions in four countries (China, Germany, Netherlands, and Turkey).]

2025-03-26

🇪🇺 Want to analyze text from the EU public consultations? EU public consultations are a way in which the EU invites the broader public to publicly comment on upcoming legislation.

📦 I just published a first version of a Python package {eu-consultations} to scrape and extract text from the EU website:
github.com/marioangst/eu_consu

- download consultation data as displayed on the EU's frontend into a validated form
- download associated files (this is the hard part about analysing this data - lots of feedback is in .docx and .pdf files)
- extract text from the files using docling and attach to feedback

You get all data in validated form and possibly stored in huge (sorry for that) JSON files ;).
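The three steps above can be sketched in a few lines. This is not the package's API: the `Attachment` and `Feedback` classes, the `attach_text` helper, and the `fake_extract` stand-in are all hypothetical names made up for illustration. In practice the `extract` callable would wrap a real converter such as docling.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Callable, List, Optional


@dataclass
class Attachment:
    # Hypothetical record for one downloaded file (.pdf, .docx, ...).
    filename: str
    text: Optional[str] = None  # filled in after text extraction


@dataclass
class Feedback:
    # Hypothetical record for one piece of consultation feedback.
    feedback_id: int
    body: str
    attachments: List[Attachment] = field(default_factory=list)

    def __post_init__(self):
        # Minimal validation: reject records that are obviously malformed.
        if not isinstance(self.feedback_id, int):
            raise TypeError("feedback_id must be an int")
        if not isinstance(self.body, str):
            raise TypeError("body must be a string")


def attach_text(feedback: Feedback, extract: Callable[[str], str]) -> Feedback:
    """Run the extractor on every attachment and store the result on it."""
    for attachment in feedback.attachments:
        attachment.text = extract(attachment.filename)
    return feedback


def fake_extract(filename: str) -> str:
    # Stand-in for a real document converter; returns placeholder text.
    return f"extracted text of {filename}"


record = Feedback(1, "See attached position paper.", [Attachment("position.pdf")])
attach_text(record, fake_extract)

# Nested dataclasses serialize cleanly to JSON via asdict().
serialized = json.dumps(asdict(record), ensure_ascii=False)
print(serialized)
```

Passing the extractor in as a function keeps the validation and storage logic independent of whichever conversion library is used.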

This package is part of an analysis project on feedback the EU has received via the public consultation process on digital policy, which we plan to present later this year, but I thought we should open-source some of the tools we use much earlier.

#python #textanalysis #policyanalysis #CompSocSci

Ansibytecode
2025-03-26

Unlocking Insights: Text Analytics in NLP with Azure - Ansi ByteCode LLP

Discover Text Analytics in NLP with Azure. Learn tokenization, sentiment analysis, and entity recognition to analyze text efficiently. Please visit: ansibytecode.com/text-analytic

George Macgregor (g3om4c@code4lib.social)
2025-03-07

Useful contribution to discussions in this area, for sure! The results highlight "whether an automated approach that would still require micromanaging and adjusting several variables by the human researcher would, in fact, be more efficient an approach compared to the same tasks performed manually by human labour"

Out of Context! Managing the Limitations of Context Windows in #ChatGPT-4o Text Analyses doi.org/10.46298/jdmdh.15090 #DigitalHumanities #TextAnalysis #LLM #ArtificialIntelligence #GLAMR

Brenna Edwards (brenna@digipres.club)
2024-11-15

Pro: text scraping PDFs worked pretty well!

Con: I still have to go in and fix so much before I can start trying to consolidate things.

Anyone have any tips?

Is converting the TXT files to CSV and playing with it in Open Refine my best option? 🤔

Trying to get a list of writers for a TV series by season if that's helpful context!

#OpenRefine #digitalArchives #pdfScraping #textAnalysis

(X-post with 🦋)

khushnuma
2024-10-10

Mastering these core NLP techniques is crucial for any data scientist dealing with text data. From tokenization to language modeling, each method serves a unique purpose in processing, analyzing, and extracting valuable insights from textual information.

read more: blogulr.com/khushnuma7861/topn

Nick Byrd, Ph.D. (ByrdNick@nerdculture.de)
2024-09-25

Like we found in “Your Health vs. My Liberty” (doi.org/10.1016/j.cognition.20), Yael Rozenblum et al. found that compliance with #publicHealth guidance correlated with indicators of the perceived threat of a viral pandemic.

Also, relying on #misinformation correlated with reliance on simple (vs. complex) #reasoning.

The free paper: doi.org/10.1002/tea.21975

#medicine #health #education #psychology #epistemology #logic #textAnalysis

[Images: (1) Measures of perceived threat (“motivation”) and compliance (“stance”). (2) How perceived threat (“motivation”) predicted compliance (“stance”). (3) Categorization of simple and complex reasoning (with some examples). (4) How reliance on misinformation correlated with complexity of reasoning and education.]

Daniela Schneider (SchnDa@fedihum.org)
2024-09-09

Have you ever wanted to use a #LLM as one step in a workflow?

We integrated #GPT into the open-source analysis platform #useGalaxy, where you can link GPT to several thousand other tools, add more attachments for analysis and make your research reproducible.

galaxyproject.org/news/2024-09

In our example, we uploaded an audio file, used #Whisper to convert it into text, cut out the moderation, and prompted ChatGPT to translate it into German.

#DH #textanalysis #tools
@galaxyfreiburg

2024-08-21

📚🇮🇹 New working paper: "Evaluating Embedding Models for Clustering Italian Political News"

This study compares embedding models for unsupervised clustering of Italian political news shared on Facebook before the 2018 and 2022 elections, aiming to advance NLP methods for political text analysis in non-English languages.

Paper: osf.io/preprints/osf/2j9ed

Code & data: github.com/fabiogiglietto/Sema

Feedback welcome!

#NLP #PoliticalScience #TextAnalysis #MachineLearning

Jason Robison (jrrobison1)
2024-08-18

Pycpidr 0.3.0 introduces:
- Dependency-based Idea Density (DEPID)
- DEPID-R
- Custom sentence and token filters for DEPID

github.com/jrrobison1/pycpidr

Paul Houle (UP8)
2024-08-17

🎯 Potential terrorists can be identified from social media posts, new research shows

phys.org/news/2024-08-potentia

Jason Robison (jrrobison1)
2024-08-15

Just launched: pycpidr 🎉
github.com/jrrobison1/pycpidr

Python library to determine the propositional idea density of an English text automatically.

Idea density is a measure of the amount of information conveyed relative to the number of words used. This metric has applications in various fields, including linguistics, cognitive science, and healthcare research.
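As a rough illustration of the metric (not pycpidr's API or its full rule set), idea density can be approximated as the share of words that carry a proposition. The sketch below assumes the text has already been tagged with Universal POS tags and counts verbs, adjectives, adverbs, adpositions, and conjunctions as propositions. The `idea_density` function, the tag set, and the hand-tagged sentence are assumptions made for this example.

```python
# Tags treated as proposition-bearing in this simplified sketch (an assumption).
PROPOSITION_TAGS = {"VERB", "ADJ", "ADV", "ADP", "CCONJ", "SCONJ"}


def idea_density(tagged_tokens):
    """Return propositions per word for a list of (word, tag) pairs."""
    # Punctuation is not counted as a word.
    words = [(word, tag) for word, tag in tagged_tokens if tag != "PUNCT"]
    if not words:
        return 0.0
    propositions = sum(1 for _, tag in words if tag in PROPOSITION_TAGS)
    return propositions / len(words)


# Hand-tagged example sentence; a real pipeline would use a POS tagger.
sentence = [
    ("The", "DET"), ("old", "ADJ"), ("dog", "NOUN"), ("slept", "VERB"),
    ("quietly", "ADV"), ("on", "ADP"), ("the", "DET"), ("porch", "NOUN"),
    (".", "PUNCT"),
]

density = idea_density(sentence)
print(density)  # 4 propositions / 8 words = 0.5
```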

Elias Dabbas (elias@seocommunity.social)
2024-08-15

Word co-occurrence matrix/heatmap

How to compute and visualize the correlation between terms that occur together in a list of documents*

*documents: keywords, page titles, product names/descriptions, social media posts, etc.
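A minimal sketch of the counting step, using only the standard library. It is not the code from the linked post: it counts raw co-occurrences (how many documents contain both terms) rather than a correlation, and the function names and sample documents are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how many documents contain each unordered pair of terms."""
    pair_counts = Counter()
    for doc in documents:
        # Lowercase, split on whitespace, and deduplicate so that
        # each pair is counted at most once per document.
        terms = sorted(set(doc.lower().split()))
        pair_counts.update(combinations(terms, 2))
    return pair_counts


def to_matrix(pair_counts):
    """Expand pair counts into term labels and a symmetric matrix."""
    terms = sorted({term for pair in pair_counts for term in pair})
    index = {term: i for i, term in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for (a, b), count in pair_counts.items():
        matrix[index[a]][index[b]] = count
        matrix[index[b]][index[a]] = count
    return terms, matrix


docs = ["red running shoes", "blue running shoes", "red leather bag"]
counts = cooccurrence_counts(docs)
terms, matrix = to_matrix(counts)

print(counts[("running", "shoes")])  # 2: both terms appear in two documents
```

The resulting matrix can be passed to any heatmap function of a plotting library.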

bit.ly/3Z4tiTx

#DataVisualization #textanalysis #DataScience #Python

Steven P. Sanderson II, MPH (stevensanderson@mstdn.social)
2024-07-26

Hi everyone! I recently tackled a common data task using R: counting the occurrences of a specific phrase in a text file. It's a great way to practice text analysis and get familiar with R's powerful tools.

See the attached.
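The attachment is not reproduced here. As an illustration of the same task in Python rather than R, a phrase count can be sketched as follows. The function names and the sample text are hypothetical, and matches are counted without overlap.

```python
import re
from pathlib import Path


def count_phrase(text, phrase, ignore_case=True):
    """Count non-overlapping occurrences of a literal phrase in a string."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes sure the phrase is matched literally, not as a pattern.
    return len(re.findall(re.escape(phrase), text, flags))


def count_phrase_in_file(path, phrase, ignore_case=True):
    """Read a UTF-8 text file and count occurrences of the phrase in it."""
    text = Path(path).read_text(encoding="utf-8")
    return count_phrase(text, phrase, ignore_case)


sample = "Text analysis is fun. I practice text analysis daily."
print(count_phrase(sample, "text analysis"))                     # 2
print(count_phrase(sample, "text analysis", ignore_case=False))  # 1
```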

Happy coding! 💻✨

#RStats #DataScience #TextAnalysis #tidyverse #Coding #LearningTogether #R #RProgramming #Programming

Harald Klinke (HxxxKxxx@det.social)
2024-07-17

The Digital Humanities Team at the University of Vienna and the Ottoman Nature in Travelogues (ONiT) project are hosting a #hackathon focused on analyzing texts, images, and multimodal sources.

Thursday, November 14, 9:00 CET to Friday, November 15, 15:00 CET
dh.univie.ac.at/hackathon/
#DigitalHumanities #ComputationalHumanities #TextAnalysis #ImageAnalysis

Marshall A. Taylor (mtaylor_soc@sciences.social)
2024-07-08

It was also a methodologically fun paper, combining digitized archival text, Census & survey data, NLP, and panel models.

Email or dm me for a copy! #sociology #textanalysis #rstats

3/3

Paul Houle (UP8)
2024-06-27

😓 An NLP-Based System for Detecting Depression Levels through User Comments on Twitter (X)

mdpi.com/2227-7390/12/13/1926

2024-06-04

📣 Attention Linguistics & Digital Humanities students! 🎓📚
Join @janispagel and me for the »Prompting, Evaluation, Interpretation: An Introduction to LLMs in Text Analysis« course at the upcoming Deep Learning for Language Analysis Summer School in Cologne: ml-school.uni-koeln.de! 📝🔍
🗓️ Don't miss out – registration is open until June 16th! 🙌
#LLMs #TextAnalysis #NLP #AI #Linguistics #DigitalHumanities #CRETA
