
Making sure y'all saw this CfP for the Special Issue on Digital Communication in Social Movements.

Timeline: 8.3.25 abstract; 7.8.25 full paper; end of year publication.

#AI #TextasData #ImageAsData #SocialMovement #ScienceRocks

https://journal.computationalcommunication.org/announcement/view/179

New: Pollux Political Corpora (PoliCorp) lets you search and analyze large collections of political texts (corpora) in a structured way.
More information is available here: https://pollux-fid.de/news/2025/79

The demo version of PoliCorp can be tried out here: https://lnkd.in/eCb3cgx5

#textasdata #Germaparl #fürdiepolitikwissenschaft

Thinking about using #LLM for your #textasdata research? …Think twice! And join our panel tomorrow at #aoir2024

In March 2025, the RUNIP project @ruhr-uni-bochum.de will host the conference “Words in Numbers – Data-Driven Approaches to Texts in the Humanities and Social Sciences.” Keynote speakers include @jerielizabeth and Jo Guldi 🙌. The call for papers is now open, inviting especially early career researchers to submit proposals for short talks or posters. Check it out and join us! https://runip-projekt.ruhr-uni-bochum.de/words_in_numbers.html (English CfP is in the linked PDF at the end) #TextAsData #DigitalHumanities

“Text as Data”

From March 31 to April 5, 2025, a week-long DH Spring School in Potsdam, Germany will focus on text-based digital humanities for students and early-career scholars. Learn computational methods like topic modeling, stylometry, and network analysis. Apply by Nov 30, 2024. Participation is free!

https://www.uni-potsdam.de/en/digital-humanities/activities/dh-potsdam-spring-school-2025-texts-as-data

#DigitalHumanities #TextAsData #DHSpringSchool

The Language of (Non)Replicable Social Science #MetaScience #TextAsData
https://journals.sagepub.com/doi/full/10.1177/09567976241254037 There are systematic differences in the discussion of findings that were and were not replicable.
I was thinking that methods competence could be a confounder 1/

New post just out: running LLMs from my own laptop to rate how unhinged each of Medvedev's Telegram posts really is.

Actually, just testing the feasibility of using locally-deployed LLMs as coders of Russian-language text https://tadadit.xyz/posts/2024-04-09-inter-coder-reliability-unhinged-medvedev/ #Russia #TextAsData #LLM

Do you know of any graphical alternatives to #orgmode on #emacs?
I may be a #geek, but I do like graphical interfaces and the mouse ^^
#textAsData

Papers about Causal Inference and Language
"A collection of papers and codebases about influence, causality, and language."

#textasdata

https://github.com/causaltext/causal-text-papers

Stumbled over this pretty cool living bibliography on causal text analysis. Might be useful to some. #textasdata #causal #nlp #sciencerocks https://github.com/causaltext/causal-text-papers

+++ NEW ARTICLE +++

Today I want to share some insights from my most recent publication – Text as Data: Variable Extraction via LSTM Networks!

#sociology #textAsData #CSS #AS @sociology

https://www.hendrik-erz.de/post/new-paper-text-as-data-variable-extraction-via-lstm-networks

Should we trust web-scraped data? https://arxiv.org/abs/2308.02231 When it comes to relationship btw target population and sampling, paper says "yes" if one takes care of volatility and personalization of web data plus incomplete indexing #TextAsData

Figure describes the sampling process: definition of the target population comes first (on the left of the process in the figure). Indexing leads to the second step, which is the compilation of the sampling frame. Fetching leads to the third step, which is drawing a sample.

Text from caption: Web scraping entails two steps, indexing and fetching. In indexing, the target population is systematically registered. Indexing yields the frame in terms of a register of all units in the population, together with the URLs pointing to each unit. Fetching automatically visits each URL listed in the frame and downloads the resource at which it points, typically an HTML document.
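
The two steps named in the caption can be sketched roughly as follows. This is a minimal illustration, not the paper's code: it assumes the population is listed on a single index page of links, and the names `build_frame`, `fetch_frame`, and `http_get` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on an index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def build_frame(index_html, base_url):
    """Indexing: register each unit's URL once, in order of appearance."""
    parser = LinkCollector()
    parser.feed(index_html)
    frame = []
    for href in parser.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        if url not in frame:
            frame.append(url)
    return frame


def http_get(url):
    """Default fetcher (assumption: pages are UTF-8 text)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_frame(frame, fetch=http_get):
    """Fetching: visit every URL in the frame and download its resource.

    Failed downloads are kept as None, so units that were indexed but
    could not be fetched stay visible instead of silently dropping out.
    """
    sample = {}
    for url in frame:
        try:
            sample[url] = fetch(url)
        except OSError:  # urllib's URLError is a subclass of OSError
            sample[url] = None
    return sample
```

Keeping the frame and the fetched sample as separate objects makes it possible to compare them afterwards, which is where volatility (pages that vanished between indexing and fetching) would show up.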

Do you expect higher-quality #dialogue in small group discussions or whole class discussions?

It might depend on the metric:
- small groups fostered more invitation for peers to weigh in (d = 0.78, p < 0.001)
- whole classes generated more justifications of one's viewpoint (d = 0.69, p < 0.001)

Loads more insight from Herculean corpus analyses involving over 4000 students from 5 countries: https://doi.org/10.1016/j.linged.2023.101223

#edu #teaching #argumentation #P4C #corpusLinguistics #textAsData #DevPsych

Anyone have any particularly thorough walkthroughs re: looking for shared text across documents using R?

I am using the textreuse and text.alignment packages and making progress but I have been spinning my wheels trying to figure out exactly what I am looking at, as well as how the alignment score is calculated. I am interested in *both* what matches and what does not, often in large chunks (sentences at a time).

Alternately, anyone have thoughts on how to artfully present text alignment results, both for analysis and for sharing with others?

#Rstats #TextAsData #TextReuse #DH
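
On how an alignment score like this is typically calculated: a rough sketch, assuming a Smith-Waterman style local alignment over word tokens, which is what the documentation of both packages describes. The scoring values below (match = 2, mismatch = -1, gap = -1) and the function name `local_alignment` are illustrative assumptions, not the defaults or API of either package, and it is written in Python rather than R.

```python
def local_alignment(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment over lowercase word tokens.

    Returns (score, aligned_a, aligned_b). None marks a gap, i.e. a
    word present in one text with no counterpart in the other.
    """
    x, y = a.lower().split(), b.lower().split()
    rows, cols = len(x) + 1, len(y) + 1
    # H[i][j] = best score of any alignment ending at x[i-1], y[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if x[i - 1] == y[j - 1] else mismatch
            H[i][j] = max(
                0,                      # start a fresh alignment here
                H[i - 1][j - 1] + step, # align the two words
                H[i - 1][j] + gap,      # skip a word of a
                H[i][j - 1] + gap,      # skip a word of b
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        step = match if x[i - 1] == y[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + step:
            out_a.append(x[i - 1])
            out_b.append(y[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(x[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(y[j - 1])
            j -= 1
    return best, out_a[::-1], out_b[::-1]
```

For "the quick brown fox jumps" against "a quick brown dog jumps high", this aligns "quick brown fox jumps" with "quick brown dog jumps" and scores 5: three matches at 2 each minus one mismatch. Reading the two aligned lists side by side shows both what matches and what does not.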

Thread:
🚨Software Publication Alert🚨
This one is for you, text-analysis folks:

I am very happy to share the publication of my text annotation R-package "handcodeR" 🥳, which is now available on CRAN. 🧵1/7

CRAN.R-project.org/package=handcodeR

#CRAN #rstats #TextAsData #TextAnnotation

Reminder: The #TextAsData conference deadline is now August 11th.

Details and link here: https://colliderbias.net/@commonsupport/110800066721843173

The 12th meeting of the @fidpol scientific advisory board is taking place right now at @elibbremen. We are talking about #neues #textasdata #openaccess #medienkompetenz #dataliteracy #computanionalsocialscience and more, and we are glad for the exchange and the expertise

Liberals and conservatives use similar moral words, but they attach different meanings to those words

Maybe seems obvious, but suggests that moral politics is a competition over *meaning* and not promoting specific values

It also has implications for how we think about and use (or don't use) dictionary methods for understanding morality in text and speech.

https://www.cambridge.org/core/journals/british-journal-of-political-science/article/abs/lexical-ambiguity-in-political-rhetoric-why-morality-doesnt-fit-in-a-bag-of-words/BF369893D8B6B6FDF8292366157D84C1

#polisci #newresearch #newpaper #morality #socpsych #newpsychresearch @socialpsych @politicalscience #textasdata

If you are interested in how to apply #TopicModeling to continuously growing corpora, have a look at this talk that @rieger gave last year in our lecture series:

https://zpid.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=03addb74-2f86-4ed0-b877-af9c00a2959d

He shows the advantages of modeling growing text corpora using RollingLDA and presents some example applications, e.g., the uncertainty perception and inflation perception indicators, PsychTopics as well as various change detection and monitoring scenarios.

#TextAsData

The title slide for the presentation "Keep rollin'! The abilities for monitoring growing corpora using RollingLDA"

The Summer Institute in #ComputationalSocialScience will be held in person at the WZB #Berlin #SocialScience Center, July 3 to July 13, 2023
https://sicss.io/2023/berlin/

The focus of SICSS-Berlin is on #TextAsData, #WebsiteSscraping, #MachineLearning, and #ethics.

Application deadline: March 31, 2023

If I were a #PhDStudent or #PostDoc I would apply

#TextAsData
