
#AI reasoning models may seem to reason reflectively when they say things like, "Let me rethink that".

But do these "reflective" phrases predict better reasoning performance?

Not in #Deepseek R1 Zero: https://doi.org/10.48550/arXiv.2503.20783

#cogSci #decisionScience #processTracing #psychology

Some examples of the "aha" moments in allegedly "self-reflective" output from AI reasoning models.

"An additional important question is whether self-reflection behaviors are associated with improved model performance after RL training. To investigate this, we host DeepSeek-R1-Zero and analyze its responses of the same questions from MATH dataset. While self-reflection behaviors occur more frequently in R1-Zero, we observe these behaviors are not necessarily imply higher accuracy. Detailed analysis can be found in App. D."

Self-reflection does not necessarily imply higher accuracy. To investigate whether self-reflection behaviors are associated with model performance during the inference (acknowledging that self-reflection may improve exploration during training—a potential positive effect outside this section’s scope), we analyze questions that elicit at least one response with self-reflection from DeepSeek-R1-Zero across eight trials. For each question, we sample 100 responses and divide them into two groups: those with self-reflection and those without. We then compute the accuracy difference between these two groups for each question. As shown in Fig. 13, the results indicate that nearly half responses with self-reflection do not achieve higher accuracy than those without self-reflection, suggesting that self-reflection does not necessarily imply higher inference-stage accuracy for DeepSeek-R1-Zero.
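The per-question comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the record layout `(question_id, has_reflection, is_correct)` and the function names `accuracy_gap_by_question` and `share_not_better` are hypothetical, and counting the share of questions whose gap is zero or negative is only one reading of the "nearly half" summary.

```python
from collections import defaultdict


def accuracy_gap_by_question(records):
    """Return {question_id: accuracy_with_reflection - accuracy_without}.

    `records` is an iterable of (question_id, has_reflection, is_correct)
    tuples. This layout is an assumption made for illustration.
    """
    # For each question, keep [num_correct, num_total] per group.
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for question_id, has_reflection, is_correct in records:
        bucket = counts[question_id][bool(has_reflection)]
        bucket[0] += int(bool(is_correct))
        bucket[1] += 1

    gaps = {}
    for question_id, groups in counts.items():
        correct_with, total_with = groups[True]
        correct_without, total_without = groups[False]
        if total_with == 0 or total_without == 0:
            continue  # The gap is undefined when one group is empty.
        gaps[question_id] = (
            correct_with / total_with - correct_without / total_without
        )
    return gaps


def share_not_better(gaps):
    """Share of questions where reflection did not come with higher accuracy."""
    if not gaps:
        return 0.0
    return sum(1 for gap in gaps.values() if gap <= 0) / len(gaps)
```

With 100 sampled responses per question, each question contributes 100 records; questions where every response (or no response) shows self-reflection are skipped because there is nothing to compare.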

What keywords or phrases count as "self-reflection"?

Here's what the paper reports: "terms like “wait” and “try again” frequently result in false positive detections. To reduce false positives, we maintain a small, highly selective keyword pool consisting of terms that are strongly indicative of self-reflection. In our experiment, the keyword pool is limited to: recheck, rethink, reassess, reevaluate, re-evaluate, reevaluation, re-examine, reexamine, reconsider, reanalyze, double-check, check again, think again, verify again, and go over the steps."
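For anyone who wants to try this kind of keyword check on their own transcripts, here is a rough Python sketch. Only the keyword list comes from the quote above; the matching rules (case-insensitive, anchored at the start of a word, suffixes like "rethinking" allowed) and the names `KEYWORDS`, `PATTERN`, and `has_self_reflection` are my assumptions, since the quote does not say how the paper matched the terms.

```python
import re

# Keyword pool as quoted from the paper.
KEYWORDS = [
    "recheck", "rethink", "reassess", "reevaluate", "re-evaluate",
    "reevaluation", "re-examine", "reexamine", "reconsider", "reanalyze",
    "double-check", "check again", "think again", "verify again",
    "go over the steps",
]

# Assumption: match case-insensitively at the start of a word, so that
# "Rethinking" counts but "precheck" does not.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(keyword) for keyword in KEYWORDS) + r")",
    re.IGNORECASE,
)


def has_self_reflection(text):
    """Return True if the text contains any keyword from the pool."""
    return PATTERN.search(text) is not None
```

Note that phrases the paper flags as false-positive prone, such as "wait" and "try again", are deliberately absent from the pool, so a response like "Wait, let me try again." would not be counted.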

Why can economic #inequality depress the #minimumWage?

An is-ought #fallacy?

From over 135,000 people in #protests, experiments, and #processTracing studies, scientists found that people seemed to infer what people OUGHT to earn from what they DO earn.

https://doi.org/10.1037/xge0001772

Eye-tracking results and a visualization tool that "attenuated the gap between people’s minimum wage prescriptions" (N ≅ 2000).

Do people vacillate more before or after they can decide?

In this paper, most vacillations occurred before people could decide about logic or moral problems, except maybe for moral dilemmas.

Check out the #processTracing method: https://doi.org/10.1017/jdm.2024.15

#xPhi #ethics #cogSci

Figure 6. Density plots of response times of the first, median, and last switches for each participant in 3 experiments. X-axis is time in seconds. The deliberation phase starts with the presentation of the problem in text format to participants at X = 0. The dashed vertical lines are at X = 60 seconds after which participants could report their final decision.

Want more evidence that mathematical and verbal reflection tests could be measuring somewhat distinct psychological processes?

🤓 Stimulating right dorsolateral prefrontal cortex (DLPFC) often impacted performance on the numeric cognitive reflection tests (including a base rate neglect task), but not the verbal cognitive reflection tests (N = 48): https://doi.org/10.1016/j.heliyon.2024.e36078

#neuroscience #processTracing #decisionScience #psychology #measurement #assessment

Finding the needle in the haystack: archival research in European political science
https://link.springer.com/article/10.1057/s41304-024-00488-3 #methods
Work on #ProcessTracing has primarily focused on philosophy of science, design, and causal inference. This was all fine, but it came at the expense of attention to data collection.
It is good to see more and more articles on data collection in qualitative research, like 👆, that are concerned with the practical challenges one is likely to confront.

We didn’t find that thinking aloud disrupted decisions, but will it disrupt athletic performance?

Researchers had 8 trained cyclists and 8 untrained people do a baseline time trial, another time trial, and one more time trial while thinking aloud about exertion and emotion (with a cognitive test before and after each trial).

The abstract suggests no athletic or cognitive differences were detected.

Paywalled article: https://journalofsportbehavior.org/index.php/JSB/article/view/256

#sports #performance #processTracing #psychology

Can we automate transcript analysis (e.g., from think-aloud recordings, online chats, etc.)?

Huang et al. coded transcripts from medical students who made diagnoses while thinking aloud.

Eight machine learning algorithms seemed to predict most of the variance between correct and incorrect diagnoses from linguistic features of the transcripts!

The future of text analysis may be bright!

https://doi.org/10.1007/s12528-024-09404-6

#dataAnalysis #automation #ML #AI #processTracing #Medicine #decisionScience

The decision environment, think-aloud protocol, and text data

Just published:

Why Incorporate the ECHR? The Domestic Incentives of Human Rights Commitment

International Studies Quarterly
https://doi.org/10.1093/isq/sqae039

#humanrights #echr #ecthr #sweden #denmark #processtracing #IHRL

More than 20 years in the making!

In 2002, Kahneman and Frederick informally reported "the bat and ball problem".

In 2023, Meyer and Frederick report 59 studies of the problem in just 9 pages: https://doi.org/10.1016/j.cognition.2023.105380

5 initial take-aways and 2 things I like about this paper: https://byrdnick.com/archives/25863/the-bat-and-ball-problem-20-years-later

#CognitiveScience #DecisionScience #JDM #DualProcessTheory #psychology #psychometrics #economics #finance #heuristicsAndBiases #processTracing #deepScience #standardization

How do we know what participants thought when we presented our stimuli?

#ProcessTracing can reveal what people saw (e.g., eye-tracking), consciously thought (e.g., concurrent think-aloud), etc.

Combining those two methods revealed:
(1) thinking aloud didn't impact gaze or word count
(2) retrospective think-aloud left out thoughts that were mentioned concurrently
(3) retrospective think-aloud introduced thoughts unmentioned concurrently

https://doi.org/10.1007/978-3-319-14956-1_5

#PsychMethods #CogSci #xPhi

What else can #psychology's #thinkAloud studies do? Design #AI!

Recordings of people thinking out loud inspired a distinction between three types of questions: surface, testing, and deep.

So the authors built a series of modules (QASA), one for each type: associative selection, rationale generation, and systematic composition.

The results? QASA "outperform[ed] the state-of-the-art #InstructGPT by a big margin."

https://openreview.net/forum?id=5ud0h8OXwD

#ML #LLM #processTracing #computerScience #cogSci

How can we detect the methods people use to make decisions?

Wanying Jia and colleagues tried asking participants: "May I ask what method you took to choose the answer...?"
- Responses revealed 3 methods
- #EEG patterns differed between them

Authors conclude that this "new" method can be used to study "the interaction between the intuition-based 'fast' ...and the analysis-based 'slow' system[s] in ...decision-making"

https://doi.org/10.1016/j.compbiomed.2023.106845

#DecisionScience #CogSci #ProcessTracing

Another cool process tracing paper in the same issue as our "Tell Us What You Really Think" shows how
- attention shifts to the crucial task element just before "Aha!" moments of insight
- "explicit hints" move attention to yield insight
- LINEAR statistical models overlook this attention/insight!

https://doi.org/10.3390/jintelligence11050086

#DecisionScience #CogSci #ProcessTracing #Insight #Psychology #Statistics #DataScience #DataAnalysis #R #noxp

Now that we've published "Tell Us What You Really Think", I can share its video presentation (9.5 minutes): https://m.youtube.com/watch?v=2UyY6FC2p4A&feature=youtu.be

If you are looking for video presentations of my other papers, check out the "My Research" playlist on my YouTube Channel or look for the links on my CV on my website.

#CogSci #Psychology #DecisionScience #DualProcessTheory #ReflectiveReasoning #ProcessTracing #xPhi #Philosophy

In 3 priming experiments about politics, morality, and race, "behavior was most often guided by either deliberate cognition or else …unspecified processes" (rather than "prime-related automatic cognition").

Authors think that #ProcessDissociation is key to revealing this pattern (previews from the #OpenAccess paper in pictures): https://sociologicalscience.com/articles-v10-4-118/

#DualProcessTheory #ProcessTracing #Research #Methods #Sociology