#ProgramOfThought

Victoria Stuart 🇨🇦 🏳️‍⚧️ @persagen
2023-09-07

Addendum 11

Making Large Language Models Better Reasoners w. Alignment
arxiv.org/abs/2309.02144

* reasoning: cognitive process of drawing evidence-based conclusions
* fine-tuning LLMs w. chain-of-thought (CoT) data significantly enhances reasoning
* however, vanilla fine-tuned models frequently assign higher scores to subpar CoT than to high-quality CoT
* proposed Alignment Fine-Tuning (AFT), 3 steps: fine-tuning w. CoT data; generating multiple CoT responses, categorized as correct/incorrect; calibrating response scores w. a constraint alignment loss
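The score-calibration step can be sketched as a generic margin-based ranking penalty (a minimal sketch with hypothetical function names, not the paper's exact constraint alignment loss): scores of correct CoT responses should exceed scores of incorrect ones by a margin.

```python
# Sketch of a margin-style alignment penalty (illustrative, not the
# paper's exact formulation): penalize every (correct, incorrect)
# response pair whose score ordering violates the margin constraint.

def alignment_loss(correct_scores, incorrect_scores, margin=1.0):
    """Average hinge penalty over all correct/incorrect score pairs."""
    loss = 0.0
    for c in correct_scores:
        for i in incorrect_scores:
            loss += max(0.0, margin - (c - i))
    return loss / (len(correct_scores) * len(incorrect_scores))

# A calibrated model (correct CoT scored above incorrect by >= margin)
# incurs zero loss; a miscalibrated one is penalized.
print(alignment_loss([2.5, 3.0], [0.5]))  # 0.0
print(alignment_loss([0.5], [2.5]))       # 3.0
```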

Title: Making Large Language Models Better Reasoners with Alignment

Figure 1: Perplexity of different answers given by the vanilla fine-tuning (VFT) LLM, where the LLM assigns a lower perplexity to the incorrect candidate answer than to the correct candidate answer.
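The failure mode in Figure 1 is easy to reproduce numerically: perplexity is the exponential of the mean negative per-token log-probability, so a model that is more confident on a wrong answer gives it lower perplexity. A minimal sketch with made-up log-probs (a real setup would take them from the LLM):

```python
import math

def perplexity(token_logprobs):
    """exp of the mean negative log-probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

correct   = [-0.9, -1.1, -1.0]  # hypothetical log-probs for the correct answer
incorrect = [-0.4, -0.5, -0.6]  # model is (wrongly) more confident here

# The incorrect candidate ends up with the lower perplexity:
print(perplexity(correct) > perplexity(incorrect))  # True
```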

Source: https://arxiv.org/abs/2309.02144
Victoria Stuart 🇨🇦 🏳️‍⚧️ @persagen
2023-09-03

Addendum 10

When Do Program-of-Thoughts Work for Reasoning?
arxiv.org/abs/2308.15452
github.com/zjunlp/EasyInstruct

* reasoning capabilities of large language models are pivotal in embodied AI
* program-of-thought (PoT) prompting has the LLM use a programming language to tackle complex reasoning
* e.g. mathematical reasoning; code-data filtering
* specific impact of code data on the improvement of reasoning capabilities remains underexplored
