#Reasoning

Feike 🇪🇺🇳🇱 🚫👑feike@toot.community
2025-12-12

@DemLabs why can I only interpret the taking control of (a) #foreign vessel(s), without some #sovereignty-based #legal #reasoning, as an act of war against #venezuela? Venezuela is, by the way, a neighboring country to the Kingdom of the #Netherlands, in which I live, so... Not great... 🤮

Inautiloinautilo
2025-12-12


Stop asking AI about how it works · “The model is just confidently hallucinating its own ‘reasoning.’” ilo.im/16932p

_____

2025-12-11

#dailyreport #conference #llm #agents #aiagent #multiagent #reasoning #sber #gigachat
P2: Conference of the largest IT gov-corporation.
The AI agent is built as one major model plus small, fast
models for tool calling and for stopping the thinking.
Good readings:
- anthropic.com/engineering/clau
- huggingface.co/blog/moe
- Switch Transformers: arxiv.org/pdf/2101.03961

Modularity inside a NN exists at different scales; it is a
tree, and large modules are hard to form during
training. The two standard ways of using Mixture of Experts
(MoE) are switching between whole LLMs, OR replacing most
dense FFN blocks (inside the Transformer architecture), which
transform each token separately without mixing tokens, with
expert layers, balanced so that experts participate equally.
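As a rough illustration of the second option, here is a minimal pure-Python sketch of top-1 ("switch") routing with a load-balancing term in the style of the Switch Transformers paper linked above. It is not code from the conference talk: the function names, the toy experts, and the hand-written router logits are all assumptions made for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one token's router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def route_top1(token_logits):
    """Pick one expert per token (top-1 routing).

    token_logits: one list of router scores per token, one score per expert.
    Returns (router probabilities, chosen expert index per token).
    """
    probs = [softmax(row) for row in token_logits]
    assignments = [max(range(len(p)), key=p.__getitem__) for p in probs]
    return probs, assignments

def load_balance_loss(probs, assignments):
    """Auxiliary term N * sum_i f_i * P_i.

    f_i: fraction of tokens sent to expert i.
    P_i: mean router probability given to expert i.
    The value is 1.0 when routing is uniform and grows as routing collapses.
    """
    num_tokens = len(probs)
    num_experts = len(probs[0])
    f = [assignments.count(e) / num_tokens for e in range(num_experts)]
    p_mean = [sum(p[e] for p in probs) / num_tokens for e in range(num_experts)]
    return num_experts * sum(fi * pi for fi, pi in zip(f, p_mean))

def moe_layer(tokens, token_logits, experts):
    """Apply the chosen expert to each token independently (no token mixing)."""
    probs, assignments = route_top1(token_logits)
    outputs = [experts[e](tok) for tok, e in zip(tokens, assignments)]
    return outputs, load_balance_loss(probs, assignments)

# Toy experts standing in for FFN blocks (hypothetical).
experts = [lambda x: x * 2.0, lambda x: x + 100.0]

balanced_out, balanced_loss = moe_layer([1.0, 2.0], [[5.0, 0.0], [0.0, 5.0]], experts)
collapsed_out, collapsed_loss = moe_layer([1.0, 2.0], [[5.0, 0.0], [5.0, 0.0]], experts)
```

In the balanced case each expert gets one token and the auxiliary term sits at its minimum; in the collapsed case one expert gets everything and the term rises toward the number of experts, which is what pushes training toward equal participation.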

For video models pre-train:
- PyTorch FSDP
- Sequence parallelism
- Activation checkpointing
- Flash Attention 3
- NABLE

Major LLM flaws as I see them:
- no ability to say "I don't know"
- no awareness of the current time
- low reasoning-planning ability

2025-12-11

#dailyreport #conference #llm #agents #aiagent #multiagent #reasoning #sber #gigachat
P1: Conference of the largest IT gov-corporation.

They boost reasoning with short memory via:
1) chain-of-thought (CoT) Model Context Protocol (MCP)
servers
2) tools: "ToDO" and "think"
3) the agent knows about its CoT ability and is able to
explore the environment.
4) the main loop has two parts: a reasoning
generator and a reasoning validator; the second gives
feedback and uses tools.
5) the first step is rephrasing the user query, used as an
alternative to reasoning, to expand the context and select
the best rephrased query.
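Points 4 and 5 can be sketched as a small control loop. This is only a guess at the shape of such a loop, not the system presented at the conference: every function here (`run_agent`, the `toy_*` stubs) is hypothetical, and real generator and validator steps would call models and tools instead of string checks.

```python
from typing import Callable, List, Tuple

def run_agent(
    query: str,
    rephrasings: Callable[[str], List[str]],
    score: Callable[[str], float],
    generate: Callable[[str, str], str],
    validate: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Rephrase the query, then alternate generator and validator.

    Returns the last draft and the number of rounds used.
    """
    # Step 5 in the post: expand the query and keep the best rephrasing.
    best_query = max(rephrasings(query), key=score)

    draft, feedback, rounds = "", "", 0
    for rounds in range(1, max_rounds + 1):
        # Step 4, first half: the reasoning generator drafts an answer.
        draft = generate(best_query, feedback)
        # Step 4, second half: the validator checks it and returns feedback.
        accepted, feedback = validate(best_query, draft)
        if accepted:
            break
    return draft, rounds

# Toy stand-ins so the sketch runs without any model (all hypothetical).
def toy_rephrasings(q: str) -> List[str]:
    return [q, q + " (step by step)"]

def toy_score(q: str) -> float:
    return float(len(q))

def toy_generate(q: str, feedback: str) -> str:
    return f"answer to '{q}'" + (" [revised]" if feedback else "")

def toy_validate(q: str, draft: str) -> Tuple[bool, str]:
    if "[revised]" in draft:
        return True, ""
    return False, "please double-check"

result = run_agent("2+2?", toy_rephrasings, toy_score, toy_generate, toy_validate)
```

The `max_rounds` cap is a design choice for the sketch: it bounds the loop when the validator never accepts a draft.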

The main quality metrics for Data Marker agents are (may
be used for AI agents):
- reading the task carefully, not fast
- marking data with a deep dive into the context
- not hacking the platform: finding the solution by oneself,
individually

Chat intents %:
- unclassified 29
- friend 23
- adviser 15
- creating 10
- clarification 7
- draw 6
- analyze text 5

Bibliolater 📚 📜 🖋bibliolater@qoto.org
2025-12-09

🧠 **Does mathematics training lead to better logical thinking and reasoning? A cross-sectional assessment from students to professors**

"_The results in this study revealed that in general the greater the mathematics training of the participant, the more tasks were completed correctly, and that performance on some tasks was also associated with performance on others not traditionally associated. A ceiling effect also emerged._"

Cresswell C, Speelman CP (2020) Does mathematics training lead to better logical thinking and reasoning? A cross-sectional assessment from students to professors. PLOS ONE 15(7): e0236153. doi.org/10.1371/journal.pone.0.

#OpenAccess #OA #Research #Article #Maths #Mathematics #Reasoning #Logic #Academia

2025-12-06

ARC Prize 2025 has concluded, confirming "refinement loops" as the main trend driving AI reasoning ability. The grand prize remains unclaimed. Team NVARC topped Kaggle with 24% on ARC-AGI-2; commercial AI systems such as Opus 4.5 reached 37.6%, and Gemini 3 Pro reached 54%. All winning solutions were released as open source. ARC-AGI-3, which focuses on interactive reasoning, will launch in early 2026.

#ARCPrize #AI #AGI #Reasoning #OpenSource
#ARCPrize #TríTuệNhânTạo #SuyLuậnAI #MãNguồnMở

reddit.com/r

In three experiments, participants were able to appropriately rank good and bad arguments for political positions. However, their own prior political beliefs had a greater impact on evaluation of arguments than did argument quality.

Summary: psypost.org/people-struggle-to

Original paper: sciencedirect.com/science/arti

#Science #Politics #Argument #Reasoning

2025-12-04

I don't think this is about "visual thinking", but only "thinking". Nevertheless, it's a worthwhile article, carefully thought out, and it's a good read. Give it a go?

#misinformation #deception #propaganda #AI #reasoning #thinking #evidence

theconversation.com/visual-thi

2025-12-04

The small AI model Hito 1.7B, fine-tuned on only ~300 examples, can now correctly count the letter 'r' in the word 'strawberry' (3 of them), outperforming many larger AIs. This is evidence that complex thinking patterns can be transferred to smaller models. Hito uses internal 'thinking tags' to reason and correct its own mistakes. An interesting step forward in AI!

#AI #Hito #LLM #FineTuning #SmallModels #Reasoning
#TríTuệNhânTạo #HọcSâu #MôHìnhNgônNgữ #TinhChỉnhAI

reddit.com/r/LocalLLaMA/commen

N-gated Hacker Newsngate
2025-12-01

DeepSeek just dropped a "world-shattering" that apparently secured an Olympic-style medal in being obtuse. 🏅🔢 But don't worry, it'll only take you a and a few decades to actually decipher what "self-verifiable mathematical reasoning" means. 🤓✨
huggingface.co/deepseek-ai/Dee -Verifiable

CriticalThinkingGamesCriticalThinkingGames@games.ngo
2025-11-29

Empower young people to make BETTER decisions than we've made: support critical thinking skills!

1️⃣ Be curious.

2️⃣ Be questioning.

3️⃣ Be OPEN to what you find.

#criticalthinking #media #education #politics #reasoning #information #knowledge #data #lifeskills #research

2025-11-28

This one really has puzzled me too. Seems to be such a trivial problem to solve, no?

Why can’t #ChatGPT tell time? - The Verge https://apple.news/A-J-856YmRx-K7op4wc4OYg

#genAI #reasoning

Miguel Afonso Caetanoremixtures@tldr.nettime.org
2025-11-27

"For all the alleged complexity of generative AI, at their core they really are models of language.

The problem is that according to current neuroscience, human thinking is largely independent of human language — and we have little reason to believe ever more sophisticated modeling of language will create a form of intelligence that meets or surpasses our own. Humans use language to communicate the results of our capacity to reason, form abstractions, and make generalizations, or what we might call our intelligence. We use language to think, but that does not make language the same as thought. Understanding this distinction is the key to separating scientific fact from the speculative science fiction of AI-exuberant CEOs.

The AI hype machine relentlessly promotes the idea that we’re on the verge of creating something as intelligent as humans, or even “superintelligence” that will dwarf our own cognitive capacities. If we gather tons of data about the world, and combine this with ever more powerful computing power (read: Nvidia chips) to improve our statistical correlations, then presto, we’ll have AGI. Scaling is all we need.

But this theory is seriously scientifically flawed. LLMs are simply tools that emulate the communicative function of language, not the separate and distinct cognitive process of thinking and reasoning, no matter how many data centers we build.

We use language to think, but that does not make language the same as thought."

theverge.com/ai-artificial-int

#AI #GenerativeAI #LLMs #Chatbots #Intelligence #Reasoning #Neuroscience

2025-11-27

A user is asking how Ollama's `/set nothink` command disables or controls the reasoning ability of models, especially the gpt-oss models. Does it skip the 'thinking' block and the `<thinking>` token, or does it merely change the prompt to reduce the amount of reasoning?

#Ollama #AI #LLM #GPTOSS #Reasoning
#MôHìnhAI #SuyLuận #CôngNghệ

reddit.com/r/ollama/comments/1

2025-11-26

MAESTRO: a new framework for building multi-agent systems and digital assistants based on LLMs

Hello, Habr! Over the past year it has become clear that using several LLMs in agent mode brings substantially more benefit than the simple sum of their compute taken separately. Flexibility, role distribution, and active interaction between models make it possible to achieve significant success in a wide variety of tasks, including building useful digital assistants. Many teams around the world are working on such systems. To speed up progress in this direction and help our colleagues, we in the "Multimodal AI Architectures" group at AIRI created a new framework called MAESTRO (Multi‑Agent Ecosystem of Task Reasoning and Orchestration). We presented it at the AI Journey 2025 conference, held in Moscow last week. In this article we would like to describe our work in more detail, explain how the framework is structured, and give examples of its use.

habr.com/ru/companies/airi/art

#мультиагентность #мультиагентные_системы #агент #оркестрация #ассистент #помощник #большая_языковая_модель #взаимодействие_агентов #цепочки_рассуждений #reasoning

Kerrick Long (code)kerrick@ruby.social
2025-11-22

State-of-the-art LLMs continue being unable to reason about the real world.

#ChatGPT #LLM #ChatBot #AI #Reasoning

A screenshot of ChatGPT shows a conversation titled "Cotton ties for books." The viewport shows this snippet of ChatGPT saying the following...

You’d tie it around the book’s short axis — like you’re putting a belt around its waist, not like wrapping a gift.
Picture it this way:
The tape goes around the front cover → across the spine → around the back cover → back to the fore-edge (the outer edge where the pages open).
You tie the knot or bow at the fore-edge, never on the spine.
The knot lives on the side of the book that faces out when you open it.
Because of that placement:
You don’t have to untie anything to read the book.
You slide the book off the shelf, open the cover, and read normally. The tape stays put.
The only time you’d untie it is if you want to change or remove the label.
eicker.news ᳇ tech newstechnews@eicker.news
2025-11-21

#NanoBananaPro, also known as #Gemini3ProImage, is a powerful #imagegeneration model with advanced #reasoning capabilities. It excels at #complextasks, generates #highresolutionimages, and can use #GoogleSearch for #factualaccuracy. The model also offers features like multi-character editing, text rendering, and the ability to mix up to 14 reference images for composition. simonwillison.net/2025/Nov/20/ #tech #media #news

2025-11-21

PsyPost: Social reasoning in AI traced to an extremely small set of parameters. “A new study reveals that the capacity for social reasoning in large language models, a trait similar to the human ‘theory of mind,’ originates from an exceptionally small and specialized subset of the model’s internal parameters.”

https://rbfirehose.com/2025/11/21/psypost-social-reasoning-in-ai-traced-to-an-extremely-small-set-of-parameters/
