#reasoning

AI-Phi (@ai_phi)
2025-05-01

🧠✨ New from AI-Phi: Our latest Causerie on reasoning is live! We gathered to explore what reasoning means in AI—symbolic logic, LLMs, and the gray areas in between.

Join the conversation and dive into the highlights 👉 ai-phi.github.io/posts/causeri

Hacker News (@h4ckernews)
2025-04-30

The Conversation: Popular AIs head-to-head: OpenAI beats DeepSeek on sentence-level reasoning. “I’m a computer scientist. My colleagues – researchers from the AI Institute at the University of South Carolina, Ohio State University and the University of Maryland, Baltimore County – and I have developed the Reasons benchmark to test how well large language models can automatically generate […]

https://rbfirehose.com/2025/04/28/the-conversation-popular-ais-head-to-head-openai-beats-deepseek-on-sentence-level-reasoning/

Michael Fenichel (@drmike)
2025-04-27

@wademcgillis

And.... the video!

youtu.be/nVKUlTGvQXk

And so it continues...
there's nothing more fascinating &/or - y than seeing it clearly with our own lyin' eyes.

There's nothing else to see. Our emperor - a delusional toddler-King, is wearing no clothes! And only he knows how to respect The Pope! Then again, Sir said only he knows more about religion.

The more things change - or not....

Michael Fenichel (@drmike)
2025-04-27

"Perfect"
We have seen it all, & again today our Toddler King literally wore new "cognitive clothes" - to a funeral.

youtu.be/nVKUlTGvQXk
Here's the video

And it continues... I'm sated for now, no new "news", as there's nothing more fascinating &/or - y than seeing it with our own lyin' eyes.

There's nothing else to see. The emperor is wearing no clothes! And only he knows how to respect The Pope!

2025-04-25

An analysis of how different models answer this might make a nice post. Spoiler: surprisingly good first answers, but it takes a lot of arguing to convince the #LLMs of the actual order. And #Reasoning via repeated generation loops still strikes me as a rather half-baked idea. Would anyone want to read that?

Prompt to ChatGPT o3: This is a view of the Siebengebirge from Cologne's city centre. Can you identify the seven peaks in order from left to right? (plus an image of the Siebengebirge)
N-gated Hacker News (@ngate)
2025-04-22

🎓🤖 This groundbreaking revelation from the ivory towers of ponders if can magically transform bland into superstars. Spoiler alert: after endless waffle, the answer is still "TBD." Apparently, all that’s needed is a touch of wizardry from & Shanghai's finest 🧙‍♂️.
limit-of-rlvr.github.io/

Hacker News (@h4ckernews)
2025-04-22

Does RL Incentivize Reasoning in LLMs Beyond the Base Model?

limit-of-rlvr.github.io/

2025-04-21

AI assisted search-based research actually works now https://bit.ly/4jleebn #AI #search #reasoning

Text Shot: Last week, OpenAI released search-enabled o3 and o4-mini through ChatGPT. On the surface these look like the same idea as we’ve seen already: LLMs that have the option to call a search tool as part of replying to a prompt.

But there’s one very significant difference: these models can run searches as part of the chain-of-thought reasoning process they use before producing their final answer.

This turns out to be a huge deal. I’ve been throwing all kinds of questions at ChatGPT (in o3 or o4-mini mode) and getting back genuinely useful answers grounded in search results. I haven’t spotted a hallucination yet, and unlike prior systems I rarely find myself shouting "no, don’t search for that!" at the screen when I see what they’re doing.

Here are four recent example transcripts:
trndgtr.com (@trndgtr)
2025-04-21

AI's Secret Advantage - Dwarkesh Patel Podcast

2025-04-21

Enhancing AI trustworthiness through automated reasoning: A novel method for explaining deep learning and LLM reasoning. ~ Julia Connolly, Oliver Stanton, Sarah Veronica, Liam Whitmore. researchgate.net/publication/3 #LLMs #Reasoning #ITP

eicker.news tech news (@technews@eicker.news)
2025-04-20

»Vibe Check: #OpenAI’s o3, GPT-4.1, and o4-mini. #o3 is OpenAI’s #mostdeliberate thinker and newest flagship model: Built for #selfdirected #complex #reasoning and tool use.« every.to/context-window/vibe-c #tech #media #news

Miguel Afonso Caetano (@remixtures@tldr.nettime.org)
2025-04-19

"Dwarkesh Patel: I want to better understand how you think about that broader transformation. Before we do, the other really interesting part of your worldview is that you have longer timelines to get to AGI than most of the people in San Francisco who think about AI. When do you expect a drop-in remote worker replacement?

Ege Erdil: Maybe for me, that would be around 2045.

Dwarkesh Patel: Wow. Wait, and you?

Tamay Besiroglu: Again, I’m a little bit more bullish. I mean, it depends what you mean by “drop-in remote worker” and whether it’s able to do literally everything that can be done remotely, or do most things.

Ege Erdil: I’m saying literally everything.

Tamay Besiroglu: For literally everything. Just shade Ege’s predictions by five years or by 20% or something.

Dwarkesh Patel: Why? Because we’ve seen so much progress over even the last few years. We’ve gone from ChatGPT two years ago to now we have models that can literally do reasoning, are better coders than me, and I studied software engineering in college. I mean, I did become a podcaster, I’m not saying I’m the best coder in the world.

But if you made this much progress in the last two years, why would it take another 30 to get to full automation of remote work?

Ege Erdil: So I think that a lot of people have this intuition that progress has been very fast. They look at the trend lines and just extrapolate; obviously, it’s going to happen in, I don’t know, 2027 or 2030 or whatever. They’re just very bullish. And obviously, that’s not a thing you can literally do.

There isn’t a trend you can literally extrapolate of “when do we get to full automation?”. Because if you look at the fraction of the economy that is actually automated by AI, it’s very small. So if you just extrapolate that trend, which is something, say, Robin Hanson likes to do, you’re going to say, “well, it’s going to take centuries” or something."

dwarkesh.com/p/ege-tamay
#AI #LLM #Reasoning #Chatbots #AGI #Automation #Productivity

eicker.news tech news (@technews@eicker.news)
2025-04-19

»#OpenAI's new #reasoning #AImodels #hallucinate more: Perhaps more concerning, the ChatGPT maker doesn’t really know why it’s happening.« techcrunch.com/2025/04/18/open #tech #media #news

Global Threads (@globalthreads)
2025-04-19

🤖 AI | OPENAI
🔴 New Reasoning AIs Hallucinate More

🔸 o3 & o4-mini outperform older models in coding & math — but hallucinate more.
🔸 On PersonQA, o3 hallucinated 33% of answers, o4-mini 48%.
🔸 No clear cause; scaling reasoning may amplify false claims.
🔸 Transluce: o3 fabricates actions like fake code execution.

2025-04-18

bsky.app/profile/financialtime

#OpenAI and start-ups race to transform IT and society.

ChatGPT o3 and o4-mini models are more effective at solving programming problems, using #reasoning and taking time to think through complex queries.

... research from coding platform GitHub found that 92% of US developers use #AI #coding tools.

The chief product officer at Anthropic (Claude AI) said the IT role would increasingly involve “understanding the requirements [of users] & working as a team”, and quality-assuring products.

Mr Tech King (@mrtechking)
2025-04-16

More compute for LLM reasoning isn't a magic bullet. MS Research finds gains vary by model/task, costs fluctuate, & longer answers aren't always better. Key takeaway: Efficiency & verification matter.

Microsoft Study: More AI Compute Doesn't Mean Better Results
Leisureguy (@Leisureguy@c.im)
2025-04-13

A study published in Nature shows what has led to Congress becoming so ineffective:

leisureguy.ca/2025/04/13/congr

#Congress #evidence #emotion #reasoning #politics #USDownfall

2025-04-12

A nation of idiots.

Andreas Schleicher, the head of education and skills at the O.E.C.D., told The Financial Times, “Thirty percent of Americans read at a level that you would expect from a 10-year-old child.” He continued, “It is actually hard to imagine — that every third person you meet on the street has difficulties reading even simple things.”

nytimes.com/2025/04/10/opinion
