Lmst

[New Blog Post] State of Knuckledragger III: Kernel Changes, Symbolic Union, AI, and more https://www.philipzucker.com/state_o_knuck_3/ #python #logic #theoremproving

Mehtaab Sawhney (@mehtaab_sawhney)

OpenAI 내부 모델이 수학 문제 'Erdos #846'의 증명을 생성해 논문으로 공개됐습니다(링크 포함). 해당 문제는 기존 문헌에서 파생될 수도 있지만, 내부 모델이 직접 증명을 도출한 사례라는 점에서 주목받고 있습니다. 작성자는 모델의 증명을 보고 감탄했다고 언급했습니다.

https://x.com/mehtaab_sawhney/status/2026716221933343147

#openai #theoremproving #airesearch #mathematics

Terminal now can help you with formal proofs and theorem provers 🤯

📐 **lean-tui** — A TUI for visualizing Lean programs and proofs

💯 Live proof trees, data/effect flow views & real-time updates from your editor

🦀 Written in Rust & built with @ratatui_rs

⭐ Source: https://codeberg.org/wvhulle/lean-tui

#rustlang #ratatui #tui #lean #theoremproving #cli #devtools #terminal

Lean 4 is apparently the new secret sauce of #AI dominance, because who knew that theorem proving could be so *riveting*? 🤔✨ But don't worry, before you can learn how to take over the world with math, you'll need to pass the Vercel Security Checkpoint IQ test, where only the chosen ones with #JavaScript enabled may proceed. 🛂🔒
https://venturebeat.com/ai/lean4-how-the-theorem-prover-works-and-why-its-the-new-competitive-edge-in #Lean4 #TheoremProving #VercelSecurity #HackerNews #ngated

Kimon Fountoulakis (@kfountou)

작성자는 모델이 기존 논증의 짧은 연쇄를 올바르게 결합해 내는 능력을 과소평가하지 않으며, 이는 인상적이고 수학자들의 작업 방식을 바꿀 것이라 말합니다. 다만 이러한 성과가 인간이 믿었던 것처럼 문제들이 본질적으로 훨씬 더 어렵다는 관점의 정당성을 완전히 입증하는지는 불확실하다고 지적합니다.

https://x.com/kfountou/status/2022671762173854080

#ai #theoremproving #automatedreasoning #research

Jakub Pachocki (@merettm)

저자는 'First Proof' 챌린지에 큰 기대를 표하며, 차세대 AI 모델 능력 평가에 있어 새로운 최전선 연구가 중요하다고 강조합니다. 내부적으로 제한된 인간 감독 하에 제안된 10개 문제에 대해 자사 모델을 실행했다고 밝히며, 이는 AI의 수학적 증명 능력과 자율성 평가에 관한 중요한 실험임을 시사합니다.

https://x.com/merettm/status/2022517085193277874

#firstproof #ai #theoremproving #research #ml

Kimon Fountoulakis (@kfountou)

자율 에이전트가 자체적으로 수학적 추측(conjecture)을 생성하고 증명하는 능력을 갖추게 될 것이라는 전망. 이는 박사학위 과정과 연구자의 역할이 AI 기반 자동화 도구의 등장으로 바뀔 수 있음을 경고하는 내용으로, 자동화된 정리 증명과 연구 보조의 확산을 시사한다.

https://x.com/kfountou/status/2022756585131429994

#aiagents #theoremproving #automatedresearch #ai

Agentic system design paper:
https://althofer.de/agentic_strategy_design_for_math_proofs.pdf
#1stProof #TheoremProving

First Proof (#1stProof): We ran an AI-only workflow (no human mathematical input) and published a writeup + outputs.
Report: https://althofer.de/first-proof-competition/first-proof-report.html
Official: https://1stproof.org/
I’d appreciate critique—especially rigor/correctness checks and suggestions for better verification.
#1stProof #Mathematics #TheoremProving #AI

Quoc Le (@quocleix)

새 연구 'Semi-Autonomous Mathematics Discovery with Gemini' 공개: Gemini를 사용해 Erdős Problems 데이터베이스의 700개 공개 추측을 체계적으로 평가했고, 13개 문제를 다루어 그중 5건에서 자율적으로 새로운 해법을 찾아낸 성과를 보고함. 자율적 수학 발견 연구의 중요한 사례.

https://x.com/quocleix/status/2018402933193539735

#gemini #theoremproving #airesearch #mathdiscovery

fly51fly (@fly51fly)

J. Urban의 논문은 단기간(2주) 동안 13만 줄 규모의 형식적 위상수학(formal topology)을 자동 형식화(autoformalization)로 생성한 작업을 보고합니다. 비용과 복잡도를 낮춘 간단한 방법을 제안해 누구나 자동 형식화에 접근할 수 있게 하는 접근법과 실험 결과를 제시하며 정리된 데이터셋과 파이프라인을 공개합니다 (arXiv:2601.03298).

https://x.com/fly51fly/status/2013735633425146080

#autoformalization #formalization #theoremproving #automatedreasoning

Wes Roth (@WesRothMoney)

Grok 4.20이 인간 수학자가 하지 못하던 작업을 수행하고 있다는 주장으로, 'Automated Theorem Discovery' 시대의 도래를 예고한다는 내용입니다. 트윗은 또한 Elon(머스크)이 여러 방식으로 '전쟁'에 나선다는 언급을 더해 향후 파장이 클 것이라고 전망합니다.

https://x.com/WesRothMoney/status/2011673556149744032

#grok #automatedtheoremdiscovery #theoremproving #ai

AI Leaks and News (@AILeaksAndNews)

AxiomProver가 2025년 Putnam 경시에서 만점(12/12)을 기록했다는 소식입니다. Putnam은 학부 수학 경시의 최고 권위 대회로 알려져 있으며, 이번 성과는 인간 참가자 중 어느 누구도 달성하지 못한 수준으로 소개됩니다. 해당 결과는 수학 자동화(정리 증명·자동 추론) 분야에서 AI의 중요한 이정표로 평가됩니다.

https://x.com/AILeaksAndNews/status/2010081100748730492

#axiomprover #putnam #theoremproving #ai #automatedreasoning

Deedy (@deedydas)

AI가 세계에서 가장 어려운 수학 경시인 Putnam에서 만점(120/120)을 달성했다는 보도입니다. Axiom의 AI 정리 증명기가 증명 보조 시스템 Lean 위에서 모든 문제를 해결하고 해답을 공개했고, 이는 수학 분야에서 AI의 큰 이정표로 평가됩니다.

https://x.com/deedydas/status/2010046352454209583

#axiom #lean #putnam #theoremproving #ai

Kevin Weil (@kevinweil)

GPT-5.2와 @HarmonicMath 관련 짧은 언급으로, GPT-5.2를 이용해 수학 문제 'Erdos #728'을 해결하고 형식화(formalize)하려는 시도를 표현하고 있습니다. 자동 정리·증명과 형식화에 GPT-5.2와 HarmonicMath가 연계되는 혁신적 적용 가능성을 시사합니다.

https://x.com/kevinweil/status/2008737145016775165

#gpt5.2 #harmonicmath #theoremproving #automatedreasoning

Lean Theorem Prover Mathlib

https://github.com/leanprover-community/mathlib4

#HackerNews #Lean #Theorem #Prover #Mathlib #mathlib4 #LeanProver #theoremProving #functionalProgramming

Does anyone know if an inductive Nat datatype defined as a place-value system could replace the need to rewrite the PA definition to bigints in the compiler?

#theoremProving #types #functional_programming

DeepSeekMath‑V2 è un AI che dimostra teoremi matematici passo dopo passo.
Genera prove, le verifica con un LLM dedicato e corregge gli errori per migliorarsi continuamente. 🤖📐

#AIperLaMatematica #TheoremProving #VerificaAutomatica

New research shows how Lean4 can turn large‑language models into AI advisers that pair hypotheses with physics‑consistent proofs. This blend of theorem proving and formal verification promises safer AI and more reliable software. Dive into the details of how AI meets the laws of physics. #Lean4 #TheoremProving #PhysicsConsistent #FormalVerification

🔗 https://aidailypost.com/news/lean4-powers-ai-advisers-pair-hypotheses-physicsconsistent-proofs

#CondensedDetachment example

Axiom 1: ⊢ (𝜑 → (𝜓 → 𝜑))
Axiom 2: ⊢ ((𝜑 → (𝜓 → 𝜒)) → ((𝜑 → 𝜓) → (𝜑 → 𝜒)))
Rule of Modus Ponens:
• Major hypothesis: ⊢ (𝜓 → 𝜑)
• Minor hypothesis: ⊢ 𝜓
• Resulting Assertion: ⊢ 𝜑
——
D<major><minor> applies the Rule of Modus Ponens treating the two given tautologies as having metavariables living in different namespaces and returning the normalized result. We extend by using underscore as a placeholder, so D__ recovers the rule of modus ponens.
——
"D2_" is proof of the rule:
• Hypothesis: ⊢ (𝜑 → (𝜓 → 𝜒))
• Resulting assertion: ⊢ ((𝜑 → 𝜓) → (𝜑 → 𝜒))
——
"D21" is a proof which unifies "1" ⊢ (𝜑′ → (𝜓′ → 𝜑′)) with the hypothesis of "D2_" giving the substitution map 𝜎: {𝜑′ ↦ 𝜑, 𝜓′ ↦ 𝜓, 𝜒 ↦ 𝜑} resulting in the tautology: ⊢ ((𝜑 → 𝜓) → (𝜑 → 𝜑))

(Note that unification can map variables from either side, but when faced with a variable matching a term has to match the variable to that term.)
——
"DD21_" is proof of the rule:
• Hypothesis: ⊢ (𝜑 → 𝜓)
• Resulting assertion: ⊢ ((𝜑 → 𝜑)
——
"DD211" is a proof which unifies "1" ⊢ (𝜑″ → (𝜓″ → 𝜑″)) with the hypothesis of "DD21_" giving the substitution map 𝜎: {𝜑″ ↦ 𝜑, 𝜓 ↦ (𝜓″ → 𝜑)} resulting in the tautology: ⊢ (𝜑 → 𝜑)

This has been adapted and expanded from a run of my symbolic-mgu pre-release crate. https://crates.io/crates/symbolic-mgu

cargo run -r --bin compact -- --wide D__ 1 2 D2_ D21 DD21_ DD211

#math #logic #theoremProving #rust #mostGeneralUnifier #mgu

#theoremproving

Client Info