My conclusions from watching most of the chess matches between current LLMs on the Kaggle Game Arena (like this one: https://youtu.be/RALDtg1hSTQ?si=fykXVsJ9vzalm7ui ):
- If you are an AI hater, watch these. Never before have I seen such a combination of hubris and hallucination. A typical thinking process goes like "<move> is the natural forcing move, disabling Black's counterplay. After <impossible move> I counter with the powerful <move of a piece that is not there> and I'm in a strong winning position". #deepseek R1 in particular was, well, full of it.
- I was amazed that an unmodified LLM, particularly #gemini25pro and #chatgpto3, can play beginner-level chess - almost no illegal moves, a grasp of the fundamentals of chess, sometimes blundering pieces, sometimes making genuinely cool combinations (e.g. preparing a fork, or setting up a mate on the next move).
- #Grok first destroyed a hapless opponent but later showed that it is not on par with OpenAI's and Google's LLMs.
And finally, kudos to Google for the idea - games are a great way to see how well AIs can reason. Looking forward to Werewolf and poker.
(And of course I would like to see #gpt5 there.)