Just in case you were holding your breath for LLMs whose "reasoning" or "chain of thought" lets them "think" more logically, you might want to consider how the AI companies are going about it.
Which path do you think they've pursued?
(a) link LLMs' natural-language skills to the existing non-generative AI tools for problem-solving, path exploration, and optimization, or
(b) get their LLMs to produce a sequence of steps that statistically looks like a chain of thought, interpolated from the training material they've already collected -- an approach that breaks as soon as the problem to be broken down can't be recognized in that data?
https://www.youtube.com/watch?v=TSaiuXPe6cw
Asking a chatbot to explain its reasoning (it can't; it just sounds like it) or to explain its errors (it can't; it just sounds like it) will produce convincing but completely fabricated text. Asking a chatbot to break a problem down into a series of logical steps and then solve them will only work when its training data contains close enough examples of very similar problems. It can reproduce the form and the surface features of such answers, but it can't (yet) apply logic.
Maybe some time in some unlimited-power future, the depth of learning in a huge model will abstract and embed the structure of logic, reasoning, analogy, induction, probability etc., from some body of human writing, but it appears (to me) almost impossible to extract that from any existing corpus. The currently available training material has a wealth of examples of simple logic and way too many examples of bad reasoning.
If you know of researchers who are pursuing path (a) above with any success, let me know. I think that's the only way we'll get any usefully explorative, extrapolative, less-brittle AI.
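For concreteness, here's a toy sketch of the kind of architecture I mean by path (a): the LLM is only a translator from prose to a formal representation, and an existing symbolic tool (the Z3 solver here, via the z3-solver package) does the actual reasoning. The translate() step is mocked with hardcoded constraints -- in a real system that's where the LLM would sit.

```python
# Toy sketch of path (a): natural-language front end + symbolic back end.
# Assumes the `z3-solver` package is installed (pip install z3-solver).

from z3 import Int, Solver, sat

def translate(problem_text: str):
    """Stand-in for an LLM call that maps prose to formal constraints.
    Hardcoded here for the example problem below."""
    x, y = Int("x"), Int("y")
    constraints = [x + y == 30, x - y == 4]
    return (x, y), constraints

def solve(problem_text: str):
    (x, y), constraints = translate(problem_text)
    s = Solver()
    s.add(*constraints)
    if s.check() == sat:   # the solver, not the LLM, does the logical work
        m = s.model()
        return {"x": m[x], "y": m[y]}
    return None

print(solve("Two numbers sum to 30 and differ by 4. What are they?"))
# -> {'x': 17, 'y': 13}
```

In that split, the brittleness moves to the translation step rather than the solving step, which at least is a failure mode you can inspect and test.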
#AI #CoT #LLM