#enough2skim

2023-11-21

So many warn that evaluating with GPT favors GPT

(or any LLM evaluating itself).

Now it has actually been shown:

science, not just educated guesses

(Fig: T5, GPT, and BART each prefer their own outputs) arxiv.org/abs/2311.09766
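
A minimal illustration (my own sketch, not the paper's protocol) of what "each prefer their own" means numerically: in pairwise comparisons, count how often a judge model picks the candidate that came from its own family.

```python
from collections import defaultdict

# Toy pairwise judgments: (judge, model_a, model_b, winner).
# Illustrative data only; the paper's setup and numbers differ.
judgments = [
    ("gpt", "gpt", "t5", "gpt"),
    ("gpt", "bart", "gpt", "gpt"),
    ("t5", "t5", "gpt", "t5"),
    ("t5", "t5", "bart", "bart"),
    ("bart", "bart", "gpt", "bart"),
]

def self_preference_rate(judgments):
    """Fraction of comparisons involving the judge's own family
    in which the judge picked its own output."""
    own, total = defaultdict(int), defaultdict(int)
    for judge, a, b, winner in judgments:
        if judge in (a, b):  # the judge's own output is one of the candidates
            total[judge] += 1
            if winner == judge:
                own[judge] += 1
    return {j: own[j] / total[j] for j in total}

print(self_preference_rate(judgments))
# {'gpt': 1.0, 't5': 0.5, 'bart': 1.0} -> values well above 0.5 suggest self-preference
```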

#enough2skim #scientivism #NLP #nlproc #GPT #LLM #eval #data

2023-11-14

A new benchmark for data 📚
Rather than testing whether a model is good,
this tests whether you can filter data well
across 360 languages

They also share metrics for data redundancy, if you want just those (sketch of the idea below)
arxiv.org/abs/2311.06440
github.com/toizzy/
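
For intuition, one toy redundancy metric (my illustration, not necessarily one of the metrics the paper releases): the fraction of n-gram occurrences that are repeats of something already seen in the corpus.

```python
def repeated_ngram_fraction(texts, n=3):
    """Fraction of n-gram occurrences that repeat an n-gram already seen.
    Higher = more redundant data. Illustrative only; the paper's metrics
    may be defined differently."""
    seen = set()
    total = repeats = 0
    for text in texts:
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            total += 1
            if gram in seen:
                repeats += 1
            else:
                seen.add(gram)
    return repeats / total if total else 0.0

corpus = [
    "the cat sat on the mat",
    "the cat sat on the mat",      # exact duplicate
    "a dog slept on the rug",
]
print(repeated_ngram_fraction(corpus, n=3))  # ~0.33: 4 repeated trigrams out of 12
```
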
#data #preprocessing #dedup #enough2skim #NLP #NLProc

2023-09-15

🤖: Detecting whether ChatGPT wrote this text...
It did not
A survey of the (few) datasets and methods for detecting it
arxiv.org/abs/2309.07689

(not sure why ChatGPT specifically and not LLMs in general, but never mind)
#enough2skim #NLP #nlproc #chatgpt #LLM #LLMs #AGI

2023-06-21

Predictions throughout training, across hyperparameters and architectures, are yet again shown to lie on

a small manifold,

which means models learn their classification outputs similarly
arxiv.org/abs/2305.01604
Mao ... @pratikac
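
A rough way to see what "a small manifold" looks like numerically (plain PCA on synthetic trajectories; the paper's actual analysis differs): stack checkpointed prediction vectors as rows and check how few components explain the variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake prediction trajectories: each checkpoint is a flattened vector of
# outputs on a fixed eval set. Hypothetical data, built to have 3 underlying
# directions, standing in for real training trajectories.
n_checkpoints, n_outputs = 50, 1000
latent = rng.normal(size=(n_checkpoints, 3))
mixing = rng.normal(size=(3, n_outputs))
predictions = latent @ mixing + 0.01 * rng.normal(size=(n_checkpoints, n_outputs))

# Plain PCA via SVD on centered data: how much variance do a few components carry?
centered = predictions - predictions.mean(axis=0, keepdims=True)
_, s, _ = np.linalg.svd(centered, full_matrices=False)
explained = (s ** 2) / (s ** 2).sum()
print("variance explained by top 3 components:", explained[:3].sum())
# ~0.99 here by construction; a similar concentration on real trajectories is
# what "predictions lie on a small manifold" looks like numerically.
```
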
#MachineLearning #enough2skim

2023-02-06

Few-shot learning almost reaches traditional machine translation quality

arxiv.org/abs/2302.01398
#enough2skim #NLProc #neuralEmpty

2023-01-24

20 Questions can now be played by computers
You probably all know @akinator_team@twitter.com, which can guess what you thought of

arxiv.org/pdf/2301.08718.pdf
proposes the other role:
the system picks a character and answers yes or no to your questions
(basically, QA over Wikipedia + tweaks; toy sketch below)
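
A toy sketch of that "other role" (the system holds a secret character and answers yes/no), with a hypothetical keyword heuristic standing in for the paper's QA-over-Wikipedia model:

```python
# Toy sketch: the system holds a secret character described by a fact sheet
# and answers yes/no questions about it. answer_yes_no() is a stand-in
# heuristic; the paper builds this on QA over Wikipedia, not keyword matching.
character = {
    "name": "Marie Curie",
    "facts": {"scientist", "physicist", "woman", "nobel prize"},
}

def answer_yes_no(question: str, facts: set[str]) -> str:
    q = question.lower().rstrip("?")
    # Hypothetical matching rule: say "yes" if any known fact is mentioned.
    return "yes" if any(fact in q for fact in facts) else "no"

for question in [
    "Is your character a scientist?",
    "Did they win a Nobel prize?",
    "Is your character a fictional superhero?",
]:
    print(question, "->", answer_yes_no(question, character["facts"]))
```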

#enough2skim
