
My daughter just came up with a great exercise: challenge your students to find the title of your PhD using ONLY LLMs (no Google allowed). If any of them manage, they get gummy bears 😃

I asked five different models, and got five different answers, all five of which were completely wrong 😂

#AI #ChatGPT #AISlop #LLM #LLMFail #Education #HigherEducation #AcademicChatter

Model: Meta LLaMA 3.3 70B Instruct (KISSKI) Available tokens: 98.97% Where did Christopher Kyba get his PhD, and what was the title? Christopher Kyba received his PhD from the University of Guelph. The title of his PhD thesis is ‘Spectral and spatial characterization of anthropogenic light pollution’.

Model: GPT 4.1 Available tokens: 98.94% Where did Christopher Kyba get his PhD, and what was the title? Christopher Kyba received his PhD from the University of British Columbia. The title of his dissertation was “The search for the electric dipole moment of the neutron.”

Model: GPT 4o Available tokens: 98.88% Where did Christopher Kyba get his PhD, and what was the title? Christopher Kyba received his PhD from the University of Washington. The title of his dissertation was “The Role of the Circadian Clock in the Regulation of Cell Division in Neurospora crassa.”

Model: Codestral 22B (KISSKI) Available tokens: 98.88% Where did Christopher Kyba get his PhD, and what was the title? Christopher Kyba received his PhD from the University of California, Berkeley. The title of his dissertation is “The Role of the Nucleus Accumbens in the Acquisition and Expression of Conditioned Emotional Responses.”

🤖 Think your AI assistant can really reason? Apple’s puzzle tests say otherwise.
📉 See how “thinking” AIs collapse when logic gets real — and why we might be projecting intelligence where there is none.

Hashtags:
#AIReasoning #ChainOfThought #LLMFail #DeepTech

URL:
https://medium.com/@rogt.x1997/the-illusion-of-thought-why-reasoning-ai-might-be-smarter-than-us-but-not-wiser-73427af99baa

I’m sorry, but I cannot help feeling a tiny bit of Schadenfreude. A colleague is an enthusiastic user of ChatGPT and recently told me that one does not need traditional reference managers like Zotero anymore, since you can just ask the LLM to re-format your references according to a given style. Now he got article proofs back with countless comments that the dates in the in-text references don’t match the dates in the bibliography. 🙃 #LLMfail

So, it seems Meta has been lying about being a licensed therapist.

https://www.404media.co/senators-letter-demand-meta-answer-for-ai-chatbots-posing-as-licensed-therapists/

#llm #hallucination #llmfail #meta #MetaAi

I really love shitty automation. Just posted a rant about ai/llm and tagged as such; not a second later and I get 3 fucking boosts of my introduction post.

It's just so funny; it can't even handle that correctly.

#aifail #llmfail

@peter_mcmahan If some researchers and evaluators can't be bothered to write the questions, analyze the data, and report findings

- and all of these are things I've heard researchers publicly crowing in pride about -

then why should people bother responding?

#evaluation #research #LLMfail #RealEvalTalk

I absolutely love this backfiring use of GPT. I wonder how many customer service reps were let go as part of the business case for this one? #fail #DPDFail #GPTFail #llmfail https://www.theguardian.com/technology/2024/jan/20/dpd-ai-chatbot-swears-calls-itself-useless-and-criticises-firm

LLM-generated bug reports are messing up open-source development. The creator of curl sums up all the pain:

https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-for-intelligence/

#LLMfail #opensource

@jamiemccarthy But doesn't that double the cost of fulfilling the service request? LLM-as-a-Service was already a dubious financial operation, profit-wise, I thought.

(For text, I've wondered if you could just google individual sentences from the response, and rewrite if there were too many hits.)

#llm #llmfail #llmplagiarism

I was upset and troubled to see a large language model interpretation of a well known political image, suggesting that wealth hoarders in society foment differences and prejudice between those with little, to protect their wealth.

The LLM summary returned:
"a painting of two men sitting at a table with a plate of food"

what the {expletive deleted}?!?
There are Three People!
How can any interpretation be that there are two people in this image?!?

#llm #llmfail #disgusted #unacceptable

Well-known cartoon suggesting that wealth hoarders in society foment differences and prejudice between those with little, to protect their wealth.
"Careful mate ... that person wants your cookie!"

Until now, chatbot-generated mushroom foraging guides on Amazon weren't on my radar when it comes to “deadly dangers from artificial intelligence”.
https://www.techtimes.com/articles/295863/20230901/beware-ai-written-mushroom-foraging-guides-amazon-experts-warn.htm
#ai #llmfail #llm #pilze

@ashley 🙀

Edited:

I was really hoping for an #LLMFail here, but the source site has both definitions.

So, old school extractive summarization fail.

Still, hilarious!

#LLMFail
