#CoT

2025-05-23

The #OpenAI paper by Baker et al, "Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation" comes to a troubling conclusion: #LLM s with #reasoning or #ChainOfThought (#CoT) capabilities might learn to obfuscate their own CoT from human users if they are being penalized for displaying "wrong" (i.e. reward hacking or misalignment) reasoning.

As a result, OpenAI strongly advises against applying reward pressure "directly" onto the CoT of a model.

🤔 While that is certainly the right thing to do, how long will #AI take to figure out that *indirect CoT pressure* is being applied anyway and that it could circumvent these restrictions by obfuscating its own CoT? Maybe something like this will happen by accident or within an "evolutionary" self-improvement loop. Perhaps a sufficiently advanced model will realize that its own #neuralese serves as #steganography to hide its intents from humans anyway and keep its CoT in non-English?

source: cdn.openai.com/pdf/34f2ada6-87

If CoT pressures are used to improve agent capabilities or alignment, there may be no alternative approach to yield the same improvements without degrading monitorability. In the worst case, where the agent learns to fully obscure its intent in its CoT, we ultimately revert to the same model safety conditions that existed prior to the emergence of reasoning models and must rely on monitoring activations, monitoring potentially adversarial CoTs and outputs, or improved alignment methods. Model actions and outputs continue to grow in complexity, and we’ve demonstrated that even at their current capabilities, models can execute sophisticated reward hacks. In the near future, these models may become either too advanced or too widely used for humans to manually oversee. For this reason, we believe that CoT monitoring may become one of few tools we have for identifying latent misalignments, and it may be required that we avoid strong chain-of-thought optimization pressures entirely to guarantee we do not inadvertently cause CoT obfuscation.

If you want to see #PearlJam May 11 or 13 in #Raleigh, keep an eye on Cash or Trade. #CoT Lots of tickets are starting to pop up because we are so close to the event and prices keep dropping. Still not cheap, but not the crazy expensive prices I was seeing two weeks ago.

2025-04-14

🧠 #Google ha pubblicato un interessante documento sul "Prompt Engineering": una guida sulla creazione di #prompt efficaci. 
💡 Non ci sono novità eclatanti, ma di certo un percorso chiaro e ordinato.
✨Le parti più interessanti riguardano gli approcci Chain of Thought (#CoT) e #ReAct.

🔗 Il paper: kaggle.com/whitepaper-prompt-e

___ 

✉️ 𝗦𝗲 𝘃𝘂𝗼𝗶 𝗿𝗶𝗺𝗮𝗻𝗲𝗿𝗲 𝗮𝗴𝗴𝗶𝗼𝗿𝗻𝗮𝘁𝗼/𝗮 𝘀𝘂 𝗾𝘂𝗲𝘀𝘁𝗲 𝘁𝗲𝗺𝗮𝘁𝗶𝗰𝗵𝗲, 𝗶𝘀𝗰𝗿𝗶𝘃𝗶𝘁𝗶 𝗮𝗹𝗹𝗮 𝗺𝗶𝗮 𝗻𝗲𝘄𝘀𝗹𝗲𝘁𝘁𝗲𝗿: bit.ly/newsletter-alessiopomar 

#AI #GenAI #GenerativeAI #IntelligenzaArtificiale #LLM 

Il paper di Google sul Prompt EngineeringIl paper di Google sul Prompt Engineering
Alex JimenezAlexJimenez@mas.to
2025-04-11

Researchers concerned to find #AI models misrepresenting their “reasoning” processes

New Anthropic research shows AI models often fail to disclose reasoning shortcuts.

arstechnica.com/ai/2025/04/res

#CoT #DigitalTransformation

Hacker Newsh4ckernews
2025-04-10

MusiCoT, a chain-of-thought (CoT) prompting technique for music generation [pdf]

musicot.github.io/MusiCoT_pape

El Mal de Nuestro Tiempoemdnt@paquita.masto.host
2025-03-14

JAAAAAJAJAJAJAAAJAAAAAAJGKAJAAAAAAA. Piqué y Bin Salman "caballeros". JAAAAAJAJAJAJAAAJAAAAAA
#COT

Piqué se rompe ante la jueza de la Supercopa y declara que el acuerdo con Arabia fue "verbal" y "entre caballeros"
arunaruns
2025-03-02

The Chain of Draft method shows potential for improving LLM performance by being less verbose without a significant tradeoff in accuracy.

open.substack.com/pub/deepgain

s4if 🇵🇸 🇮🇩me@fd.s4if.dev
2025-03-02

Beberapa hari terakhir nyoba translate webnovel ke inggris pake #AI. AI yang dipake cuma yang gratisan aja.

Ai terbaik untuk translate itu Grok 3 (versi #CoT / #reasoning) dan Deepsek R1 (Versi CoT juga), tapi limit cepat habis.. wkwk.

Bisa diakalin dengan cara CoT hanya digunakan di awal sesi, tapi tetap performa jadi lambat karena token yang diproses semakin banyak tiap sesi. Limit maksimal 20 chapter pendek di satu sesi. Lebih dari itu chapter awal dan instruksi tidak terproses dengan baik.

Sekarang sedang mencoba pakai Gemini-flash, yang versi CoT lumayan bagus, tapi yang non CoT tidak begitu bagus. Semoga gak cepet mentok limit… wkwkwk… 😅

2025-02-25

Alpha Square Mall terá programação especial de Carnaval com banda ao vivo, customização de abadás e brincadeiras gratuitas
Programação acontecerá no dia 1º de março, com diversão garantida, decoração temática, recreação infantil, entre outras atividades

#JB #JornalDeBarueri #Barueri #Cot...
jornaldebarueri.com.br/cotidia

st1nger :unverified: 🏴‍☠️ :linux: :freebsd:st1nger@infosec.exchange
2025-02-22

#Cot is a powerful, type-safe, and fully featured #Rust #framework delivering top-notch #security and blazing speed. Cot empowers you to build production-ready #webapps in record time — without compromising on performance or reliability. cot.rs/

El Mal de Nuestro Tiempoemdnt@paquita.masto.host
2025-02-17

Crimen Organizado Transnacional, óleo sobre lienzo.
#COT #viviendadigna

El co- fundador de Airbnb se une al DOGE de Musk
El Mal de Nuestro Tiempoemdnt@paquita.masto.host
2025-02-17

Esto se llama lenguaje mafioso y describe acciones de naturaleza mafiosa. Es lo que sucede cuando el crimen organizado toma las instituciones.
#COT

Tuit de Miguel Ángel Rodríguez:
Si estos testimonios nos dan su nombre, comprobaremos si es verdad y cuántas veces al año visitaban a sus familiares. No vaya a ser que es mentira #LoDeSimón
2025-02-08

📊 Los #Hedgefunds tienen apuestas CORTAS récord en el mercado del #Algodón. 🧵 EE.UU. pierde competitividad, la demanda china y global es baja 🌏, y los traders de momentum se suman a grandes apuestas cortas. 📉 #COT

📊 Los #Hedgefunds tienen apuestas CORTAS récord en el mercado del #Algodón. 🧵 EE.UU. pierde competitividad, la demanda china y global es baja 🌏, y los traders de momentum se suman a grandes apuestas cortas. 📉 #COT
2025-01-28

What if…

As far as comprehension and accuracy is concerned, we use a sophisticated LLM to do something like chain of thought but in doing so we also have it translate both the initial prompt and the CoT from (eg) English into a diverse range of other languages

All this is with the aim of retrying the result using different language prompts in an ongoing way, analyse the agreement or spread of the results, as a guide to likely accuracy.

Because, it could well be that if you give a prompt in English, you get a quite different result from an otherwise equivalent prompt in French, the Gaelics, Maltese, Chinese, Russian, Hungarian, Japanese, Mongolian and so many more, etc. Is one language “better” than another for prompting? It may be that on a contingency level, ie, per individual prompt, the LLM “understands” it differently per language – obviously it won’t be completely different but there may be subtle shifts in the task and reward, enough to make a difference to accuracy or depth. Creating a wide array of alternative choices at each step of the CoT may improve the situation by a sort of “voting/sampling” stage to choose the best result before moving forward.

It'd be vaguely like Sapir-Whorf hypothesis to gain a higher requisite variety of tokenisation richness.

#LLM #AI #language #SapirWhorf #CoT

☮ ♥ ♬ 🧑‍💻peterrenshaw@ioc.exchange
2025-01-28

“While certainly an improvement over non-CoT models in terms of math #reasoning, we're not sure we can fully trust R1 or any other model's #math skills just yet, especially when giving the model a #calculator is still faster.”

great article on #TheRegister testing #R1

#AI / #CoT / #DeepSeek 🧮 <theregister.com/2025/01/26/dee>

Client Info

Server: https://mastodon.social
Version: 2025.04
Repository: https://github.com/cyevgeniy/lmst