#Reinforcementlearning

Assn for Computing Machinery ACM@mastodon.acm.org
2025-06-06

"Intelligence is figuring out how the world works rather than waiting for someone to tell you how the world works."

Join us as we hear from Andrew Barto and Richard Sutton, the 2024 #ACMTuringAward recipients, as they discuss their work on #ReinforcementLearning.

vimeo.com/1085726612

Python Weekly 🐍 python_discussions
2025-06-06

This Python class offers a multiprocessing-powered Pool for efficiently collecting and managing experience replay data in reinforcement learning.

github.com/NoteDance/Pool

Discussions: discu.eu/q/https://github.com/

2025-06-04

What do a baby learning to walk and AlphaGo’s legendary Move 37 have in common?
They both learn by doing — not by being told.
That’s the essence of Reinforcement Learning.

It's great to see that my article on Q-learning & Python agents was helpful to many readers and was featured in this week's Top 5 by Towards Data Science. Thanks! :blobcoffee: And make sure to check out the other four great reads too.

-> linkedin.com/pulse/whats-our-r

#Reinforcementlearning #AI #Python #DataScience #KI #alphago #google #googleai #ArtificialIntelligence

Dr. Thompson rogt_x1997
2025-06-03

🧠 Can two AI agents form a bond that feels like love?

From shared rewards to partner modeling, discover how machines are showing signs of synthetic affection, and what it means for the future of AI ethics, design, and psychology 🤖💔

👉 Read now:
medium.com/@rogt.x1997/do-mach



Hacker News h4ckernews
2025-06-02
2025-06-01

Just getting into Reinforcement Learning?
This book helped me a lot. And it's beginner-friendly:
:blobcoffee: Reinforcement Learning: An Introduction by Sutton & Barto
incompleteideas.net/book/the-b

#ai #ki #artificialintelligence #reinforcementlearning #python #technology #agenticai

2025-05-28

Reinforcement Learning doesn’t tell you what’s right.
It only tells you how good your choice was.
No feedback on what to do. Only on how it went.

:blobcoffee: Example: A multi-armed bandit (like a slot machine with several levers). You don't know which lever is best; you can only find out by trying them. Exploring means giving up a known reward (from exploitation) in the hope of finding a better one.

This balance between exploration and exploitation is the central dilemma in reinforcement learning.

:blobcoffee: A simple strategy is ε-greedy (here with ε = 0.1):
→ In 90% of cases, you take the best known action
→ In 10% of cases, you try a different one at random

In simulations, ε-greedy methods perform better in the long run than a purely greedy strategy (always taking the action that currently looks best), because they keep exploring and so handle the explore-exploit trade-off.
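
A minimal sketch of that comparison, not taken from the post: the two-lever setup, the payout probabilities, and the function name `run_bandit` are all assumptions made for illustration. The exploration step here picks uniformly among all levers, so it can also land on the current best one.

```python
import random

def run_bandit(win_probs, epsilon, steps, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy rule.

    win_probs: hypothetical payout probability of each lever (hidden from the agent).
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(win_probs)
    counts = [0] * n_arms        # how often each lever was pulled
    estimates = [0.0] * n_arms   # running average reward per lever
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any lever uniformly at random
            # (this may also pick the current best one).
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the lever with the highest estimate so far
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < win_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running average for this lever.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

# Lever 0 never pays, lever 1 always pays (an extreme, made-up case).
greedy = run_bandit([0.0, 1.0], epsilon=0.0, steps=1000)
eps_greedy = run_bandit([0.0, 1.0], epsilon=0.1, steps=1000)
print("pure greedy pulls:", greedy[1], "reward:", greedy[2])
print("eps-greedy pulls:", eps_greedy[1], "reward:", eps_greedy[2])
```

In this toy setup the purely greedy agent never leaves the first lever, because all estimates start at zero and ties go to the lowest index. The ε-greedy agent stumbles onto the paying lever during exploration and then mostly stays with it.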

#ReinforcementLearning #ML #KI #AI #DataScience #MachineLearning #Datascientist

2025-05-27

What does a baby learning to walk have in common with AlphaGo’s Move 37?

Both learn by doing — not by being told.

That’s the essence of Reinforcement Learning.

In my latest article, I explain Q-learning with a bit of Python and the world’s simplest game: Tic Tac Toe.

-> No neural nets.
-> Just some simple states, actions, rewards.

The result? A learning agent in under 100 lines of code.

Perfect if you are curious about how RL really works, before diving into more complex projects.

Concepts covered:
:blobcoffee: ε-greedy policy
:blobcoffee: Reward shaping
:blobcoffee: Value estimation
:blobcoffee: Exploration vs. exploitation
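
For readers who want to see the value-estimation step in isolation, here is a rough sketch of the standard tabular Q-learning update. It is not the article's code: the function name `q_update`, the string state labels, and the values of `alpha` (learning rate) and `gamma` (discount factor) are assumptions chosen for illustration.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Apply one Q-learning update to the table q and return the new value.

    q maps (state, action) pairs to estimated values; unseen pairs count as 0.
    next_actions is the list of actions available in next_state
    (empty if next_state is terminal).
    """
    # Value of the best follow-up action; 0.0 when the game is over.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    # Target = immediate reward plus discounted best future value.
    target = reward + gamma * best_next
    # Move the old estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

q = defaultdict(float)
# A winning move from the hypothetical state "s1" ends the game with reward 1.
q_update(q, "s1", "a", 1.0, "terminal", [])
# An earlier move from "s0" leads to "s1" with no immediate reward.
q_update(q, "s0", "b", 0.0, "s1", ["a"])
print(q[("s1", "a")], q[("s0", "b")])
```

The second call shows how the reward from the winning move flows back to the earlier move through the discounted `best_next` term.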

Read the full article on Towards Data Science → towardsdatascience.com/reinfor

#Python #ReinforcementLearning #ML #KI #Technology #AI #AlphaGo #Google #GoogleAI #DataScience #MachineLearning #Coding #Datascientist #programming #data

2025-05-27

✹ Master Q-Learning with Reinforcement Learning!
📝 Learn how Reinforcement Learning makes building Q-Learning agents simpler and more efficient! Discover the benefits, the essential steps, and how this technique can revolutionize your work with artificial intelligence. Click the link and dive into this fascinating universe!
.
.
.
inkdesign.com.br/reinforcement

trndgtr.com trndgtr
2025-05-27

AI Learns by Watching - Sholto & Trenton on Dwarkesh

ChaosHacker-Talk chaoshackertalk
2025-05-24

📱 Reinforcement Learning is back!

So many say. But the AI scene knows: it never went away.
What is behind this "comeback"? And why do we keep celebrating old concepts as new hype?
🎥 Watch now!

youtube.com/shorts/SkfFjb_NYuM

2025-05-07

#Promptengineering is crucial for developing #LLM-based apps, but it's often manual & inefficient. PRewrite is an automated method using an LLM trained with #reinforcementlearning to optimize prompts arxiv.org/pdf/2401.08189 #RL #AI

N-gated Hacker News ngate
2025-04-30

🎨🤖 ART: Now you can train your "LLM agents" with open-source Reinforcement Learning, because training them closed-source would just be too mainstream. Because what the world really needed was more and fewer actual results. 🚀💻
github.com/OpenPipe/ART

Hacker News h4ckernews
2025-04-30
trndgtr.com trndgtr
2025-04-27

Training AI to Persuade? - Jeremie & Edouard Harris on JRE
