🔗 My AI agents are all nuts
But I’ve been the first responder on an incident and fed 4o — not o4-mini, 4o — log transcripts, and watched it in seconds spot LVM metadata corruption issues on a host we’ve been complaining about for months. Am I better than an LLM agent at interrogating OpenSearch logs and Honeycomb traces? No. No, I am not.
Feeding error logs to AI is a game of hit and miss. Sometimes, you find a clue, and sometimes, the issue is so out of its reach that it sends you off on a wild goose chase where you waste hours testing and verifying every single one of the agent’s leads.
My AI Agents Are All Nuts
Niko Heikkilä has a breakdown of arguments in response to “My AI Skeptic Friends are All Nuts”, a post by Thomas Ptacek that made the rounds last week about why LLMs must be adopted for coding assistance.
Given the perverse incentives of traffic and attention, I know it will take time, but serious people should read LLMs as a normal technology. Figure out where it’s beneficial to you and use it as a tool.
Related to the blockquote above, I find it useful to think about LLM usage from a phase-space perspective. If your question (and context window) follows established patterns, meaning there has been a lot of usage and writing about it on the web in the last 2-10 years, then it’s highly likely that the LLM is going to be useful to you.
In that spirit, I found this list of arguments about why an AI agent might not work to be a great one to keep in mind when leveraging LLMs:
Here’s a summarised list of everything that still requires improvement regarding agentic programming.
- Never take a rule for granted. Agents are more than willing to bend and break them.
- The more rules you impose on agents, the less they obey you. Talk about a robot uprising!
- Agents get stuck easily, retrying the wrong fix again and again.
- By default, agents touch every file and run every shell command unless you tell them not to. This is a hazardous risk.
- The code agents write for production and tests is incredibly bloated and complicated. To continue from that point would take a significant amount of time, converging towards net zero in productivity gains.
- Agents optimise for the number of lines delivered, which makes reviewing and maintaining their code risky and expensive.
- Agents fill their context window and burn through tokens faster than you realise. This leads to context-switching as you switch to a new thread, requiring an agent to relearn everything.
- Agents rapidly dispatch parallel API requests, often causing your computer to become rate-limited. This abruptly stops the flow since you must wait until the rate limits wear off.
- Most people won’t throw away their AI-generated prototypes but continue to use them in production instead. We have witnessed this long before AI and will continue to witness it for the foreseeable future.
- Lastly, working with agents is far from fun. I acknowledge it might affect my overall opinion, but I stand behind it.
https://nikoheikkila.fi/blog/my-ai-agents-are-all-nuts/#:~:text=Here%27s%20a%20summarised,stand%20behind%20it.
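One point in that list, agents running every shell command unless told not to, is easy to picture in code. Here is a minimal sketch of an allowlist check you might put between an agent and your shell. The names `ALLOWED_COMMANDS`, `SHELL_METACHARACTERS`, and `is_command_allowed` are hypothetical and do not come from any agent framework. It assumes the agent hands you each command as a plain string before execution.

```python
import shlex

# Hypothetical allowlist: executables the agent may run without asking first.
ALLOWED_COMMANDS = {"ls", "cat", "grep", "pytest"}

# Characters that let a single string chain, redirect, or substitute commands.
SHELL_METACHARACTERS = set(";&|><`$\n")


def is_command_allowed(command_line: str) -> bool:
    """Return True only for a single, allowlisted command."""
    # Conservatively reject anything that could chain or redirect commands.
    if any(ch in SHELL_METACHARACTERS for ch in command_line):
        return False
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    if not tokens:
        return False
    # Only the executable name is checked; arguments are not inspected.
    return tokens[0] in ALLOWED_COMMANDS


if __name__ == "__main__":
    for candidate in ["ls -la", "rm -rf /", "ls && rm -rf /"]:
        verdict = "allow" if is_command_allowed(candidate) else "deny"
        print(f"{verdict}: {candidate}")
```

This is deliberately strict and deliberately incomplete: it only checks the executable name, so an allowlisted tool with dangerous arguments would still get through. Treat it as a starting point rather than a sandbox.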
So lean on that list and keep it in mind to ensure that a stochastic system doesn’t harm you in your experiments. Good luck.
#ai #coding #development #guidelines #models #softwareDevelopment