What happens if we swap AI brains?
Despite my initial skepticism, I’ve been increasingly using LLM-based coding
assistants to get shit done. No vibe coding, mind you — I am too much of a
control freak for that, but letting the machine do the tedious parts of coding
has been great for me. For personal use, I particularly enjoyed using Claude
Code (enough to shell out for a Pro subscription): I don’t have to talk to it
like to a lawyer or a capricious genie that wants to fuck me over on the
slightest slip of instruction.
I also got to use and compare several such tools, which led me to a hypothesis:
The interface of the agent — the tool that invokes an LLM — defines its usefulness as much as, if not more than, the model behind it.
More specifically, the prompts, instructions, and tools made available to the LLM can make the difference between frustrating babysitting and a productive coding session. Until recently, however, I had no good way of testing this: most frontier LLMs are coupled with their own proprietary tool, so it's hard to separate the influence of the tool from that of the model behind it.