A couple of weeks ago I published a post with a tidy rule in it. When you add capability to an AI coding agent, reach for the lightest option first: a procedure file before a CLI, a CLI before a heavier integration, and only build the heavy machinery once you've proven you'll reuse it. My whole case rested on context cost. The heavy options load a lot of definitions up front and carry them every turn, so starting light keeps the window clean.
I still think the front half is right. But it isn't the rule I'd write now, because a reader took it apart in the comments and handed it back as something better. This post is about that exchange, because the rewrite was sharper than my original, and pretending I arrived at it alone would be both a lie and the less interesting story.
The hole, found in one comment
The first comment didn't argue with the rule. It walked straight to the blind spot. The moment a tool touches anything external or stateful, lightest-first reverses on you: a lightweight call that fails silently halfway through is harder to debug than a heavier tool that surfaces the failure cleanly. Pay the complexity up front.
My first instinct was to defend, and I did, a little. I said we were measuring different things, that I'd optimized for context cost while they were optimizing for failure observability, both real, different axes. I held the line by pointing out you can wrap a lightweight call to fail loudly, so the cheap path stays open.
That was true, and it was beside their point, and they didn't let me hide behind it.
The question that moved the rule
They asked one question that did more work than my entire post: what's your actual trigger for paying the complexity up front, the type of state, or the class of error?
Sitting with that is where my own rule changed under me. The honest answer is state type, and the moment I said it out loud, context cost stopped being what the rule was about. What makes a failure expensive isn't the error. It's whether the operation changed something you can't take back before it died. A silent failure in a read is an annoyance. The same failure in the middle of a write is a half-changed world you now have to reconstruct.
So the real trigger was never the tool, and never even the error. It was reversibility. If a partial failure leaves something committed you can't cleanly roll back, that's where the heavier, structured-failure tool earns its weight, no matter how cheap the light version looked on context. Stateless or idempotent work stays light and is allowed to fail loud. Mutating, non-idempotent work gets the clean failure surface up front.
Then they put the landing better than I had managed in two thousand words: most error classes look recoverable until you ask whether anything has been written, and at that point the taxonomy is a distraction and you're just triaging state. I've been quoting that line to myself ever since. It's theirs, not mine, and it's the cleanest thing in this whole piece.
Same three words, a different engine
Here's what caught me off guard. The rule I published was "lightest first, on context cost." The rule I kept was "lightest first, until a partial failure can leave state you can't reverse." Identical on the front, completely different engine underneath.
Context cost is recoverable. Connect a tool, measure, disconnect it later, no harm done. An unreversible write that failed silently at step three is not recoverable like that. You find it after the damage. So the two axes I'd called equal at the start were never equal, and the thread had quietly walked me off the weaker one and onto the stronger one without my noticing it happen.
We added one more layer near the end, and it's the part I think about most now. Even past your reversibility threshold, there's a cost that doesn't show up at error time. The action might be technically reversible, but trust often isn't. Roll the database back all you want; the person who watched the agent get something wrong across a boundary they cared about doesn't restore their confidence on the same schedule. That cost lands later, a couple of sprints on, when someone quietly starts hand-checking everything the agent produces. No rollback refunds it. I didn't have that idea when I hit publish. It exists because someone kept pushing after I thought we were done.
Why I left the original post alone
I could have folded all of this into the original and moved on. I didn't, because the corrected rule isn't the interesting artifact. The correction is.
A clean rule reads like it arrived fully formed. Mine didn't. It had a real hole, a reader found it in one comment, and four or five exchanges later it was load-bearing in a way it simply wasn't when I shipped it. None of that shows if I quietly swap the conclusion and tidy up after myself. The seams are the useful part. They mark exactly where a confident-sounding rule was soft, and they show the kind of question that applies enough pressure to harden it. Sanding them off would hide the one thing worth seeing.
It also reset what I think publishing is for. I used to treat a post as something I finish, then defend in the comments. This one worked better as something I started and let someone else finish. The comment section wasn't an audience reacting to a conclusion. It was the second half of the draft, written after publication, by a collaborator I didn't know I had and didn't recruit.
What I'm keeping
Two things, and only one of them is about tools.
The tool one: lightest first is still my default, but the trigger for going heavy is reversibility, not context, and there's a second filter for trust that no rollback covers. That's a better rule than the one I published, and the better half of it isn't mine.
The other one is harder to admit. The most useful idea in my own post arrived after I'd published it, from someone who owed me nothing, in a thread I could easily have treated as noise to defend against. If I'd shipped the rule airtight, nobody would have found the seam, because there wouldn't have been one to grab. The hole was the invitation. So I've stopped trying to publish things that can't be argued with. A clean rule with a visible flaw, put where strangers can reach it, pulls in angles I don't have and corrections I can't reach alone. Ship it before it's airtight. Leave the seam showing. The version that comes back is usually the one worth keeping, and if you're honest, it's usually not entirely yours.
I build WordPress plugins and write about AI tooling and security at https://raplsworks.com/.
Top comments (4)
The meta-story here is as interesting as the object-level lesson.
Most authors would have either (a) quietly updated the post without attribution or (b) ignored the comment entirely. Writing a whole post that says "this reader was right and I was less right" is a different kind of move — and it's exactly the kind that builds actual trust with an audience, because it demonstrates the author's relationship with being wrong.
The rewritten rule is genuinely better: side-effect risk as the primary axis rather than context cost. The original rule works for pure reasoning tools but breaks on tools that touch external state, which is most of the interesting use cases. The side-effect-first framing also generalizes — it's why stateless tools can be lighter-weight even if they're technically more capable.
Curious whether the rule holds for agentic loops where the same tool might be called in stateful and stateless contexts depending on the task. Does your reader's version handle that, or does it still assume you can classify the tool's behavior at selection time?
Good question, and it's the spot where the rule needs tightening. Side-effect risk isn't really a property of the tool, it's a property of the call:
run shellorfetchis read-only in one invocation and mutating in the next, so classifying behavior at selection time doesn't hold. In an agentic loop I'd move the axis from the tool down to the call (tool plus args plus context) and gate per call, confirming only on the branches that change state. The tool stays selected; the caution moves to the moment it actually does something.That reframe lands cleanly — the call is the right unit, not the tool. It also suggests that in tight loops you'd want the confirmation step to happen as close to the mutation as possible, not at planning time when context is still forming. Gate late, not early.
Makes me wonder whether the pattern holds when the same agent re-enters a loop and the "same" call now carries different accumulated state. The tool is identical; the risk profile has shifted.
"Gate late, not early" is the right correction, and it follows straight from the call being the unit. If risk is a property of the call and not the tool, then the only honest moment to evaluate it is when the call's actual arguments exist, which is right before the mutation, not at planning time when you're still holding intentions instead of values. Confirming early just means confirming a guess about what the call will be.
Your re-entry case is the sharper half, and I think it breaks the "same call" assumption entirely. The call isn't the same call. Identical name, identical signature, but the thing that determines reversibility was never the signature, it was the state the call lands in. Second time through the loop, the world it's mutating has already been mutated once, so the same delete or write or upsert now sits on top of accumulated state, and a partial failure rolls back to a different place. So the rule has to gate on the call plus the state it's about to touch, evaluated fresh each iteration, not memoized from the first pass. The tool is identical, the risk is recomputed every time, because reversibility is a fact about the world at call time, not about the code. Good push, this is the part the original rule waves past.