// When it breaks, triage before you rebuild · lesson 07

The glitch class, or when the tool fails mid-generation

Not every failure is a bug in your system. Some are a failure of the tool doing the building, and confusing the two will waste hours. There is a whole class of problems where the AI glitches mid-generation, and the fix is not to debug your code, it is to change how you are generating it. Learning to recognize this class is its own triage skill.

Here is the pattern I hit repeatedly. When I had an AI editing code directly in a chat interface, generation would degrade partway through, especially on longer outputs. Around the point where the response got large, the quality would fall off: truncated edits, a change that half-applied, a file left in a broken in-between state. For a while I treated each instance as a bug in the code and tried to debug it. It was not a code bug. It was the generation process itself breaking down under its own length, and no amount of debugging the output fixes a broken production process.

How do you recognize a tool failure versus a real bug?

The tell is inconsistency that tracks the process, not the logic. A real bug is deterministic: same input, same wrong output, because the logic is wrong in a stable way. A glitch is erratic and correlates with the generation, not the problem. It shows up on long outputs, it changes between runs of the identical request, it leaves things half-done in a way that no logic error would produce. When the failure looks like the tool ran out of steam rather than computed the wrong answer, you are looking at the glitch class.

The fix is not a better prompt or more debugging. It is changing the generation method to one that does not have the failure. For me that meant moving code changes out of the freeform chat and into delegation: instead of having the model edit files live and inline, I write a precise instruction, an exact file, an exact change, an exact test, and hand it to a tool built to execute edits reliably. The glitch class disappears not because I fixed a bug but because I stopped using the method that produced it.

This is why one of my hard rules is to never let the model edit production code directly inline. It is not superstition. It is a triage conclusion, reached by recognizing that a specific class of failure came from the process, not the code, and routing around the process instead of fighting the symptoms.

The takeaway: Some failures are the tool glitching mid-generation, not bugs in your code. They're erratic and track output length, not logic. Fix them by changing the generation method, like moving to delegation, not by debugging harder.