// Externalize your memory · lesson 01

The context window fills, and then it forgets

updated 2026.07.05 // field-manual

The foundations track covered why a model degrades in a long conversation. This track is about what you do about it, and it starts with accepting the constraint as a design fact rather than a surprise: the context window is finite, it fills as you work, and the fuller it gets the less reliable the model becomes. If you build as though the model remembers everything, the forgetting will find you at the worst moment.

Here is the practical shape of it. Early in a session the model has room and its attention is sharp. As the conversation grows, two things happen at once. First, you approach a hard limit, beyond which the oldest content falls out of the window (most tools truncate or summarize it) and is gone as far as the model is concerned. Second, well before that limit, attention thins: the model is spreading a fixed budget of focus across an ever-larger pile of tokens, so the middle of a long context gets vague even while it technically still fits. Both mean the same thing for you: the memory you are counting on is quietly decaying as you work.
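To make the hard limit concrete, here is a toy sketch of a window that evicts its oldest messages once a token budget is exceeded. Everything in it is an assumption for illustration: `ToyWindow` and `count_tokens` are hypothetical names, words stand in for tokens, and oldest-first eviction is only one of several policies a real tool might use. It does not model the thinning of attention, only the falling-out.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


class ToyWindow:
    """Hypothetical fixed-budget window that drops the oldest messages first."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.messages = deque()
        self.used = 0

    def add(self, message: str) -> list:
        """Append a message and return whatever was evicted to make room."""
        self.messages.append(message)
        self.used += count_tokens(message)
        evicted = []
        # Evict from the front until we fit again, always keeping the newest message.
        while self.used > self.max_tokens and len(self.messages) > 1:
            oldest = self.messages.popleft()
            self.used -= count_tokens(oldest)
            evicted.append(oldest)
        return evicted


window = ToyWindow(max_tokens=12)
window.add("Decision: use SQLite for storage")            # 5 words, 5 used
window.add("Spec: exports must be CSV only")              # 6 words, 11 used
dropped = window.add("Now refactor the importer module")  # 5 words, over budget
print(dropped)  # ['Decision: use SQLite for storage']
```

Note what got dropped: the earliest decision, which is often the one everything later depends on. Nothing in the window marks it as important, so it goes first.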

What does accepting this constraint change about how you build?

It flips your default from "the thread remembers" to "I own the state, the thread is disposable." Once you truly internalize that the window will fill and forget, you stop trusting it with anything you cannot afford to lose. Important decisions, the current spec, the state of the build, none of that lives only in the chat, because the chat is a workspace with a leak in it. It lives in files you control, and the thread becomes a place you do work, not a place you store truth.
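One way to picture "I own the state" is a small file that holds the decisions, the spec, and the build status, and that any new thread can be briefed from. The sketch below is a minimal illustration under assumptions: the file name `project_state.json`, the three fields, and the helper names `save_state`, `load_state`, and `as_briefing` are all hypothetical, and plain JSON is just one convenient format.

```python
import json
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    # Write to a temporary file, then swap it in, so a crash mid-write
    # does not leave a half-written state file behind.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def load_state(path: Path) -> dict:
    # A missing file means a fresh project: start from an empty state.
    if not path.exists():
        return {"decisions": [], "spec": "", "build_status": ""}
    return json.loads(path.read_text(encoding="utf-8"))


def as_briefing(state: dict) -> str:
    # Render the state as text you could paste at the top of a new thread.
    lines = [
        "Current spec: " + state["spec"],
        "Build status: " + state["build_status"],
        "Decisions so far:",
    ]
    lines += ["- " + decision for decision in state["decisions"]]
    return "\n".join(lines)


# Demo in a throwaway directory; in real use the file would live in your project.
with tempfile.TemporaryDirectory() as workdir:
    state_file = Path(workdir) / "project_state.json"

    state = load_state(state_file)
    state["decisions"].append("Use SQLite for storage")
    state["spec"] = "Exports must be CSV only"
    state["build_status"] = "importer passing, exporter not started"
    save_state(state_file, state)

    # Later, in a brand-new thread: rebuild the context from the file, not the chat.
    briefing = as_briefing(load_state(state_file))

print(briefing)
```

The design choice that matters here is the direction of flow: the file is written during the work and read at the start of the next thread, so losing the chat costs you nothing you recorded.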

This is the reframe the whole track hangs on. People treat the conversation as memory because it feels like memory in the moment: it is right there, and you can scroll up. But scrolling up is you doing the remembering, not the model. Inside the window, the model's grip on the early material is already loosening, and past the limit it is not there at all. The chat is short-term working memory that degrades under load, and treating it as long-term storage is the root mistake this track exists to fix.

Everything that follows, the files, the save points, the curation, is a response to this one fact. The window fills and forgets, so you build a memory that does not.

The takeaway: A model's working memory is finite and degrades as it fills, so treat the thread as disposable workspace, not storage. Own your state in files, because the window will forget exactly when you need it most.