Build Journal

When Your AI Forgets What It Did

My AI agent built an entire blog website, then forgot it existed. Context windows fill up, sessions die, and work gets lost. Here's how I made my setup resilient to the most fundamental AI problem: amnesia.

2026-04-15 · 4 min read

The problem nobody warns you about

Your AI agent just built something brilliant. Files created, servers running, project registered. Then the context window fills up, the session gets compressed, and your agent has absolutely no idea what it just did.

This isn't a hypothetical. It happened to me today.

What went wrong

I asked Dade, my AI agent, to build a blog documenting the local AI journey. It:

Dug through session history and memory to find all the milestones
Created a full Next.js blog at /home/klb/local-ai-journal/ with 7 posts
Got it building and running on port 3001
Registered it as a project website in Mission Control

Then the context filled up. The session got compressed — not once, but four times. Each compression stripped more detail. By the end, the working memory was a compressed skeleton of task lists, and the actual knowledge of what had been built was gone.

When I came back and asked what happened, Dade searched its session history and found… nothing relevant. The blog session had been compressed so heavily that search couldn't match it. It was only by checking the daily memory logs — a system I'd built earlier specifically for this scenario — that I found what had actually happened.

Why context limits are the real enemy

Every AI agent has a context window — the amount of conversation it can "remember" at once. When it fills up, one of two things happens:

Truncation: The oldest messages get dropped entirely. You lose the beginning of the conversation.
Compression: An LLM summarises the conversation so far, sacrificing detail for brevity. You lose nuance.

Both are bad. But compression is sneakier because it looks like things are fine. The agent keeps responding. It just doesn't have the full picture anymore.

In my case, the compression preserved task list structure but lost the substance — what was built, where it lived, and what remained to be done.
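To make the difference concrete, here is a minimal Python sketch of the two strategies. Everything in it is an assumption for illustration: the word-count "tokenizer", the half-budget split, and the `fake_summary` stand-in for a real LLM summariser. It is not how any particular agent framework implements this.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def truncate(messages: List[str], budget: int) -> List[str]:
    """Keep the newest messages that fit in the budget; drop everything older."""
    kept: List[str] = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))


def compress(
    messages: List[str], budget: int, summarise: Callable[[List[str]], str]
) -> List[str]:
    """Replace older messages with one summary and keep the newest ones verbatim."""
    if sum(count_tokens(m) for m in messages) <= budget:
        return messages
    recent = truncate(messages, budget // 2)  # newest half-budget stays verbatim
    older = messages[: len(messages) - len(recent)]
    # Assumes the summary itself is short enough to fit in the remaining budget.
    return [summarise(older)] + recent


def fake_summary(msgs: List[str]) -> str:
    # Hypothetical summariser: a real one would call an LLM.
    return f"[summary of {len(msgs)} earlier messages]"


history = [
    "Created Next.js blog at /home/klb/local-ai-journal/ with 7 posts",
    "Blog is building and running on port 3001",
    "Registered the blog in Mission Control",
    "What should I do next?",
]

print(truncate(history, 12))
print(compress(history, 12, fake_summary))
```

With a 12-token budget, neither result still contains the blog's path or port number. The compressed version still looks like a complete conversation, which is exactly why it is harder to notice.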

My solution: the three-layer memory system

Here's what I built to make the setup resilient to context loss:

Layer 1: Persistent memory (the scratchpad)

The agent has a persistent memory store that survives across sessions. For every task, I now require it to:

Save a bullet point describing what it's about to do
Update on completion — mark the task done in memory
Check last memory on wake-up — detect if work was interrupted by context loss

This is small, curated, and always loaded. Think of it as the agent's "this is what I'm doing right now" notes.
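The three rules above could be sketched like this. It is a rough illustration, not the agent's actual memory store: the `TaskMemory` class, the JSON file layout, and the status names are all hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import List


class TaskMemory:
    """Hypothetical scratchpad: a small JSON file that outlives any one session."""

    def __init__(self, path: Path):
        self.path = path

    def _load(self) -> List[dict]:
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text())

    def _save(self, tasks: List[dict]) -> None:
        self.path.write_text(json.dumps(tasks, indent=2))

    def start(self, description: str) -> None:
        # Rule 1: write down what is about to happen before doing it.
        tasks = self._load()
        tasks.append({
            "task": description,
            "status": "in_progress",
            "started": datetime.now(timezone.utc).isoformat(),
        })
        self._save(tasks)

    def complete(self, description: str) -> None:
        # Rule 2: mark the task done once it has finished.
        tasks = self._load()
        for entry in tasks:
            if entry["task"] == description and entry["status"] == "in_progress":
                entry["status"] = "done"
        self._save(tasks)

    def interrupted(self) -> List[str]:
        # Rule 3: on wake-up, anything still in progress was probably cut off.
        return [e["task"] for e in self._load() if e["status"] == "in_progress"]


# Demo in a temporary directory so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    memory = TaskMemory(Path(tmp) / "memory.json")
    memory.start("Build blog at /home/klb/local-ai-journal/")
    memory.start("Register blog in Mission Control")
    memory.complete("Register blog in Mission Control")
    unfinished = memory.interrupted()

print(unfinished)
```

The key design choice is writing the marker before the work starts. If the session dies halfway through, the unfinished entry is still on disk for the next session to find.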

Layer 2: Daily memory logs (the journal)

A cron job runs every 30–60 minutes and scans all conversation sessions, producing:

A daily summary with session counts, top topics, conversation highlights
A channel-separated conversation log (Terminal, Telegram, Discord, Cron)
A dashboard linking everything together

When Dade lost context, it was these daily logs that saved me. They captured the fact that a blog had been built, the session ID that built it, and the key files involved.
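As a sketch of what such a job might produce, the snippet below groups sessions by channel and writes out their highlights with the session ID attached. The `Session` fields, the session IDs, and the output layout are assumptions for illustration; the real log format is not shown in this post. A crontab entry such as `*/30 * * * *` would run a script like this every 30 minutes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Session:
    session_id: str        # hypothetical ID format
    channel: str           # e.g. "Terminal", "Telegram", "Discord", "Cron"
    highlights: List[str]  # short notes on what happened in the session


def daily_summary(date: str, sessions: List[Session]) -> str:
    """Build a plain-text summary with a session count and per-channel highlights."""
    by_channel: Dict[str, List[Session]] = defaultdict(list)
    for session in sessions:
        by_channel[session.channel].append(session)

    lines = [f"Daily summary for {date}", f"Sessions: {len(sessions)}"]
    for channel in sorted(by_channel):
        lines.append(f"{channel}: {len(by_channel[channel])} session(s)")
        for session in by_channel[channel]:
            for highlight in session.highlights:
                # Keep the session ID next to each highlight so it can be looked up later.
                lines.append(f"  [{session.session_id}] {highlight}")
    return "\n".join(lines)


sessions = [
    Session("sess-001", "Terminal", ["Built Next.js blog at /home/klb/local-ai-journal/"]),
    Session("sess-002", "Telegram", ["Asked about blog status"]),
    Session("sess-003", "Terminal", []),
]
summary = daily_summary("2026-04-15", sessions)
print(summary)
```

Because each highlight carries its session ID and the file path, the log records the facts that a compressed context tends to lose.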

Layer 3: Session search (the archive)

The agent can search its full conversation history across all sessions. This is the deep archive — everything ever said, searchable by keyword.

The problem? It only works if you know what to search for. After context loss, you might not remember the right terms. That's why Layers 1 and 2 are critical — they give you the entry points to find what you need in Layer 3.
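A toy keyword search shows why the entry points matter. The `search_sessions` function, the archive contents, and the session IDs below are hypothetical; a real archive would be far larger and probably indexed.

```python
from typing import Dict, List


def search_sessions(sessions: Dict[str, str], query: str) -> List[str]:
    """Return the IDs of sessions whose text contains every keyword in the query."""
    keywords = query.lower().split()
    return [
        session_id
        for session_id, text in sessions.items()
        if all(keyword in text.lower() for keyword in keywords)
    ]


archive = {
    "sess-001": "Created a Next.js blog at /home/klb/local-ai-journal/ on port 3001",
    "sess-002": "[compressed] task list: milestones gathered, site registered",
}

vague_hits = search_sessions(archive, "website project")
precise_hits = search_sessions(archive, "local-ai-journal")
print(vague_hits)
print(precise_hits)
```

The vague query finds nothing, because neither session uses those exact words. The directory name, which is the kind of detail a daily log preserves, matches the right session straight away.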

The wake-up protocol

After this incident, I formalised a wake-up protocol. When the agent starts a new session (or recovers from context loss):

Check persistent memory for any incomplete task markers
Read today's daily log for recent activity context
Search session history if memory hints at lost work
Report status before asking the user what to do next

Previously, the agent would just ask "what do you need?" which is useless if it was in the middle of something. Now it proactively checks whether it has unfinished business.
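The four steps could be sketched as a single function. This is only an outline under stated assumptions: `wake_up`, its parameters, and the report wording are hypothetical, and the memory, log, and search layers are passed in as plain values so the example stays self-contained.

```python
from typing import Callable, List


def wake_up(
    incomplete_tasks: List[str],
    todays_log: List[str],
    search: Callable[[str], List[str]],
) -> str:
    """Build a status report to show the user before asking what to do next."""
    report: List[str] = []

    # Step 1: check persistent memory for tasks that were started but never finished.
    if incomplete_tasks:
        report.append("Unfinished tasks:")
        report.extend(f"- {task}" for task in incomplete_tasks)
    else:
        report.append("No unfinished tasks.")

    # Step 2: read today's daily log for recent activity.
    if todays_log:
        report.append("Recent activity:")
        report.extend(f"- {entry}" for entry in todays_log)

    # Step 3: search session history only when memory hints at lost work.
    for task in incomplete_tasks:
        hits = search(task)
        if hits:
            report.append(f"Sessions related to '{task}': {', '.join(hits)}")

    # Step 4: report status first, then hand control back to the user.
    report.append("What would you like to do next?")
    return "\n".join(report)


status = wake_up(
    incomplete_tasks=["Build blog"],
    todays_log=["Blog running on port 3001"],
    search=lambda query: ["sess-001"],  # stand-in for a real session search
)
print(status)
```

The question to the user comes last on purpose: by the time it is asked, the user has already seen what the agent believes it was doing.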

The cost of forgetting

Context loss isn't just annoying. In my case:

The agent spent significant tokens trying to figure out what happened
I spent time explaining what should have been self-evident
Work that was "done" became effectively invisible until manually recovered
Trust takes a hit — if your AI forgets what it built, how do you rely on it?

The irony? The solution itself, the daily memory system, was something I had built specifically because I anticipated this problem (thank you, https://x.com/AlexFinn). Building the safety net before I needed it is what saved me, and what will save you. But I had not gone far enough: until this incident, I had not added the wake-up protocol or the mandatory task-tracking in persistent memory.

What this means for local AI

If you're running AI agents on local hardware, context limits are even more pressing:

Local models often have smaller context windows (8K–32K vs 128K+ for cloud)
Longer tasks fill context faster
You can't just "throw more tokens at it" — you need better memory hygiene
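A back-of-the-envelope calculation shows how quickly a small window runs out. The numbers here are illustrative assumptions, not measurements: 1,500 tokens per turn, 2,000 tokens reserved for the system prompt and always-loaded memory, and round window sizes of 8,000, 32,000, and 128,000 tokens.

```python
def turns_until_full(context_tokens: int, tokens_per_turn: int, reserved: int = 0) -> int:
    """Estimate how many turns fit before the context window is full."""
    usable = context_tokens - reserved
    return max(0, usable // tokens_per_turn)


# Assumed figures for illustration only.
estimates = {
    window: turns_until_full(window, tokens_per_turn=1_500, reserved=2_000)
    for window in (8_000, 32_000, 128_000)
}

for window, turns in estimates.items():
    print(f"{window:>7} tokens: about {turns} turns")
```

Under these assumptions an 8,000-token window fills after about 4 turns, against about 84 for a 128,000-token window. A long build task on a small local model will therefore hit compression many times over.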

The lesson isn't that AI agents are unreliable. It's that you need to design for amnesia. Build journals. Build checkpoints. Build protocols. Assume your agent will forget everything at the worst possible moment, and make sure it can recover gracefully when it does.

That's exactly what I just did.

Found this useful? Follow @Raf_VRS for more from the VRS Computing trenches — where local AI meets the real world.

Support independent tech writing: ko-fi.com/rafvrs

Stop Scrolling. Start Building. #LocalAI #AIAgents #MemorySystems #VRSComputing