Context will fill up; you need a way to make room

What You’ll Learn

  • Why unbounded context growth breaks agents
  • How to implement a three-layer compaction strategy
  • How to preserve decision rationale while discarding verbatim output

The Problem

After 100 tool calls, the messages array holds thousands of tokens of stale bash output. Every request re-sends that entire history, so the model runs slower, hits context limits, and loses the thread.

The Solution

Three-layer compaction: summarize long tool outputs, archive old turns as structured notes, and keep the most recent turns verbatim.

Layer 1 (recent):     Last N turns, verbatim        --> model can see
Layer 2 (compressed): Old turns as JSON summaries   --> injected as context
Layer 3 (archived):   Everything                    --> written to disk
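
One way to hold the three layers in code — as a minimal sketch only; ContextLayers and its field names are illustrative, not taken from the repo:

from dataclasses import dataclass, field

@dataclass
class ContextLayers:
    recent: list = field(default_factory=list)      # Layer 1: verbatim turns
    compressed: list = field(default_factory=list)  # Layer 2: structured summaries
    archive_path: str = "context_archive.jsonl"     # Layer 3: full history on disk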

How It Works

  1. Track token usage. When it exceeds a threshold, trigger compaction (sketched right after this list).

  2. Summarize old tool results: replace 5000 lines of test output with "tests/ passed 42/42" (second sketch below).

  3. Inject the compressed context as a system-like message; the compact function below does exactly this.
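
A minimal sketch of step 1, assuming a character-count proxy for tokens (real agents usually read the provider's reported usage or run a tokenizer). The threshold and the count_tokens/maybe_compact names are illustrative; compact is the function shown after step 3:

COMPACTION_THRESHOLD = 80_000  # illustrative token budget; tune per model

def count_tokens(messages):
    # Rough proxy: ~4 characters per token of English text.
    return sum(len(str(m.get("content", ""))) for m in messages) // 4

def maybe_compact(messages, keep_recent=10):
    # Step 1: once the estimate crosses the threshold, split history
    # into old turns (to compress) and recent turns (kept verbatim).
    if count_tokens(messages) > COMPACTION_THRESHOLD:
        return compact(messages[:-keep_recent], messages[-keep_recent:])
    return messages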

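Step 2 is sketched below as simple truncation into role-tagged one-line notes; a production agent might instead ask the model itself to write the summary. This summarize_turns body is an assumption — the section only shows it being called:

def summarize_turns(messages):
    # Step 2: condense old turns into structured one-line notes --
    # e.g. 5000 lines of test output becomes "tests/ passed 42/42".
    notes = []
    for m in messages:
        content = str(m.get("content", ""))
        first_line = content.splitlines()[0] if content else "(empty)"
        notes.append(f"- {m.get('role', '?')}: {first_line[:120]}")
    return "\n".join(notes)

Step 3 re-injects the result as a single message at the head of the history:
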
def compact(old_messages, new_messages):
    # Condense the archived turns into one structured note (Layer 2)
    # and re-inject it ahead of the verbatim recent turns (Layer 1).
    # Writing old_messages to disk (Layer 3) would also happen here.
    summary = summarize_turns(old_messages)
    return [
        {"role": "user",
         "content": f"[COMPACTED CONTEXT]\n{summary}"},
        *new_messages,
    ]
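
A hypothetical end-to-end run, wiring the sketches together (message shapes follow the role/content dicts used above):

history = []
for _ in range(200):                # simulate a long run of noisy tool calls
    history.append({"role": "tool", "content": "FAILED test_x\n" * 500})

history = maybe_compact(history, keep_recent=10)
# history is now one [COMPACTED CONTEXT] message plus the last 10 turns.

Because summarize_turns keeps role-tagged notes rather than raw output, decision rationale survives compaction even though the verbatim text does not.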

What Changed From s05

Component        Before (s05)       After (s06)
---------------  -----------------  -----------------------
Context          Grows unbounded    Three-layer compaction
Token tracking   None               Threshold-based trigger
Archiving        None               Structured summaries

Try It

cd learn-claude-code
python agents/s06_context_compact.py

Try prompts that generate enough output to trigger compaction:

  1. Read every Python file in this project and tell me what's wrong
  2. Run the test suite 10 times and summarize the results
  3. Explore the entire codebase and create an architectural overview

Key Takeaway

Compaction isn’t deleting history — it’s relocating detail to make room for the agent’s next thought.