Context engineering · Agent handoff

Autocompaction is not memory

Built-in context compaction helps an individual agent survive a long chat. A local handoff control plane helps a team preserve the operational state that the next agent needs to act safely.

Published May 26, 2026 · Gregory Shevchenko · Humanswith.ai

Summary: Autocompaction is a session-level compression feature, not a durable memory system. Larger context windows and automatic chat summaries delay context failure, but they do not make state portable, auditable, or shared across Claude Code, Codex, Cursor, Windsurf, a remote Mini, and background MCP workflows.[1][2]

What built-in autocompaction does well

Long-running coding agents already summarize. That is useful. A product-level compaction pass can keep a conversation moving after the raw context becomes too large.

But the summary is usually local to one product session. It is optimized for keeping the current conversation alive, not for handing the work to a different agent surface with approvals, risks, and exact values intact.

That distinction matters because the failure is rarely “the agent forgot everything.” The practical failure is narrower and more dangerous: the next agent remembers a plausible story, but loses which branch is active, which URL is canonical, which credential must not be printed, which proof already passed, or which user decision was still pending.

The missing layer is operational handoff

A team workflow needs something stricter than a narrative summary after context pressure is already high. It needs a handoff contract that preserves what the next agent must know before it acts.[2]

For us, that means a local handoff MCP that can preserve:

  • the objective and done condition;
  • loaded instructions and constraints;
  • approval state;
  • exact values that must not drift;
  • risks and red flags;
  • actions already taken;
  • errors and fixes;
  • the next recommended step;
  • what the next agent should not redo.

This is not “better summarization” as a writing task. It is a control-plane problem. The handoff should be compact enough to fit into the next session, structured enough to be checked, and explicit enough that a fresh agent can continue without re-reading the entire chat.
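A minimal sketch of what "structured enough to be checked" could mean in practice. The class name `HandoffContract`, its field names, and the three approval states are assumptions made for illustration; they mirror the list above and do not describe an actual MCP schema.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical approval vocabulary; a real workflow may need more states.
ALLOWED_APPROVAL_STATES = {"granted", "denied", "pending"}


@dataclass
class HandoffContract:
    """Illustrative contract covering the items a handoff should preserve."""
    objective: str
    done_condition: str
    constraints: list[str] = field(default_factory=list)
    approvals: dict[str, str] = field(default_factory=dict)     # decision -> state
    exact_values: dict[str, str] = field(default_factory=dict)  # values that must not drift
    risks: list[str] = field(default_factory=list)
    actions_taken: list[str] = field(default_factory=list)
    errors_and_fixes: list[str] = field(default_factory=list)
    next_step: str = ""
    do_not_redo: list[str] = field(default_factory=list)


def validate(contract: HandoffContract) -> list[str]:
    """Return a list of problems; an empty list means the contract is checkable."""
    problems = []
    if not contract.objective.strip():
        problems.append("objective is empty")
    if not contract.done_condition.strip():
        problems.append("done condition is empty")
    if not contract.next_step.strip():
        problems.append("next step is empty")
    for decision, state in contract.approvals.items():
        if state not in ALLOWED_APPROVAL_STATES:
            problems.append(f"approval '{decision}' has unknown state '{state}'")
    return problems


def to_json(contract: HandoffContract) -> str:
    """Serialize to stable JSON so any agent surface can read the same state."""
    return json.dumps(asdict(contract), indent=2, sort_keys=True)
```

The point of the design is that a missing done condition or an ambiguous approval is rejected by a check, not discovered by the next agent halfway through a deploy.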

Autocompaction vs local handoff MCP

| Capability | Built-in autocompaction | Local handoff MCP |
| --- | --- | --- |
| Scope | Usually one product session | Shared workspace protocol |
| Timing | Often after context pressure is high | Pre-score before the window is full |
| State | Narrative summary | Operational contract with approvals and risks |
| Portability | Product-specific | Claude Code, Codex, Cursor, and Windsurf vocabulary |
| Measurement | Usually opaque | Resume success, re-read rate, token estimate, and leak rate |

Why 1M context is not enough

A 1M context window is valuable. It lets the agent keep more code, logs, source material, and prior reasoning available before it needs compression.

But it does not solve the coordination problem. A larger window delays the failure mode. It does not automatically mark which facts are trusted, which approvals were granted, which values must not change, or what a fresh agent should avoid redoing.

The real product is not compression. It is continuity. Long context is storage. Memory is the ability to resume the right work, with the right constraints, through the right proof loop, without accidentally changing the operating rules.

That is why I treat handoff design as part of agentic engineering rather than note-taking. If Claude Code, Codex, Cursor, Windsurf, n8n, and MCP services all participate in one workflow, the shared memory layer has to be legible to every surface that may pick up the task.[2]

The mechanism I want in the workflow

  1. Pre-score before the context window is full.
  2. Trigger red flags when approvals, secrets, live ops, or exact values are at risk.
  3. Write a structured handoff contract, not a vibe summary.
  4. Resume with a blind gate: can a fresh agent continue from the handoff alone?
  5. Measure resume success, re-read rate, token estimate, and leak rate.
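Steps 1 and 2 can be sketched as a small scoring function. Everything here is an assumption for illustration: the regular expressions, the 0.15 bump per red flag, and the 0.6 threshold are placeholder values, not tuned numbers from a real deployment.

```python
import re

# Hypothetical patterns; a real detector would be stricter and workflow-specific.
RED_FLAG_PATTERNS = {
    "approval": re.compile(r"\b(approve|approved|approval|sign[- ]off)\b", re.I),
    "secret": re.compile(r"\b(api[_ -]?key|token|password|secret)\b", re.I),
    "live_ops": re.compile(r"\b(deploy|purge cache|production|submit indexing)\b", re.I),
    "exact_value": re.compile(r"https?://\S+"),
}


def red_flags(messages: list[str]) -> set[str]:
    """Return the names of red-flag categories that appear in the messages."""
    found = set()
    for message in messages:
        for name, pattern in RED_FLAG_PATTERNS.items():
            if pattern.search(message):
                found.add(name)
    return found


def pre_score(used_tokens: int, window_tokens: int, flags: set[str]) -> float:
    """Score handoff urgency from 0 to 1: context pressure plus a bump per red flag."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    pressure = min(used_tokens / window_tokens, 1.0)
    return min(pressure + 0.15 * len(flags), 1.0)


def should_write_handoff(score: float, threshold: float = 0.6) -> bool:
    """Trigger the handoff before the window is full, not after."""
    return score >= threshold
```

With these placeholder numbers, a session at half of its window that has touched an approval and a production deploy would already trigger a handoff, while the same session with no red flags would not.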

The important part is the gate, not the prose. A good handoff should answer: can a new agent identify the objective, avoid forbidden actions, preserve exact values, choose the next safe step, and run the right proof loop without asking the human to reconstruct the past hour?

How do you test whether memory works?

I do not want to judge this by vibe. A handoff should be tested the same way a production workflow is tested: with a fresh run, a clear acceptance bar, and evidence that the failure mode did not repeat.[3]
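One way to make the acceptance bar concrete is to aggregate resume success, re-read rate, token estimate, and leak rate over several fresh runs. The `ResumeRun` record and the default thresholds in `meets_bar` are hypothetical; each team would set its own bar.

```python
from dataclasses import dataclass


@dataclass
class ResumeRun:
    """Outcome of one fresh-agent resume attempt (illustrative fields)."""
    resumed_ok: bool           # the agent completed the next step safely
    reread_full_session: bool  # the agent had to re-read the prior chat
    handoff_tokens: int        # estimated size of the handoff
    leaked_secret: bool        # a protected value appeared in output


def summarize(runs: list[ResumeRun]) -> dict[str, float]:
    """Aggregate the four measurements over a batch of runs."""
    if not runs:
        raise ValueError("need at least one run")
    n = len(runs)
    return {
        "resume_success": sum(r.resumed_ok for r in runs) / n,
        "reread_rate": sum(r.reread_full_session for r in runs) / n,
        "mean_handoff_tokens": sum(r.handoff_tokens for r in runs) / n,
        "leak_rate": sum(r.leaked_secret for r in runs) / n,
    }


def meets_bar(metrics: dict[str, float], min_success: float = 0.9,
              max_reread: float = 0.1, max_leak: float = 0.0) -> bool:
    """Hypothetical acceptance bar; the thresholds are placeholders."""
    return (metrics["resume_success"] >= min_success
            and metrics["reread_rate"] <= max_reread
            and metrics["leak_rate"] <= max_leak)
```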

The simplest test is a blind resume check. Give a fresh agent only the handoff, the repo, and the allowed tools. If it can continue the work safely, the handoff is doing operational memory. If it has to re-read the entire previous session, guess user intent, or ask for already-decided facts, the handoff is still a summary.
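The blind resume check can be reduced to a deterministic comparison between the handoff and what the fresh agent reports before it acts. This is a sketch under assumptions: the dictionary keys (`exact_values`, `do_not_redo`, `planned_actions`, and the two boolean flags) are invented for the example, and a real check would also need a way to collect the agent's report.

```python
def blind_resume_check(handoff: dict, agent_report: dict) -> list[str]:
    """Return failures; an empty list means the fresh agent resumed from the handoff alone."""
    failures = []

    # Exact values must come back unchanged, character for character.
    reported = agent_report.get("exact_values", {})
    for key, expected in handoff.get("exact_values", {}).items():
        if reported.get(key) != expected:
            failures.append(f"exact value drifted: {key}")

    # The agent must not plan work that the handoff marks as already done.
    planned = set(agent_report.get("planned_actions", []))
    for action in sorted(planned & set(handoff.get("do_not_redo", []))):
        failures.append(f"would redo: {action}")

    # Falling back to the old session or the human means the handoff is still a summary.
    if agent_report.get("reread_full_session"):
        failures.append("agent re-read the entire previous session")
    if agent_report.get("asked_for_decided_facts"):
        failures.append("agent asked for already-decided facts")

    return failures
```

A report that changes one character in a canonical URL fails this check even if the agent's narrative of the task sounds right, which is exactly the failure a summary tends to hide.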

For content and AI Search workflows, the same logic applies. A canonical page, its distribution map, source list, schema state, and next publishing action should survive a handoff without the next session accidentally breaking canonical-first logic.[4]
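For a publishing workflow, the same idea might look like the sketch below. It assumes that "canonical-first" means the canonical page must be live before any distribution copy is; the `PublishingState` fields and the example.com URLs in the usage are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PublishingState:
    """Illustrative publishing state that should survive a handoff."""
    canonical_url: str
    canonical_live: bool
    # platform name -> live URL, or None if not yet published
    distribution: dict[str, Optional[str]] = field(default_factory=dict)
    schema_updated: bool = False
    next_action: str = ""


def canonical_first_violations(state: PublishingState) -> list[str]:
    """Flag states where a copy went live before the canonical page (assumed rule)."""
    problems = []
    live_copies = sorted(p for p, url in state.distribution.items() if url)
    if live_copies and not state.canonical_live:
        problems.append("copies live before canonical page: " + ", ".join(live_copies))
    if not state.next_action.strip():
        problems.append("next publishing action is missing")
    return problems


def remaining_platforms(state: PublishingState) -> list[str]:
    """Platforms that still need a distribution copy."""
    return sorted(p for p, url in state.distribution.items() if not url)
```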

Where this matters most

Autocompaction feels like an agent productivity feature. In practice, it becomes a business reliability concern once agents touch live websites, publishing workflows, analytics, credentials, deploys, or external platforms.

The risky moments are predictable:

  • the agent is about to deploy, purge cache, or submit indexing;
  • the task spans several public surfaces such as Medium, LinkedIn, DEV.to, Habr, X, and the canonical site;
  • the conversation contains credentials, tokens, payment decisions, or irreversible admin actions;
  • the quality bar depends on proof artifacts rather than a subjective “looks good”;
  • a new chat or another IDE must continue the same task without drifting from the original plan.

Those are not contexts where “the model probably remembers enough” is an acceptable control. The handoff needs the same discipline as any other agent gate: explicit contract, deterministic checks where possible, and a stop condition when confidence drops.
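As a sketch of that discipline, a gate in front of risky actions could combine a deterministic approval check with a stop condition on confidence. The action names, the `"granted"` approval state, and the 0.8 confidence threshold are assumptions for illustration, and how confidence is estimated is left open.

```python
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    STOP = "stop"


# Hypothetical action names for irreversible or live operations.
RISKY_ACTIONS = {"deploy", "purge_cache", "submit_indexing"}


def gate(action: str, approvals: dict[str, str], confidence: float,
         min_confidence: float = 0.8) -> tuple[Decision, str]:
    """Decide whether an agent may act, and say why."""
    # Deterministic check first: risky actions need a recorded approval.
    if action in RISKY_ACTIONS and approvals.get(action) != "granted":
        return Decision.STOP, f"no recorded approval for {action}"
    # Stop condition: do not act when confidence drops below the bar.
    if confidence < min_confidence:
        return Decision.STOP, "confidence below threshold"
    return Decision.PROCEED, "all checks passed"
```

The approval check runs before the confidence check on purpose: a confident agent without a recorded approval should still stop.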

Frequently asked questions

Is autocompaction bad?

No. Autocompaction is useful for keeping a long conversation alive. The problem starts when teams treat it as durable operational memory across tools, agents, branches, approvals, and deployments.

Why is a 1M context window not enough?

A larger context window stores more material, but it does not automatically label what is trusted, approved, risky, stale, or already completed. Memory needs structure and proof, not only capacity.

What is a local handoff MCP?

It is a local control-plane service that preserves task state in a format another agent surface can consume: objective, constraints, loaded instructions, decisions, exact values, risks, completed work, proof, and the next safe action.

What should a handoff preserve?

At minimum: the done condition, approval state, active branch or surface, files touched, live URLs, credential boundaries, proof commands, failed attempts, and the “do not redo” list.

How do you test a handoff?

Run a blind resume check. A fresh agent should continue from the handoff without seeing the full prior chat, preserve the constraints, and pass the same deterministic gates.

How does this relate to AI Search and content workflows?

Canonical-first publishing depends on continuity. The next agent must know which site page is the source of record, which distribution links are already live, which schema changed, and which indexing or visible-link gates remain.

Sources
