We ran something similar for a browser automation project - multiple agents working on different modules in parallel with shared markdown specs. The bottleneck wasn't the agents, it was keeping their context from drifting. Each tmux pane has its own session state, so you end up with agents that "know" different versions of reality by the second hour.

The spec file helps, but we found we also needed a short shared "ground truth" file the agents could read before taking any action - basically a live snapshot of what's actually done vs what the spec says. Without it, two agents would sometimes solve the same problem in incompatible ways.

Has anyone found a clean way to sync context across parallel sessions without just dumping everything into one massive file?

This maps closely to something we've been exploring in our recent paper. The core issue is that flat context windows don't organize information scalably, so as agents work in parallel they lose track of which version of 'reality' applies to which component. We proposed NERDs (Networked Entity Representation Documents), Wikipedia-style docs that consolidate all info about a code entity (its state, relationships, recent changes) into a single navigable document, corss-linked with other other documents, that any agent can read. The idea is that the shared memory is entity-centered rather than chronological. Might be relevant: https://www.techrxiv.org/users/1021468/articles/1381483-thin...

I’ve been using Steve Yegge’s Beads[1] lightweight issue tracker for this type of multi-agent context tracking.

I only run a couple of agents at a time, but with Beads you can create issues, then agents can assign them to themselves, etc. Agents or the human driver can also add context in epics, and I think you can have perpetual issues which contain context too. Or could make them as a type of issue yourself, it’s a very flexible system.

[1] https://github.com/steveyegge/beads

Beads has been on my list to try. I can see it being a natural evolution of my setup

I avoid this with one spec = one agent, with worktrees if there is a chance of code clashing. Not ideal for parallelism though.

The worktree approach is interesting - keeps the filesystem separation clean. The parallelism tradeoff makes sense if the tasks are truly independent, which in practice is most of the time anyway.

What does your spec file look like when you kick off a new agent? Curious if you start from scratch each time or carry over context from previous sessions on the same project.

I describe this in the article - I mostly kick off a new agent per spec both for Planners and Workers. I do tend to run /fd-explore before I start work on a given spec to give the agent context of the codebase and recent previous work

I've been building agent-doc [1] to solve exactly this. Each parallel Claude Code session gets its own markdown document as the interface (e.g., tasks/plan.md, tasks/auth.md). The agent reads/writes to the document, and a snapshot-based diff system means each submit only processes what changed — comments are stripped, so you can annotate without triggering responses.

The routing layer uses tmux: `agent-doc claim`, `route`, `focus`, `layout` commands manage which pane owns which document, scoped to tmux windows. A JetBrains plugin lets you submit from the IDE with a hotkey — it finds the right pane and sends the skill command.

For context sync across agents, the key insight was: don't sync. Each agent owns one document with its own conversation history. The orchestration doc (plan.md) references feature docs but doesn't duplicate their content. When an agent finishes a feature, its key decisions get extracted into SPEC.md. The documents ARE the shared context — any agent can read any document.

It's been working well for running 4-6 parallel sessions across corky (email client), agent-doc itself, and a JetBrains plugin — all from one tmux window with window-scoped routing.

[1] https://github.com/btakita/agent-doc

The "don't sync, own" model makes a lot of sense. We were thinking about it wrong - trying to push state out to a shared file, when the cleaner move is to pull it in on demand.

The SPEC.md as the extraction target after a feature is done is a nice touch. In our case the tricky part is that browser automation state is partly external - you have sessions, cookies, proxy assignments that live outside the codebase. So the "ground truth" we needed wasn't just about code decisions but about runtime state too. Ended up logging that separately.

Checking out agent-doc, the snapshot-based diff to avoid re-triggering on comments is clever. Does it handle cases where two agents edit the same doc around the same time, or is the ownership model strict enough that this doesn't come up?