The Pain
You had two clean ideas and one afternoon, so you ran them at once. One agent was refining the demand-elasticity prep; the other was building the trip-duration robustness checks. Same repository, because it was faster, and what could collide — they were working on different parts of the analysis. For twenty minutes nothing did.
Then both reached for src/build_panel.py, because both needed the
panel and neither knew the other existed. The first agent rewrote the
zone join; the second, a beat later, rewrote the same function for
duration weighting and saved over it. Git, asked to hold two
incompatible edits to one file from two writers who never spoke,
did the only thing it could: it thrashed. A merge conflict you did not
author, in a file neither of you finished, while a third write — a
results file from the first run — landed on top of the second run’s
half-written output. You spent the rest of the afternoon not doing
either analysis but disentangling them, reading diffs to reconstruct
which agent meant what, and you got it wrong once and had to do it
twice.
A real lab does not seat two researchers at one desk and one keyboard. Each gets their own workspace, their own copy of the shared materials, and the work is combined deliberately, by someone whose job is to combine it — not by collision. The parallelism was never the problem. The shared desk was.
Why / When
The moment two agents work the same repository at once, they trample each other: half-written transforms collide, results overwrite, git thrashes on edits no single author made. The mechanism a lab uses to prevent this is the same one a version-control system already offers — a separate working directory per worker, backed by the same shared history, combined through ordinary review. Each agent gets its own checkout; nobody writes where anybody else is writing; merges happen on purpose, by a human acting as referee.
This is the load-bearing mechanic for everything at scale that follows: the overnight runs in D3 each want their own tree so a 3 a.m. failure is a deletable directory, and the fleets in D4 want isolation per code-touching variant. It earns its own lesson because getting it wrong is not a style problem — it is the lost afternoon in the Pain vignette, and it scales with the number of hands. The discipline accelerates nothing on its own; what it does is make parallelism safe, which is the only thing that makes parallelism worth doing.
Contrary winds
Not for: agents that only ever write results — never code — under a shared contract: a manifest with one-file-per-run rules lets them share one tree safely, and a worktree each is then just ceremony (the D4 pattern).
Mechanics
Field note
There is nothing language-specific here: worktrees are a git mechanic, and the agents inside them may write Python or R without changing a word of this page. That is why it declares no R variants.
The mechanic
A git worktree is a second working directory attached to the same repository: one shared object store and history, but each worktree has its own checked-out files and its own branch. You create one with a single command, and the agent (or session) that works there cannot touch another worktree’s files because they are literally a different directory:

    # from the main checkout, give each workstream its own tree + branch
    git worktree add ../weather-mobility-w1 -b w1-elasticity
    git worktree add ../weather-mobility-w2 -b w2-duration
    git worktree list   # the main tree plus the two new desks
    # … work happens in each independently; combine through review:
    git switch main && git merge w1-elasticity   # deliberate, reviewed

This beats the obvious alternative, copying the whole folder twice,
on every axis that matters: the worktrees share history, so a
commit in one is visible to all and there is no re-syncing; they are
cheap, sharing the object store rather than duplicating it; and they
force a disciplined merge, because combining work means a real git
merge a human reviews, not a file-copy nobody audited. The one case
where you do not need them is the Not for case above: agents that only
write results under a contract never touch shared code, so they can
share one tree; the manifest, not the worktree, is doing the isolation
there (D4).
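To see the shared history concretely, here is a minimal sketch that builds a throwaway repository, adds the two worktrees, commits in one, and reads that commit from the other. The temp directory, the `notes.txt` file, and the demo identity are assumptions made for illustration; only the `git worktree` commands come from this page.

```shell
#!/bin/sh
# Sketch: two worktrees share one object store, so a commit made in one
# is visible from the other without any fetch or copy.
# Everything lives in a throwaway temp directory (hypothetical paths).
set -eu

demo=$(mktemp -d)
git init -q "$demo/main"
cd "$demo/main"
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" on any git version
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"

# One tree and one branch per workstream.
git worktree add ../w1 -b w1-elasticity >/dev/null 2>&1
git worktree add ../w2 -b w2-duration  >/dev/null 2>&1

# Commit inside the first worktree only.
cd ../w1
echo "zone join v2" > notes.txt
git add notes.txt
git commit -q -m "w1: refine zone join"

# From the second worktree the commit is already reachable by branch name,
# while the file itself is absent from this tree's working directory.
cd ../w2
subject=$(git log -1 --format=%s w1-elasticity)
tree_count=$(git worktree list | wc -l | tr -d ' ')
echo "w2 sees: $subject ($tree_count trees)"
```

The history is shared but the working files are not: `notes.txt` exists only under `w1` until someone merges it.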
You want two agents working at once — A on the cleaning transforms, B on the figures. The obvious move is to point both at the same checkout and let them go. One repo, one working tree, two sets of hands.
The two tools reach the same isolation by different routes — one through local primitives you compose, one through a managed multitasking model — so this is a dual treatment, not a tab. Neither hides; the contrast is instructive.
Claude Code
A session per worktree, and agents spawned into one
The recommended pattern is one claude session per worktree: open the
elasticity tree in one, the duration tree in another, and they cannot
collide because each is rooted in a different directory. The desktop app
runs these as parallel sessions across worktrees side by side.
Beyond hand-driven sessions, a subagent or workflow can be spawned directly into a fresh worktree — the orchestrator creates the tree, runs the agent there, and auto-cleans the worktree if the agent left it unchanged, so a survey that produced nothing leaves no litter. This is the primitive D4’s fleet stands on: each code-touching variant gets its own worktree, created and disposed of by the workflow, isolated from the others by construction. The composition — worktree, plus agent, plus auto-cleanup — is something you assemble from pieces, which is the Claude Code shape throughout.
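The create, run, and clean-up-if-unchanged composition can be approximated with plain git. The sketch below is not how Claude Code implements it; `run_in_worktree`, the branch names, and the temp paths are hypothetical, and "unchanged" is assumed to mean no new commits and a clean `git status`.

```shell
#!/bin/sh
# Sketch of "create a tree, run a task there, remove the tree if nothing
# changed", using only plain git. run_in_worktree is a hypothetical helper.
set -eu

run_in_worktree() {
    # $1 = branch name, $2 = worktree path, remaining args = command to run there
    branch=$1; tree=$2; shift 2
    git worktree add "$tree" -b "$branch" >/dev/null 2>&1
    start=$(git -C "$tree" rev-parse HEAD)
    ( cd "$tree" && "$@" )
    # Unchanged means: no new commits and no modified or untracked files.
    if [ "$(git -C "$tree" rev-parse HEAD)" = "$start" ] &&
       [ -z "$(git -C "$tree" status --porcelain)" ]; then
        git worktree remove "$tree"
        git branch -q -d "$branch"
        echo "removed unchanged tree: $tree"
    else
        echo "kept tree with changes: $tree"
    fi
}

# Demo in a throwaway repository.
demo=$(mktemp -d)
git init -q "$demo/main"
cd "$demo/main"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"

run_in_worktree survey-a ../survey-a true                      # task that writes nothing
run_in_worktree survey-b ../survey-b sh -c 'echo x > out.txt'  # task that leaves a file
```

A tree that still holds work is never deleted by this helper; disposing of it stays a human decision.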
Codex
Parallel threads natively, and cloud tasks as the managed worktree
The desktop app’s core model is multitasking: it runs parallel
threads, each backed by its own worktree natively — starting a second
thread on a second workstream gives it an isolated checkout without your
asking, because that is the app’s central abstraction rather than a
pattern you assemble. CLI users get the same isolation the ordinary way:
a plain git worktree per thread.
The managed analogue goes one step further. A cloud task runs in an isolated cloud environment — its own container, its own branch — and returns a diff when it finishes. That is a worktree you never have to create, clean, or even keep on your laptop: the isolation is the service’s, and what comes back is reviewable exactly like a pull request. The trade is that the isolation is real but opaque — you review the returned diff, you do not watch the desk — which is the managed-delegation shape throughout.
| Intent | Claude Code | Codex |
|---|---|---|
| two workstreams at once, locally | one session per worktree (desktop: parallel sessions across worktrees) | parallel threads, each its own worktree natively (CLI: git worktree per thread) |
| an agent isolated for a code-touching task | subagent/workflow spawned into a fresh worktree, auto-cleaned if unchanged | a cloud task — isolated container + branch, returns a reviewable diff |
| combining the work | deliberate git merge, human as referee | review the returned diff like a PR, then merge |
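The "combining the work" row can be made concrete in either tool with ordinary git. This is one possible referee routine, not a prescribed one: read what the branch changed, stage the merge without committing, inspect it, then commit. The demo repository, `prep.txt`, and the commit messages are assumptions.

```shell
#!/bin/sh
# Sketch: review first, merge second. Throwaway repository, hypothetical files.
set -eu

demo=$(mktemp -d)
git init -q "$demo/main"
cd "$demo/main"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"
git worktree add ../w1 -b w1-elasticity >/dev/null 2>&1
( cd ../w1 && echo "elasticity prep" > prep.txt &&
  git add prep.txt && git commit -q -m "w1: elasticity prep" )

# 1. Read what the branch changed relative to where it forked from main.
changed=$(git diff --name-only main...w1-elasticity)
echo "w1-elasticity touches: $changed"

# 2. Stage the merge but stop before committing, so the referee can inspect it.
git merge --no-ff --no-commit w1-elasticity >/dev/null 2>&1
git diff --cached --stat

# 3. Only after review: record the merge as a deliberate commit.
git commit -q -m "Merge w1-elasticity (reviewed)"
parents=$(git log -1 --format=%p | wc -w | tr -d ' ')
echo "merge commit has $parents parents"
```

Stopping before the commit gives the referee a point at which `git merge --abort` is still available.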
Worktree discipline for analysis projects
The mechanic is cheap; the discipline is what keeps it honest, and it is the same in either tool:
- Branch by workstream. w1-elasticity, w2-duration: the branch name says which analysis it carries, so the merge referee knows what they are combining before they read a line.
- Know what merges and what never does. Code and specs merge; they are the shared methodology. Scratch outputs do not: a results file is owned by a contract (D4), not reconciled by a git merge. Merging two agents’ results/ is how you get a file that is neither run, and it is the second collision in the Pain vignette.
- Keep results/ out of worktree merges. The D4 results contract governs result files; the git merge governs code. Conflating them re-creates exactly the overwrite you used worktrees to prevent.
- The human is the merge referee. Parallelism is safe only because someone deliberately decides what combines. The tool isolates; you reconcile. Never let a merge happen by collision instead of by decision.
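One way to enforce the results/ rule before a merge is to ask git which paths under results/ a branch has changed since it forked from main. The helper name `results_touched` and the demo files are hypothetical; this is a sketch of a pre-merge check, not part of the D4 contract.

```shell
#!/bin/sh
# Sketch: flag any branch that carries changes under results/ before merging it.
set -eu

# Hypothetical helper: list paths under results/ that branch $1 changed
# since it forked from main. Empty output means nothing to flag.
results_touched() {
    git diff --name-only "main...$1" -- results/
}

# Demo in a throwaway repository.
demo=$(mktemp -d)
git init -q "$demo/main"
cd "$demo/main"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"
git worktree add ../w1 -b w1-elasticity >/dev/null 2>&1
git worktree add ../w2 -b w2-duration  >/dev/null 2>&1

# w1 commits code and, by mistake, a results file.
( cd ../w1 && mkdir -p src results &&
  echo "join" > src/build_panel.py && echo "1,2" > results/run1.csv &&
  git add . && git commit -q -m "w1: code plus a stray result" )

# w2 commits code only.
( cd ../w2 && mkdir -p src &&
  echo "weights" > src/duration.py &&
  git add . && git commit -q -m "w2: code only" )

for b in w1-elasticity w2-duration; do
    hits=$(results_touched "$b")
    if [ -n "$hits" ]; then
        echo "$b: do not merge yet, results/ changed: $hits"
    else
        echo "$b: ok, no results/ changes"
    fi
done
```

Keeping results/ out of version control altogether would be a blunter alternative; whether that suits the project depends on the D4 contract.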
Guided Run (One Repo, Many Hands): a desk per workstream

    git worktree add ../weather-mobility-w1 -b w1-elasticity

Guided Run (One Repo, Many Hands): a thread per workstream

    git worktree add ../weather-mobility-w1 -b w1-elasticity

Field Assignment
Artifact: make check-d2 passes (both branches merged, history linear per workstream, zero collisions)
Run the project’s two workstreams concurrently — and prove they never touched each other.
- Give each workstream its own worktree and branch: w1-elasticity for the demand-elasticity prep, w2-duration for the trip-duration robustness prep.
- Run both at the same time, per your tool below: one agent refining the elasticity prep, one building the duration prep, each rooted in its own tree.
- Merge both branches back to main deliberately, as the referee: code and specs merge; no results/ file is reconciled by the merge.
- Demonstrate zero collisions: each workstream’s history is linear, and no file was overwritten across trees.
Claude Code
Open one claude session per worktree (or spawn a worktree-isolated
agent per workstream). Let them run simultaneously; confirm neither
session can see the other’s working files. Merge w1-elasticity then
w2-duration into main, reviewing each diff as the referee.
Codex
Run the two workstreams as parallel threads (each its own worktree), or
hand one to a cloud task and review its returned diff. Confirm the
threads never share a working directory. Merge both branches into
main, reviewing each diff — the cloud task’s exactly like a PR from a
new student.
make check-d2 verifies both branches merged cleanly and that each
workstream’s history is linear — no cross-tree overwrite, no merge you
did not author. This is the mechanic D3 runs its overnight jobs inside
and D4 fans its fleet across.
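What make check-d2 actually runs is not shown here. As a rough stand-in, the two properties it names can be approximated by hand before merging: no merge commits on either branch since the fork point, and no path changed by both branches. All file and directory names below are hypothetical, and path overlap between branches is only a proxy for "no file was overwritten across trees".

```shell
#!/bin/sh
# Sketch: approximate "linear history, zero collisions" for two workstreams.
# This is an assumed check, not the project's make check-d2.
set -eu

demo=$(mktemp -d)
git init -q "$demo/main"
cd "$demo/main"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"
git worktree add ../w1 -b w1-elasticity >/dev/null 2>&1
git worktree add ../w2 -b w2-duration  >/dev/null 2>&1
( cd ../w1 && mkdir -p src && echo "elasticity" > src/elasticity.py &&
  git add . && git commit -q -m "w1: elasticity prep" )
( cd ../w2 && mkdir -p src && echo "duration" > src/duration.py &&
  git add . && git commit -q -m "w2: duration prep" )

fork=$(git merge-base w1-elasticity w2-duration)

# Linear history: no merge commits on either branch since the fork point.
merges_w1=$(git rev-list --count --merges "$fork..w1-elasticity")
merges_w2=$(git rev-list --count --merges "$fork..w2-duration")

# Disjoint edits: list the paths each branch changed, then intersect the lists.
git diff --name-only "$fork" w1-elasticity | sort > "$demo/w1.files"
git diff --name-only "$fork" w2-duration   | sort > "$demo/w2.files"
overlap=$(comm -12 "$demo/w1.files" "$demo/w2.files")

if [ "$merges_w1" -eq 0 ] && [ "$merges_w2" -eq 0 ] && [ -z "$overlap" ]; then
    echo "zero collisions"
else
    echo "collision risk: merges=$merges_w1/$merges_w2 overlap=$overlap"
fi
```

Run it before merging: once both branches are merged into main, the ranges from the fork point no longer describe each workstream on its own.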
make check-d2 advances D2
- Branch by workstream so the merge referee knows what they're combining.
- Merge code and specs; let the contract own the outputs.
Check each item only once it is true of YOUR repo — the gate is self-certified, like the rest of your methodology.
Pitfalls & Gotchas
- [both] “Two agents in one tree, just this once.” The collision does not happen the afternoon you decide it is fine; it happens the afternoon it matters, on the file you cared about, and you spend the day reconstructing which writer meant what. For a result you intend to publish, an unaudited overwrite is not an inconvenience; it is a number you can no longer explain. The worktree is one command and it is the difference.
- [both] Merging scratch outputs. Results belong to contracts (D4), not to git merges: reconciling two agents’ results/ produces a file that is neither run. Merge code and specs; let the contract own the outputs.
- [CC] Worktree-spawned agents that mutate global state (installed packages, shared caches, a global config) escape the isolation the worktree gave them, because that state lives outside any tree. Keep environments per-worktree (B2’s pinned lockfile, restored inside each tree) so the isolation is real and not just file-deep.
- [CX] Cloud-task isolation is real but opaque: you do not watch the work, you receive a diff. Review that diff like a pull request from a new student (line by line, asking what it touched and why), not like a trusted teammate’s. Opaque isolation only protects you if you read what comes back.
Check Your Bearings
This check opens when the guided simulation above is complete — the questions assume you have seen the run.
Parity note
This is a Tier-2 split: both tools deliver true per-worker isolation, but by different primitives. Claude Code composes it from local pieces — a session or a spawned agent per git worktree, with auto-cleanup of unchanged trees — the assemble-it-yourself shape. Codex makes parallel worktree-backed threads the desktop app’s native model and offers cloud tasks as a managed worktree you never create, clean, or hold locally, returning a reviewable diff — the managed-delegation shape. The underlying git worktree is identical and available to both via the CLI; the asymmetry is in how much the tool manages for you, and it is the same design philosophy that runs through D3 and D4.