Unit E — Autopilot

Positions to absorb

E1 the reproducibility checker
E2 the lab manager's standing chores
E3 the onboarding the lab never has to repeat

A lighthouse is a lab that runs itself: the mechanism turns, the lamp burns, and the keeper’s real work is the schedule and the logbook. Unit E makes the project self-maintaining. The agent becomes a callable function — headless, structured output, no human at the keyboard — and a fresh clone proves the analysis rebuilds itself (E1); the monthly TLC drop arrives on a real cadence and ingests on one (E2); and the lab’s whole methodology ships as an installable kit, smoke-tested in an empty repo (E3). The Lighthouse is where your attention stops being the moving part.
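As a rough illustration of what "the agent becomes a callable function" can mean in practice, the sketch below wraps a non-interactive CLI call in a Python function that either returns parsed JSON or raises. The function name `run_headless`, the `STUB_AGENT` stand-in, and the convention that the prompt is passed as the final argument are all assumptions made for this example, not the course's actual wiring. Substitute your agent CLI's documented non-interactive invocation and output flags.

```python
import json
import subprocess
import sys
from typing import Any, Sequence


def run_headless(
    command: Sequence[str], prompt: str, timeout_s: float = 600.0
) -> dict[str, Any]:
    """Run an agent CLI non-interactively and return its JSON output as a dict.

    `command` is the CLI invocation without the prompt (hypothetical
    convention: the prompt is appended as the final argument).
    """
    completed = subprocess.run(
        [*command, prompt],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,
    )
    # Fail loudly on a non-zero exit so a CI gate cannot pass by accident.
    if completed.returncode != 0:
        raise RuntimeError(
            f"agent exited {completed.returncode}: {completed.stderr.strip()}"
        )
    try:
        payload = json.loads(completed.stdout)
    except json.JSONDecodeError as exc:
        raise ValueError("agent output was not valid JSON") from exc
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    return payload


# Stand-in "agent" so the sketch runs anywhere: a Python one-liner that
# echoes its prompt back inside a JSON object. Replace with a real CLI.
STUB_AGENT = [
    sys.executable,
    "-c",
    "import json, sys; print(json.dumps({'prompt': sys.argv[1], 'ok': True}))",
]

if __name__ == "__main__":
    print(run_headless(STUB_AGENT, "Does the panel rebuild?"))
```

Raising on a non-zero exit or malformed output is the design choice that matters for CI: a gate should fail loudly rather than pass on output it could not read.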

Begin at E1 — Agents as Functions: Headless & CI →

The Lab Roster

Engraved positions, not portraits. A seat fills itself when its lesson is complete.

The expedition so far

Positions

the data manager

Position vacant — engaged at C2

write-time contract hooks (PreToolUse/PostToolUse + the validation suite)

est. human-RA: permanent vigilance — est. 2 weeks/year of load-checking and release-note reading
agent: half a day to install and test the 9-line block; ~20 s per run thereafter

the methodologist

Position vacant — engaged at C1

the researcher skill library v1 (/clean-trips, /paper-summary, /demanding-adviser) — codified methodology, not macros

est. human-RA: the judgment lives in one head; transferring it to a new RA costs weeks of shadowing, and leaves when they do
agent: an afternoon to author three SKILL.md files in both dialects; zero cost per session until invoked

the data engineer

Position vacant — engaged at C3

MCP connections + the DuckDB warehouse, enrichment joins (weather/events/holidays), and the zone-hour analysis panel

est. human-RA: days of bespoke glue per source — credentials, retries, schema spelunking, timezone forensics — re-debugged every time a source changes
agent: register the server once; the agent explores INFORMATION_SCHEMA and builds the panel in a guided session, raw cached for replication

the RA pool

Position vacant — engaged at D1

parallel subagents with report contracts (EDA + scholarship fleets) + the isolated adviser

est. human-RA: a week of breadth EDA across boroughs and slices, plus a literature pass — and no honest outside critic you can summon at will
agent: ~20 min to write the agent definition + report contract; the fleet runs in parallel; the isolated adviser critiques in minutes

the overnight RA

Position vacant — engaged at D3

/loop supervision + Goal Mode runs over background estimation

est. human-RA: one night shift per estimation batch — and the course runs several batches
agent: ~10 min to write the check or the objective; the night itself belongs to the machine

the adviser

Position vacant — engaged at D1

parallel subagents with report contracts (EDA + scholarship fleets) + the isolated adviser

est. human-RA: a week of breadth EDA across boroughs and slices, plus a literature pass — and no honest outside critic you can summon at will
agent: ~20 min to write the agent definition + report contract; the fleet runs in parallel; the isolated adviser critiques in minutes

the referee

Position vacant — engaged at D4

contracted fleet fan-out (results contract + provenance) and an isolated adversarial referee

est. human-RA: the curve is ~2 days of serialized edit-and-fit; the suspicious read of the robustness table is the rarer, senior hour nobody has time for
agent: 13 lanes fanned out under the cap finish in an afternoon; the referee files its evidenced finding in one isolated pass

the lab manager

Position vacant — engaged at E2

scheduled/cloud agents — the monthly-ingest routine, stopping at a human-approved PR

est. human-RA: a recurring monthly chore nobody owns — check the CDN, pull, contract, append, re-estimate — reliably skipped
agent: ~30 min to define the routine + guardrails once; each month runs unattended and stops at the approval gate

the reproducibility checker

Position vacant — engaged at E1

headless invocation + the fresh-clone replication self-test + CI gates

est. human-RA: a clean-room rebuild every few weeks — dull, exacting, and the first thing dropped at submission
agent: ~20 min to wire scripts/replicate.sh and the gate workflow; the verdict returns in one headless run thereafter

the wall — the unstaffed midnight hours between a raw file and a first plot

Position vacant — engaged at A1

the bare agent loop (prompt → act → observe → fix), zero configuration

est. human-RA: an evening or two per messy file — defensive parsing rewritten from scratch each project, rules forgotten by the time they work
agent: ~10 minutes for the quick win, plus the same task re-run in the other language for free

you, working an order of magnitude faster — but only if you direct the work

Position vacant — engaged at A2

the command surface + five prompting patterns + context hygiene

est. human-RA: the slow tax of an undriven session — drifted answers on long investigations, re-runs to find where it went wrong
agent: ~30 min to learn; thereafter a first-look on one month (3.5M rows) in minutes, with receipts

the lab manual nobody writes — the institutional knowledge that lives in your head

Position vacant — engaged at B1

instruction files (CLAUDE.md / AGENTS.md) + auto-memory + the A/B demonstration

est. human-RA: ~30 min re-onboarding every new RA, every time — plus the afternoons lost to landmines no one wrote down
agent: written once in an hour; reloaded free at the start of every session thereafter

the careful senior who plans before touching data

Position vacant — engaged at B2

repo scaffold + pinned environments + read-only Plan mode reconnaissance

est. human-RA: ~1 week at project start (setup, download babysitting, plan review) + the joins redone when structure rots
agent: an afternoon — most of it download wall-clock, not attention

the lab whose members don't overwrite each other

Position vacant — engaged at D2

git worktrees — one isolated checkout per agent/session/thread, combined through a deliberate merge

est. human-RA: the lost afternoon disentangling two agents' colliding edits — and the redo when you reconstruct it wrong the first time
agent: two commands to create the worktrees; the parallelism runs free; one reviewed merge at the end

the onboarding the lab never has to repeat

Position vacant — engaged at E3

lab-kit — the whole methodology packaged as a one-command install

est. human-RA: six weeks of per-member onboarding, rediscovered from scratch every time the lab turns over
agent: ~half a day to package and smoke-test the kit once; each new member is one install and one prompt

the whole lab, orchestrated — the PI who designs the system instead of doing the work

Position vacant — engaged at F1

the research loop (/loop ↔ Goal Mode / @codex) orchestrating fleet → referee → headless re-run → regenerated report, under report-don't-act guardrails, a hard budget cap, and a human gate on substantive decisions only

est. human-RA: each revision is a serialized chain — re-spec, re-estimate, re-table, rewrite the paragraph, re-read the abstract — correct only as of the last manual pass, on a Sunday; a real reviewer round is days of hand-carried edits
agent: the loop runs two iterations to convergence in one supervised sitting; the human stands at exactly one gate (approve dropping the post-treatment control) while the mechanical fixes proceed unattended
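The fresh-clone self-test in the reproducibility checker's entry above comes down to a comparison: rebuild the outputs in a clean checkout, then check them against a reference. A minimal sketch of that comparison follows, assuming outputs are files that should match byte for byte. The function names and the two-directory layout are hypothetical and do not describe the contents of scripts/replicate.sh. Outputs that embed timestamps or floating-point noise would need a tolerance-based comparison instead.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> dict[str, str]:
    """Map each file's path relative to `root` to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def replication_diff(reference: Path, rebuilt: Path) -> dict[str, list[str]]:
    """List files that are missing, unexpected, or changed in the rebuild."""
    ref, new = manifest(reference), manifest(rebuilt)
    return {
        "missing": sorted(ref.keys() - new.keys()),
        "unexpected": sorted(new.keys() - ref.keys()),
        "changed": sorted(k for k in ref.keys() & new.keys() if ref[k] != new[k]),
    }


def replicates(reference: Path, rebuilt: Path) -> bool:
    """True only when the rebuilt outputs match the reference exactly."""
    return not any(replication_diff(reference, rebuilt).values())
```

A CI gate could call `replicates(...)` after the rebuild step and exit non-zero on `False`, printing the diff so the failure log says which artifact drifted.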

Running Totals

Lesson	Role	Est. human-RA	Agent (yours when measured)
A1	the wall — the unstaffed midnight hours between a raw file and a first plot	an evening or two per messy file — defensive parsing rewritten from scratch each project, rules forgotten by the time they work	~10 minutes for the quick win, plus the same task re-run in the other language for free
A2	you, working an order of magnitude faster — but only if you direct the work	the slow tax of an undriven session — drifted answers on long investigations, re-runs to find where it went wrong	~30 min to learn; thereafter a first-look on one month (3.5M rows) in minutes, with receipts
B1	the lab manual nobody writes — the institutional knowledge that lives in your head	~30 min re-onboarding every new RA, every time — plus the afternoons lost to landmines no one wrote down	written once in an hour; reloaded free at the start of every session thereafter
B2	careful senior who plans before touching data	~1 week at project start (setup, download babysitting, plan review) + the joins redone when structure rots	an afternoon — most of it download wall-clock, not attention
B3	the data manager who guards the raw files — the person who says no near the master copies	permanent vigilance you cannot staff — one lapse at machine speed costs a month of re-downloads	two profiles configured once in minutes; the fence then holds every session, tired or not
C1	the methodologist — the one person who knows how the lab actually decides	the judgment lives in one head; transferring it to a new RA costs weeks of shadowing, and leaves when they do	an afternoon to author three SKILL.md files in both dialects; zero cost per session until invoked
C2	data manager / QA who never sleeps	permanent vigilance — est. 2 weeks/year of load-checking and release-note reading	half a day to install and test the 9-line block; ~20 s per run thereafter
C3	the data engineer who wires the lab to its systems	days of bespoke glue per source — credentials, retries, schema spelunking, timezone forensics — re-debugged every time a source changes	register the server once; the agent explores INFORMATION_SCHEMA and builds the panel in a guided session, raw cached for replication
D1	the RA pool — and the adviser who critiques from outside	a week of breadth EDA across boroughs and slices, plus a literature pass — and no honest outside critic you can summon at will	~20 min to write the agent definition + report contract; the fleet runs in parallel; the isolated adviser critiques in minutes
D2	the lab whose members don't overwrite each other	the lost afternoon disentangling two agents' colliding edits — and the redo when you reconstruct it wrong the first time	two commands to create the worktrees; the parallelism runs free; one reviewed merge at the end
D3	overnight RA	one night shift per estimation batch — and the course runs several batches	~10 min to write the check or the objective; the night itself belongs to the machine
D4	an RA bench and the PI who keeps their results comparable	the curve is ~2 days of serialized edit-and-fit; the suspicious read of the robustness table is the rarer, senior hour nobody has time for	13 lanes fanned out under the cap finish in an afternoon; the referee files its evidenced finding in one isolated pass
E1	reproducibility checker	a clean-room rebuild every few weeks — dull, exacting, and the first thing dropped at submission	~20 min to wire scripts/replicate.sh and the gate workflow; the verdict returns in one headless run thereafter
E2	lab manager's standing chores	a recurring monthly chore nobody owns — check the CDN, pull, contract, append, re-estimate — reliably skipped	~30 min to define the routine + guardrails once; each month runs unattended and stops at the approval gate
E3	the onboarding the lab never has to repeat	six weeks of per-member onboarding, rediscovered from scratch every time the lab turns over	~half a day to package and smoke-test the kit once; each new member is one install and one prompt
F1	the whole lab, orchestrated — the PI who designs the system instead of doing the work	each revision is a serialized chain — re-spec, re-estimate, re-table, rewrite the paragraph, re-read the abstract — correct only as of the last manual pass, on a Sunday; a real reviewer round is days of hand-carried edits	the loop runs two iterations to convergence in one supervised sitting; the human stands at exactly one gate (approve dropping the post-treatment control) while the mechanical fixes proceed unattended
Positions absorbed		0 of 16

The honest column: every place a human had to step in lives in the Field Journal’s failure log. Your measured hours there override these estimates here.