Feature-parity matrix
The single, dated source of truth for cross-tool claims. Every lesson links here instead of restating parity. Tiers follow the parallel-coverage policy; the cheat sheet is the quick-reference dialect view of the same data. Version-sensitive claims carry an as-of date and are revisited quarterly.
Tier 1 — Parity
Both tools implement the same concept with different syntax. In lessons these are tabbed panels: one shared why/when narrative, then [Claude Code] [Codex] tabs.
| Feature | Claude Code | Codex | Taught in | As of |
|---|---|---|---|---|
| built-in slash commands Everyday interaction verbs. Context management splits as `/clear` (reset) vs `/compact` (summarize). Taught in A2. | /model /clear /compact /rewind /permissions | /model /approvals /skills /agent | A2 | |
| CI integration Both run in GitHub Actions and gate merges. Codex additionally reviews via an `@codex` mention. Taught in E1; refresh routine in E2. | GitHub Actions (claude-code-action) | Codex GitHub Action · @codex on PRs/issues | E1 | |
| concurrency control Both bound parallel-agent fan-out; defaults differ. Tuned in D4 (specification curve fleet). | per-workflow cap (≤16) | [agents] max_threads (default 6) | D4 | |
| frontier models Model selection + effort intro in A2; cost/economics revisited in F1. Version-sensitive: the model line moves fast. Watch this space: model tiers are the fastest-moving row here; treat names as of 2026-06 and recheck quarterly. | Claude (Fable / Opus / Sonnet / Haiku tiers) + effort levels | GPT-5.3-Codex / 5.4 / 5.5 + effort levels | A2 | |
| headless run Both expose a one-shot, scriptable, JSON-emitting run for CI and cron. Taught in E1 (headless rerun + gates). | claude -p "..." --output-format json | codex exec "..." --json | E1 | |
| hook events Event names are shared; the same research-native recipes (raw-write block, validation suites, row-count bounds) register on either tool. Watch this space: Codex's event list is still catching up to CC's full set; recheck quarterly. | PreToolUse, PostToolUse, SessionStart, SubagentStop, Stop, … | same vocabulary: PreToolUse, PostToolUse, SessionStart, SubagentStop, Stop, PreCompact, … | C2 | |
| hooks Both tools have stable hooks; CC adds agent-based judgment gates. Watch this space: Codex hook events are still gaining parity with CC's full list; recheck quarterly. | hooks block in settings.json: PreToolUse, PostToolUse, SessionStart, SubagentStop, Stop, … | hooks.json or [hooks] in config.toml (stable v0.124+), same event vocabulary | C2 | |
| install & auth The day-one onboarding is symmetric: both prompt before the first filesystem/network action. Taught in A1. | install + auth, approval prompts on first tool use | install + auth, approval prompts on first tool use | A1 | |
| instruction files Both load a project memory file at session start; CC adds auto-memory, while Codex enforces a 32 KiB budget across its hierarchy. | CLAUDE.md (project + user scopes) · auto-memory · path-scoped .claude/rules/ | AGENTS.md hierarchy (global ~/.codex/, repo root, nested dirs) · AGENTS.override.md · 32 KiB budget | B1 | |
| MCP registration Same Model Context Protocol; DuckDB / database servers register identically. Cache-raw-then-derive is a shared pattern. | .mcp.json / claude mcp add | [mcp_servers] in config.toml / codex mcp add | C3 | |
| plugins & marketplaces Both distribute reusable extensions through marketplaces. Taught in E3 (lab kit / publishing) — the expedition continues past launch. | plugins + marketplaces | plugins + marketplaces (agentskills.io publishing) | E3 | |
| safety modes Both pair an approval/permission policy with OS-level sandboxing. Named risk profiles (exploration vs pipeline) guard everything after B3. | permission modes + rules + OS sandbox | approval modes + OS sandbox (Seatbelt / Landlock) | B3 | |
| SDK Programmatic embedding of the agent. CC ships Python and TypeScript SDKs; Codex is TypeScript-first. Used in E1 automation. Watch this space: a first-party Python Codex SDK has not shipped as of 2026-06; recheck quarterly. | Agent SDK (Python + TypeScript) | Codex SDK (TypeScript) | E1 | |
| settings Same layered-override idea, different format (JSON vs TOML). Codex profiles map onto CC's named permission modes. | .claude/settings.json (user / local / managed layers) | ~/.codex/config.toml (+ named profiles, layered .codex/ team config) | B3 | |
| skills SKILL.md anatomy is shared; the agentskills.io standard means a skill authored for one tool largely ports to the other. | .claude/skills/<name>/SKILL.md — invoke /name | .agents/skills/<name>/SKILL.md — invoke $name (agentskills.io standard, $skill-creator) | C1 | |
| subagent definitions Both define named subagents with report-back contracts; the critic-isolation pattern is tool-agnostic. D1 teaches the parallel fan-out. | .claude/agents/<name>.md (markdown + frontmatter) · built-ins Explore / Plan / general-purpose | .codex/agents/<name>.toml (TOML) · built-ins default / worker / explorer | D1 | |
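
The Tier 1 rows lend themselves to a small lookup, since the concept is shared and only the spelling differs. The sketch below encodes two rows (headless run and skills) using only the strings shown in the table. The `Dialect` class, the tool keys, and the helper names are hypothetical, introduced here for illustration; the sketch builds command lines but does not run either CLI or assume anything about the shape of its JSON output.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Dialect:
    """One tool's spelling of two Tier 1 rows (values copied from the table)."""
    headless_prefix: Tuple[str, ...]  # argv that precedes the prompt
    headless_suffix: Tuple[str, ...]  # argv that follows the prompt
    skills_root: str                  # directory holding <name>/SKILL.md
    invoke_sigil: str                 # character that prefixes a skill name


# Hypothetical keys; the argv pieces and paths come from the matrix rows.
DIALECTS = {
    "claude-code": Dialect(("claude", "-p"), ("--output-format", "json"),
                           ".claude/skills", "/"),
    "codex": Dialect(("codex", "exec"), ("--json",),
                     ".agents/skills", "$"),
}


def headless_argv(tool: str, prompt: str) -> List[str]:
    """Build the one-shot, JSON-emitting command line for a tool."""
    d = DIALECTS[tool]
    return [*d.headless_prefix, prompt, *d.headless_suffix]


def skill_path(tool: str, name: str) -> str:
    """Where a skill's SKILL.md lives for a tool."""
    return f"{DIALECTS[tool].skills_root}/{name}/SKILL.md"


def skill_invocation(tool: str, name: str) -> str:
    """How the skill is invoked in a session (/name vs $name)."""
    return f"{DIALECTS[tool].invoke_sigil}{name}"
```

Passing the list from `headless_argv` to `subprocess.run` avoids shell quoting problems with the prompt; whether the installed CLI version accepts these flags is worth checking against the as-of date.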
Tier 2 — Same job, different primitive
Both tools solve the problem with non-isomorphic mechanisms — each gets a full native treatment, followed by a short translation guide. Claude Code favors composable local primitives; Codex favors managed cloud delegation. The asymmetry itself is the lesson.
| Feature | Claude Code | Codex | Taught in | As of |
|---|---|---|---|---|
| automatic-research orchestration Same job: a research system that runs itself (estimate, referee, triage, revise in isolation, re-run headlessly, regenerate the report, repeat to convergence) under 'report, don't act' guardrails. CC composes the loop from local primitives (`/loop` + subagent referee + worktrees + headless reruns); Codex frames it as Goal Mode driving cloud tasks with best-of-N. The capstone (F1) is exactly this composition. Translation: a CC `/loop` orchestrating worktree reruns ≈ a Codex Goal Mode driving parallel cloud tasks. Watch this space: end-to-end self-improving research loops are bleeding-edge on both sides as of 2026-06; recheck quarterly. | /loop orchestration over subagents + worktrees + headless reruns (compose A–E locally) | Goal Mode + cloud tasks + best-of-N (managed cloud orchestration) | F1 | |
| image input Both read images (plots, paper screenshots). Codex extends to live-UI capture; CC reads static images. Everyday use in A2; the figure loop in D1. Translation: CC + Playwright MCP screenshots ≈ Codex Appshots. | paste plots / paper screenshots into the prompt | image input + in-app browser / computer use / Appshots | A2 | |
| long-running supervision Same job: keep an agent working toward a goal over a long horizon. CC composes a recurring prompt; Codex runs an objective-framed Goal Mode. Translation: frame a `/loop` prompt as an objective and it approximates Goal Mode. First taught in D3 (overnight work); the F1 capstone rescopes `/loop` as the orchestration loop that drives the whole automatic-research system to convergence. | /loop (recurring self-paced prompt) + background agents | Goal Mode (objective-driven multi-hour run) | D3, F1 | |
| massively parallel fan-out Same job — fan one task across many inputs (the D4 specification curve). CC scripts a JS pipeline over a manifest; Codex spawns agents from a CSV and picks best-of-N in the cloud. Translation: a Workflow `pipeline()` over a manifest ≈ `spawn_agents_on_csv`. | Dynamic Workflows (pipeline() JS orchestration) + ultracode | spawn_agents_on_csv + cloud best-of-N | D4 | |
| parallel-agent isolation Same job — run agents in parallel without collision — but CC favors composable local worktrees while Codex favors managed cloud isolation. The asymmetry IS the lesson (D2). Translation: a CC worktree per workstream ≈ a Codex cloud task per workstream. | git worktrees + worktree-spawned agents (composable local primitive) | desktop-app parallel threads (native worktrees) + isolated cloud tasks (managed delegation) | D2 | |
| scheduled autonomous work Same job — fire an agent on a schedule or event, with the 'report, don't act' guardrail. Taught in E2 (data-refresh routine). Translation: a CC Routine on cron ≈ a scheduled Codex cloud task. | Routines (cloud; cron / GitHub / API triggers) | Codex cloud tasks + GitHub integration | E2 | |
| see a UI / web page Same job — let the agent observe a rendered page. CC drives a Playwright MCP server; Codex has native computer use. Taught alongside MCP in C3 and the figure loop D1. | Playwright MCP screenshots | computer use · in-app browser · Appshots | C3 | |
| sessions, resume & fork Same job: branch and revisit prior state, with a different primitive. CC has native checkpoints; Codex leans on conversation forks + git. Translation: a CC checkpoint ≈ a Codex fork point you can git-reset back to. | sessions / resume / fork · checkpoints + /rewind | conversation forking + git · codex exec resume | B2 | |
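
The "worktree per workstream" translation in the parallel-agent isolation row can be made concrete with a few lines. This is a minimal sketch, assuming one branch and one directory per workstream; the `ws/` branch prefix, the `../worktrees` base directory, and the workstream names are hypothetical. It only builds `git worktree add` command lines and leaves running them, and launching an agent inside each directory, to the caller.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def worktree_commands(workstreams: Iterable[str],
                      base_dir: str = "../worktrees") -> List[List[str]]:
    """Return one `git worktree add` command per workstream.

    Each workstream gets its own branch and its own checkout directory,
    so agents working in parallel never edit the same working tree.
    """
    commands = []
    for name in workstreams:
        path = PurePosixPath(base_dir) / name  # hypothetical layout
        branch = f"ws/{name}"                  # hypothetical naming scheme
        commands.append(["git", "worktree", "add", "-b", branch, str(path)])
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands(["robustness", "figures"]):
        print(" ".join(cmd))
```

On the Codex side the matrix maps the same unit of isolation to a cloud task per workstream, so the list of workstream names is the part that carries over between tools.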
Tier 3 — Genuinely one-sided
The owning tool gets the page; the other gets a clearly labeled nearest-equivalent — never a fake tab implying parity. Vendors converge quickly, so each row carries a watch-this-space note.
| Feature | Claude Code | Codex | Owner | Taught in | As of |
|---|---|---|---|---|---|
| agent teams CC's experimental agent teams coordinate roles directly; Codex approximates with several cloud tasks over a shared repo. Frontier sidebar in F1; the expedition continues. Watch this space: experimental on the CC side as of 2026-06; recheck quarterly. | agent teams (experimental): coordinated multi-agent roles | nearest equivalent: multiple cloud tasks + shared repo | Claude Code | F1 | |
| agent-based hook gates Beyond shared command hooks (Tier 1), CC can run an LLM-judge as a hook. Codex approximates the judgment with command gates plus its Automatic Review Agent. Taught in C2. Watch this space: no in-hook LLM-judge primitive in Codex as of 2026-06; recheck quarterly. | prompt / agent-based hooks: an LLM-judge gate inside the hook | nearest equivalent: command gates + Automatic Review Agent | Claude Code | C2 | |
| auto-memory CC can self-maintain project memory. Codex relies on deliberate AGENTS.md edits within its 32 KiB budget. Taught alongside instruction files in B1. Watch this space: no automatic AGENTS.md self-update in Codex as of 2026-06; recheck quarterly. | auto-memory: the agent updates CLAUDE.md from the conversation | nearest equivalent: deliberate AGENTS.md updates (you / the agent edit it explicitly) | Claude Code | B1 | |
| automatic PR review Codex ships a first-class reviewer you summon with an `@codex` mention (and its Automatic Review Agent runs on every PR). CC reaches the same outcome by wiring a review subagent or an agent-based judgment gate into its GitHub Action. Taught in E1 (headless reruns + merge gates). Translation: an `@codex` PR review ≈ a CC review subagent invoked from claude-code-action. Watch this space: no first-party `@mention` PR-reviewer bot on the CC side as of 2026-06; CC composes one from the Action + a subagent. Recheck quarterly. | nearest equivalent: a review subagent / agent-based gate inside the GitHub Action (claude-code-action) | @codex on a PR/issue · the Automatic Review Agent reviews every PR | Codex | E1 | |
| computer use / Appshots Codex natively controls a GUI / browser. CC reaches the same outcomes through a Playwright MCP server. Taught in C3 (data access) and D1. Watch this space: CC has no native computer use as of 2026-06 and relies on MCP; recheck quarterly. | nearest equivalent: Playwright MCP (drive + screenshot a browser) | in-app browser · computer use · Appshots (native GUI control) | Codex | C3 | |
| Linear / app integrations Codex ships native integrations (e.g. Linear). CC connects the same apps through MCP servers. Mentioned in E2. Watch this space: integration catalogs on both sides move fast; recheck quarterly. | nearest equivalent: MCP servers (connect the app via MCP) | first-party Linear + app integrations | Codex | E2 | |
| NotebookEdit + chart outputs CC edits notebooks cell-by-cell and reads rendered chart outputs. Codex works the notebook as a script (jupytext) and reads exported chart images. Taught in D1 (figure loop). Watch this space: no native Codex notebook-cell editor as of 2026-06; recheck quarterly. | NotebookEdit: edits .ipynb cells directly and reads chart outputs | nearest equivalent: jupytext / script workflow + image input | Claude Code | D1 | |
| output styles CC has a named output-style switch; Codex achieves the same with explicit style directives in AGENTS.md. One-line mention in F1 (report voice). Watch this space: no named output-style switch in Codex as of 2026-06; recheck quarterly. | output styles: switch the agent's writing voice/format | nearest equivalent: style directives in AGENTS.md | Claude Code | F1 | |
| Plan mode CC has a first-class read-only planning mode. Codex approximates it with a read-only approval profile plus its explorer subagent, not a fake tab. Taught in B2. Watch this space: Codex has no dedicated Plan mode as of 2026-06; vendors converge fast, so recheck quarterly. | Plan mode + built-in Explore / Plan subagents | nearest equivalent: read-only approval profile + explorer subagent | Claude Code | B2 | |
| spawn_agents_on_csv Codex spawns a fleet straight from a CSV. CC expresses the same fan-out as a Workflow pipeline over a manifest. Both drive the D4 specification curve. Watch this space: `spawn_agents_on_csv` is experimental as of 2026-06; recheck quarterly. | nearest equivalent: Dynamic Workflow pipeline() over a manifest | spawn_agents_on_csv (experimental): one agent per CSV row, cloud best-of-N | Codex | D4 | |
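
The shape shared by the last row's two columns, one unit of work per CSV row run under a concurrency cap, can be sketched without either tool. The code below is a tool-agnostic stand-in, not the behaviour of `spawn_agents_on_csv` or of a Workflow `pipeline()`: `load_manifest`, `run_one`, and `fan_out` are hypothetical names, the column names are invented, and `run_one` only formats a string where a real fleet would launch an agent. The default cap of 6 borrows the Codex `max_threads` default quoted in the Tier 1 concurrency row.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def load_manifest(csv_text: str) -> List[Dict[str, str]]:
    """Turn CSV text into a manifest: one dict (one task) per data row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def run_one(task: Dict[str, str]) -> str:
    """Stand-in for launching one agent; here it just names the specification."""
    return f"{task['outcome']}~{task['controls']}"


def fan_out(manifest: List[Dict[str, str]], max_workers: int = 6) -> List[str]:
    """Run every task with at most `max_workers` in flight, keeping row order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, manifest))


if __name__ == "__main__":
    specs = "outcome,controls\nwage,age\nwage,age+educ\n"
    print(fan_out(load_manifest(specs)))
```

Keeping the manifest as plain rows means the same file can feed either tool's fan-out; only the launcher behind `run_one` would change.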
The expedition continues
Lessons light up their chips as they ship. Units A–D are live, so every row tagged to them deep-links to the lesson that teaches it. Rows tagged to Units E–F are already dated and tracked here so the matrix is complete and honest — their chips are drawn faint until each lesson's route exists, then become live links automatically (the gate reads published state, never a hardcoded range).
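The chip rule in the paragraph above (read published state, never a hardcoded range) reduces to a set lookup. This is a minimal sketch; `chip_state`, the state names `"live"` and `"faint"`, and the example lesson set are assumptions made for illustration, not the site's actual gate.

```python
from typing import Iterable, Set


def chip_state(lesson_id: str, published: Set[str]) -> str:
    """Return "live" when the lesson's route is published, else "faint".

    The decision depends only on the published set, so a newly shipped
    lesson lights up without any change to this function.
    """
    return "live" if lesson_id in published else "faint"


def chip_states(lesson_ids: Iterable[str], published: Set[str]) -> dict:
    """Map each lesson id tagged on a row to its chip state."""
    return {lesson: chip_state(lesson, published) for lesson in lesson_ids}


if __name__ == "__main__":
    published = {"A1", "A2", "B1", "D3"}  # hypothetical published routes
    print(chip_states(["D3", "F1"], published))
```

A range check such as "units A through D" would be the hardcoded alternative the paragraph rules out, because it has to be edited every time a unit ships.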
Every claim above is dated YYYY-MM and revisited quarterly via the
automated parity-refresh issue. Model lines, experimental features, and
version-gated hooks move fastest — trust the date, not your memory.
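
A quarterly recheck over YYYY-MM dates is simple month arithmetic. The sketch below assumes "quarterly" means a claim is due once three or more calendar months have passed since its as-of month; that threshold and the function names are assumptions, not the behaviour of the parity-refresh issue itself.

```python
from datetime import date


def months_since(as_of: str, today: date) -> int:
    """Whole calendar months between an as-of stamp (YYYY-MM) and today."""
    year, month = (int(part) for part in as_of.split("-"))
    return (today.year - year) * 12 + (today.month - month)


def needs_recheck(as_of: str, today: date, max_age_months: int = 3) -> bool:
    """True when a dated claim is at least `max_age_months` old."""
    return months_since(as_of, today) >= max_age_months


if __name__ == "__main__":
    # A claim stamped 2026-06 comes due in September 2026 under this rule.
    print(needs_recheck("2026-06", date(2026, 9, 1)))
```

Rows with an empty As of cell have no stamp to parse, so a real check would need to treat them as due immediately or report them separately.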