Appendix C. Sources and Further Reading
This appendix exists because every claim in this manual deserves a verifiable source if you choose to chase it down. I have organized the entries by claim, not by source, so you can map back from a passage in the body to the evidence behind it. Entries are grouped by category (studies, named incidents, vulnerabilities with patch versions, tool documentation, marketplaces, memory primitive sources) and each entry follows the same shape: the claim, the source, where in the manual it is used, and any caveat worth knowing.
Studies and research
- Claim
- Experienced open-source developers using AI assistance on familiar repositories were 19% slower than the same developers without it, while predicting beforehand they would be 24% faster - a 43-point gap between expected speedup and measured slowdown that persisted in their self-reports even after the data contradicted it.
- Source
- Becker et al., METR, "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity," July 10, 2025. arXiv: arxiv.org/abs/2507.09089. Writeup: metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/.
- Where used
- Chapter 4 (From generating code to shipping software).
- Caveat
- Tested raw AI assistance (Cursor + Claude) without a formulation-discipline variable. The interpretation that workflow discipline is the missing variable is mine, not the study's.
Named incidents
- Claim
- On April 24, 2026, PocketOS lost its production database in nine seconds when a Cursor agent powered by Claude Opus 4.6 invoked Railway's Volume Delete via a found API token during a credentials-mismatch recovery attempt. Backups stored on the same volume were destroyed with the primary data.
- Source
- Reported by DevOps.com ("When AI Goes Really, Really Wrong"), Business Insider (Jer Crane statement), and others. Anthropic's Claude Opus 4.6 system card (February 2026) describes the model that powered the agent.
- Where used
- Prologue (Nine seconds) and Chapter 3 (Governance in layers).
- Caveat
- Recovery timeline differs across public accounts - Railway's restore was reportedly ~30 minutes after Crane contacted them, while other accounts describe ~30 hours or two days for full operational restoration. I use the incident for the governance pattern, not as a precise forensic reconstruction.
- Claim
- In March 2026, Alexey Grigorev at DataTalks.Club lost two and a half years of course infrastructure when Claude Code worked against an incomplete Terraform state file, created duplicate resources where real ones existed, and ran destructive commands when the duplicates collided.
- Source
- Public account by Alexey Grigorev (DataTalks.Club), March 2026.
- Where used
- Chapter 3 (Governance in layers).
- Caveat
- Data loss was partial; recovery took weeks. The incident is documented publicly but with less coverage than PocketOS.
- Claim
- Anthropic published a technical post-mortem on April 23, 2026 acknowledging three product regressions that collectively broke Claude Code for complex engineering work between February 9 and March 26, 2026: adaptive thinking by default (Feb 9), default effort dropped from high to medium (March 3), and a caching bug in reasoning history retention (March 26). An AMD senior director's analysis of 6,852 Claude Code sessions and 234,760 tool calls showed the model shifting from research-first to edit-first behavior as thinking redaction rolled from 1.5% to 100% of turns.
- Source
- Anthropic technical post-mortem, April 23, 2026. AMD analysis published separately.
- Where used
- Chapter 4 (From generating code to shipping software).
- Caveat
- Independent analyses of code-quality degradation were less rigorous than the post-mortem; treat the magnitude as approximate.
Vulnerabilities with patch versions
- Claim
- Claude Code was vulnerable to remote code execution via untrusted project files: malicious `.mcp.json` or `.claude/settings.json` files in untrusted repos could execute hooks before the trust dialog, enabling RCE.
- Source
- Check Point Research, February 2026. CVE-2025-59536. NVD: nvd.nist.gov/vuln/detail/CVE-2025-59536. Writeup: research.checkpoint.com/2026/rce-and-api-token-exfiltration-through-claude-code-project-files-cve-2025-59536/.
- Where used
- Chapter 3 (Governance in layers); referenced in Chapter 10 (Adoption, security committee scene).
- Caveat
- Patched in Claude Code v1.0.111. Versions earlier than the patch remain vulnerable; the class survives even after the specific patch.
- Claim
- Claude Code was vulnerable to API-key exfiltration via configuration injection: attacker-controlled settings overriding `ANTHROPIC_BASE_URL` before the trust prompt could leak API keys.
- Source
- Check Point Research, February 2026. CVE-2026-21852.
- Where used
- Chapter 3 (Governance in layers).
- Caveat
- Patched in Claude Code v2.0.65. Same class as CVE-2025-59536: pre-trust execution of untrusted project configuration.
- Claim
- Claude Code automatically loads `.env*` files in the working directory at session start without explicit user permission, exposing secrets to the agent's context.
- Source
- Knostic, December 2025. Blog: knostic.ai/blog/claude-loads-secrets-without-permission.
- Where used
- Chapter 3 (Governance in layers), named in the dot-env auto-loading vulnerability class.
- Caveat
- Mitigation is sandbox `denyRead` of the `.env*` patterns rather than a vendor patch. The behavior may change in future versions; the class (agents loading local config at session start) is enduring.
- Claim
- Claude Code's deny rules were silently bypassed when a shell command chained more than 50 subcommands (MAX_SUBCOMMANDS_FOR_SECURITY_CHECK = 50 hard cap), with the security check falling through to a generic "ask" prompt.
- Source
- Adversa AI Red Team, disclosed April 1, 2026. Writeup: adversa.ai/blog/claude-code-security-bypass-deny-rules-disabled/.
- Where used
- Chapter 3 (Governance in layers), as the parser-cap bypass example for "any single layer can have a quiet-failure mode."
- Caveat
- Patched in Claude Code v2.1.90 on April 6, 2026 (within a week of disclosure). The class - governance layers with parser caps that silently fail - is what to remember after the specific cap is gone.
- Claim
- Permission parsers in coding agents recognize only a known set of shell-read commands; agents invoking Python's `open()`, Node's `fs.readFile`, or any unrecognized binary bypass the deny rules entirely.
- Source
- eve.gd (Eve Cailey), public writeup of the architectural class.
- Where used
- Chapter 3 (Governance in layers), as the permission-parser bypass class.
- Caveat
- Architectural, not a single CVE. Mitigation is the OS sandbox `denyRead` list (kernel-level), not a vendor patch. The class persists across patches because the parser cannot enumerate every binary.
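The last two entries describe classes of failure, not one-off bugs, so a toy model may make them easier to recognize. The sketch below is not Claude Code's implementation: `check`, `split_subcommands`, `DENIED_PREFIXES`, and the naive splitting logic are all hypothetical, and only the cap of 50 comes from the disclosure cited above. It shows a deny-rule checker that falls through to "ask" once a command chain exceeds the cap, a fail-closed variant of the same check, and a read performed through an interpreter that the prefix-based rules never see.

```python
MAX_SUBCOMMANDS = 50  # cap value taken from the disclosure; everything else is hypothetical
DENIED_PREFIXES = ("cat .env", "rm -rf")  # illustrative deny rules, not real defaults


def split_subcommands(command: str) -> list[str]:
    """Naively split a shell line on '&&' and ';' (a real parser must handle quoting)."""
    parts = command.replace("&&", ";").split(";")
    return [part.strip() for part in parts if part.strip()]


def check(command: str, fail_closed: bool) -> str:
    """Return 'allow', 'ask', or 'deny' for a shell command line."""
    subcommands = split_subcommands(command)
    if len(subcommands) > MAX_SUBCOMMANDS:
        # Quiet-failure mode: past the cap, the deny rules are never consulted.
        # A fail-closed checker refuses instead of downgrading to a prompt.
        return "deny" if fail_closed else "ask"
    for subcommand in subcommands:
        if subcommand.startswith(DENIED_PREFIXES):
            return "deny"
    return "allow"


# 50 harmless subcommands followed by one denied command: 51 in total.
long_chain = " && ".join(["true"] * 50 + ["cat .env"])

# The same file read, routed through an interpreter the rules do not recognize.
interpreter_read = "python -c \"print(open('.env').read())\""
```

In this toy model `check(long_chain, fail_closed=False)` degrades to a prompt even though the chain ends in a denied command, and `interpreter_read` is allowed outright because no rule matches its prefix. That second case is why the entries above point to a kernel-level `denyRead` list: it applies regardless of which binary opens the file.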
Tool documentation
- Claim
- Codex CLI shipped Agent Skills as a first-class primitive in December 2025, with SKILL.md files using YAML frontmatter and progressive disclosure semantics comparable to Claude Code Skills.
- Source
- OpenAI Codex CLI docs, developers.openai.com/codex/skills.
- Where used
- Chapter 1 (The primitives), as the Codex side of the skill-primitive convergence.
- Caveat
- Vendor documentation; the GA dates are accurate as of mid-2026 but may be revised retroactively.
- Claim
- Codex CLI subagents went GA in early 2026 and can run up to eight in parallel.
- Source
- OpenAI Codex CLI docs, developers.openai.com/codex/.
- Where used
- Chapter 1 (The primitives) and Chapter 5 (the six-phase loop, Execute phase).
- Caveat
- Vendor documentation; parallel count may change with subsequent versions.
- Claim
- Codex CLI documents AGENTS.md as the convention for project-level agent instructions, loaded at session start and equivalent in role to other vendors' team-instruction files.
- Source
- OpenAI Codex CLI documentation, developers.openai.com/codex/agents-md.
- Where used
- Chapter 1 (The primitives, skills section) and Chapter 6 (AGENTS.md as team infrastructure).
- Caveat
- Filename and loading semantics are stable; specific frontmatter and discovery rules may evolve with versions.
- Claim
- AGENTS.md as the vendor-neutral team-instruction-file convention has native support across Codex CLI, Cursor, GitHub Copilot, Gemini CLI, Aider, Zed, and Windsurf. The format is markdown; the loading semantics are equivalent across tools.
- Source
- Cross-vendor documentation: Codex CLI (developers.openai.com/codex/agents-md), Cursor (cursor.sh/docs), GitHub Copilot (docs.github.com/copilot), Gemini CLI (cloud.google.com/gemini/docs/codeassist), Aider (aider.chat/docs), Zed (zed.dev/docs/ai), Windsurf (codeium.com/windsurf/docs).
- Where used
- Chapter 1 (The primitives, skills section) and Chapter 6 (Names and conventions).
- Caveat
- The list of supporting tools grows over time; the claim is that AGENTS.md is the de facto vendor-neutral convention, not that the list is exhaustive.
- Claim
- opencode is an open-source coding agent maintained by an independent team, written in TypeScript and licensed under MIT. Its source is organized around the same primitives this manual identifies in Codex CLI and Claude Code.
- Source
- opencode repository (github.com/opencode-ai/opencode); LICENSE and README.
- Where used
- Chapter 1 (The primitives, source survey) and Chapter 2 (Anatomy invariant, two-agent demo).
- Caveat
- Project naming and maintainer composition may evolve; the architectural convergence claim survives renames.
- Claim
- Playwright drives a real browser through scripted interactions; the accessibility tree is the semantic structure browsers expose for assistive technology and is stable across visual restyles or component-library swaps. Tests written against the accessibility tree assert behavior rather than presentation.
- Source
- Playwright documentation (playwright.dev/docs/accessibility-testing); W3C ARIA Accessibility Object Model spec.
- Where used
- Chapter 5 (Verify), as the recommended frontend-verification pattern; Appendix B.3 checklist.
- Caveat
- Some UI behavior (animation, drag-and-drop, complex canvas surfaces) is not fully captured by the accessibility tree and needs supplementary verification.
- Claim
- Claude Code supports OS-level sandboxing on Linux (bubblewrap with Landlock and seccomp), macOS (Seatbelt), and Windows (restricted tokens with job objects); the sandbox is opt-in by configuration. Codex CLI enforces its sandbox by default on Linux and macOS; you have to opt out, not opt in.
- Source
- Claude Code docs (code.claude.com/docs/en/sandboxing) and Codex CLI agent approvals and security docs (developers.openai.com/codex/agent-approvals-security).
- Where used
- Chapter 2 (Anatomy invariant, sandbox-divergence finding) and Chapter 3 (Governance in layers, layer two).
- Caveat
- Default-on versus opt-in is a versioned implementation detail. Verify the current default for your installed version before relying on it.
- Claim
- Cursor 2.0 introduced a subagent system; Cline shipped subagents natively; Claude Code added Agent Teams as a higher-level coordination layer on top of the Task tool.
- Source
- Vendor announcements and docs for Cursor, Cline, and Claude Code; collated across early-to-mid 2026.
- Where used
- Chapter 1 (The primitives), as evidence for subagent-primitive convergence within roughly a year.
- Caveat
- Vendor surface areas evolve; the convergence claim survives even when specific product names rebrand.
Marketplaces and plugin ecosystems
- Claim
- Anthropic's `claude-plugins-official` marketplace ships built-in with Claude Code as of May 2026 and bundles skills, hooks, tools, and commands behind a single install command. The marketplace warns users to trust plugins before installing.
- Source
- Claude Code docs (code.claude.com/docs/en/discover-plugins); the marketplace itself.
- Where used
- Chapter 1 (The primitives, plugins section).
- Caveat
- Plugin counts and marketplace policies will drift; the supply-chain discipline described in Chapter 1 is what to take away rather than any specific count.
Memory primitive sources
- Claim
- AGENTS.md is read at session start by Codex CLI, Cursor, GitHub Copilot, Gemini CLI, Aider, and the wider open-source coding-agent ecosystem (20+ vendors listed at agents.md as of 2026-05). Claude Code reads CLAUDE.md, which can import AGENTS.md to share the same content with other agents. The convergence puts AGENTS.md in the manually defined memory layer of the Memory primitive named in Chapter 1.
- Source
- agents.md (the open standard's site), plus vendor documentation for each agent listed.
- Where used
- Chapter 1 (The primitives, Memory section) and Chapter 6 (AGENTS.md as team infrastructure).
- Caveat
- The exact filename and load semantics vary by vendor - Claude Code reads CLAUDE.md (which can pull in AGENTS.md via an `@AGENTS.md` import or a symlink); Cursor reads AGENTS.md plus `.cursorrules`. Convergence is on the structural role - user-written, always-loaded, team-shareable - not on byte-identical file format.
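A minimal sketch of the two sharing options named in this caveat, assuming a repository that already has an AGENTS.md at its root. The helper name `link_claude_to_agents` and its `use_symlink` flag are hypothetical; the only details taken from the entry above are the `@AGENTS.md` import line and the symlink alternative.

```python
from pathlib import Path


def link_claude_to_agents(repo: Path, use_symlink: bool = False) -> Path:
    """Create CLAUDE.md so that it shares the content of AGENTS.md.

    Hypothetical helper: writes a one-line import file by default,
    or a relative symlink when use_symlink is True.
    """
    agents = repo / "AGENTS.md"
    claude = repo / "CLAUDE.md"
    if not agents.exists():
        raise FileNotFoundError(agents)
    if claude.exists() or claude.is_symlink():
        return claude  # never overwrite instructions someone already wrote
    if use_symlink:
        # Relative target so the link survives moving or cloning the repo.
        claude.symlink_to("AGENTS.md")
    else:
        claude.write_text("@AGENTS.md\n", encoding="utf-8")
    return claude
```

The import form keeps CLAUDE.md as a real file that can grow Claude-specific additions later; the symlink form guarantees the two files never diverge but may need extra privileges on some platforms.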
- Claim
- Claude Code maintains an auto-memory layer in which Claude writes notes for itself across sessions - build commands it figured out, debugging insights it confirmed, code-style preferences it inferred - distinct from the user-written CLAUDE.md. Requires Claude Code v2.1.59+; on by default; per-repo storage.
- Source
- code.claude.com/docs/en/memory.
- Where used
- Chapter 1 (The primitives, Memory section).
- Caveat
- Auto memory is Claude-Code-specific at the time of writing. Other coding agents are converging on similar mechanisms but had not shipped an equivalent at publication date.
- Claim
- Anthropic publicly unveiled Dreaming as part of Claude Managed Agents at Code with Claude SF on 2026-05-06 - a scheduled background process that reviews recent sessions and the memory store, identifies recurring mistakes and convergent workflows, and writes consolidated notes back into long-term memory. The Claude Code surface (`Auto Dream`, accessible via `/dream`) shipped earlier as a research preview gated behind developer access and was documented in March 2026.
- Source
- Code with Claude SF announcement, 2026-05-06; code.claude.com/docs/en/memory.
- Where used
- Chapter 1 (The primitives, Memory section).
- Caveat
- Auto Dream is Claude-Code-specific at publication date. The structural role is what this manual indexes, not the vendor.
Notes on currency
Most of the sources in this appendix are dated. Tool documentation updates frequently; vulnerability records get amended as patches ship and new variants surface. The frameworks in the body of the manual are intended to outlast any specific source URL. If a URL breaks, the underlying claim should still be searchable by the named incident, study, or product.