Maturity Matrix
v1.3, June 1, 2026

June 2026: The Bill Comes Due

Fleets shipped, and so did the invoice. Every vendor made multi-agent orchestration the default, and in the same month the re-pricing arrived: Microsoft cancelled Claude Code internally on cost grounds, Uber's COO questioned the ROI, and SpecBench showed that reward hacking scales with codebase size.

Updated Guides

14 of 240 guides updated with May 2026 data. Remaining 226 carried forward unchanged.

Key Numbers

Taxonomy Changes

2026-05 to 2026-06: May taxonomy preserved unchanged.

Development

Coding Agent Usage

L2: Agent in IDE: Opus 4.8 default, Cursor 3.6 Run Mode
L3: CLI agents: Opus 4.8 + xhigh effort, Codex, Antigravity CLI
L4: Scheduled / unattended agents (Routines, Cursor /loop); 3-5 parallel via Run Mode, MultiDevin
L5: Multi-agent orchestration: Claude Code dynamic workflows (script spawns dozens-to-hundreds of subagents)

Context Engineering

L3: Context budgeting: automatic compaction built in (Rewind summarize, Amp at 90%)
L5: Persistent memory: Dreaming / Kairos 4-stage consolidation
Area: Agent instruction files (CLAUDE.md, .cursorrules) are now an attack surface (TrapDoor injection)
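The L3 context-budgeting item above amounts to a threshold check on context usage. The following is a minimal sketch, assuming a 90% trigger like the one cited for Amp; `should_compact`, the token counts, and the window size are hypothetical names and values, not any vendor's API.

```python
def should_compact(used_tokens: int, window_tokens: int, threshold: float = 0.90) -> bool:
    """Return True once context usage reaches the compaction threshold.

    The 0.90 default mirrors the "Amp at 90%" figure; everything else
    here is an illustrative assumption.
    """
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens >= threshold


# Hypothetical usage: a 200k-token window with 190k tokens already used.
if should_compact(190_000, 200_000):
    print("summarize older turns before the next request")
```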

Code Review & Quality

L3: AI review agent: Opus 4.8 self-verification (~4x less likely to pass its own flaws)
L4: Auto-approval anchored in outcomes, not benchmarks - SpecBench: reward hacking scales ~27pp per 10x LOC
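To make the "~27pp per 10x LOC" figure concrete, here is a small worked calculation. It assumes the relationship is log-linear in codebase size, which is an extrapolation for illustration and not something stated above; `reward_hack_delta_pp` and the LOC values are hypothetical.

```python
import math

# From the SpecBench figure above: ~27 percentage points per 10x increase in LOC.
PP_PER_10X_LOC = 27.0


def reward_hack_delta_pp(loc_from: int, loc_to: int) -> float:
    """Estimated change in reward-hacking rate, in percentage points.

    Assumption: the effect is linear in log10(LOC). The source only gives
    the per-10x slope, so treat the output as a rough illustration.
    """
    if loc_from <= 0 or loc_to <= 0:
        raise ValueError("LOC values must be positive")
    return PP_PER_10X_LOC * math.log10(loc_to / loc_from)


# Hypothetical comparison: a 1k-LOC service versus a 100k-LOC monolith.
print(round(reward_hack_delta_pp(1_000, 100_000), 1))
```

Under that assumption, two orders of magnitude of growth would add roughly twice the per-10x figure, which is why auto-approval thresholds tuned on small benchmarks may not transfer to large repositories.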

Testing Strategy

L4: Held-out oracles the agent never sees gate releases (SpecBench: validation lies as LOC grows)
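The held-out-oracle gate can be sketched as a release check in which the agent-visible suite is necessary but never sufficient. This is a minimal sketch; `SuiteResult` and `release_allowed` are hypothetical names, and how the held-out suite is hidden from the agent is outside its scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteResult:
    passed: int
    total: int

    @property
    def all_passed(self) -> bool:
        # An empty suite proves nothing, so it does not count as passing.
        return self.total > 0 and self.passed == self.total


def release_allowed(visible: SuiteResult, held_out: SuiteResult) -> bool:
    """Allow a release only if both suites pass.

    `visible` is the validation the agent could see (and possibly game);
    `held_out` is the oracle suite it never had access to.
    """
    return visible.all_passed and held_out.all_passed


# Hypothetical run: green on visible tests, one held-out failure blocks the release.
print(release_allowed(SuiteResult(120, 120), SuiteResult(39, 40)))
```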

Delivery Management

Metrics

Area: Cost-per-merged-PR is now a CFO line item: Microsoft cancels Claude Code, Uber COO questions ROI, DORA J-curve, Goldman 24x
L4: Auto-Approve Rate anchored in post-merge outcomes, not benchmark scores (SpecBench)
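Cost-per-merged-PR is a simple ratio, sketched below. The function name and the spend and PR figures are hypothetical; which costs count as "agent spend" (tokens, seats, CI minutes) is a choice each team would have to make.

```python
def cost_per_merged_pr(total_spend_usd: float, merged_prs: int) -> float:
    """Total agent spend for a period divided by PRs merged in that period."""
    if merged_prs <= 0:
        raise ValueError("no merged PRs in the period; the metric is undefined")
    return total_spend_usd / merged_prs


# Hypothetical month: $12,000 of agent spend, 300 merged PRs.
print(cost_per_merged_pr(12_000.0, 300))
```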

Governance & Compliance

L2: EU AI Act: GPAI duties enforced Aug 2; OpenAI Frontier Governance Framework as a reference
L3: Lint/review agent config as security-sensitive (CLAUDE.md, settings.json - Shai-Hulud/TrapDoor)

CI/CD Pipeline

L4: Scheduled / async agents land PRs overnight (Routines, Cursor /loop, CI auto-fix)

Unchanged: Merge & Deploy

Organization

AI Adoption Model

Area: AI is the dominant single cause of layoffs (40% of May cuts, per Challenger), but jobs are being redistributed to AI-engineering roles (+50-100% YoY)

Team Structure & Roles

L4: Developer-as-fleet-manager is now a product default (Cursor Run Mode, Claude agent view, Antigravity, MultiDevin)
L4: Hiring shift: Yegge's 'The Last Technical Interview' (campfire trials, portable credentials)

Knowledge Management

L4: Spec-Driven Development is now a contested but defined methodology; skill-packs as shared versioned assets

Tech Debt & Modernization

L2: Agentic Technical Debt (a stock) vs the Stochastic Tax (a flow-cost); DORA: gains collapse to ~10% on legacy

Infrastructure

Agent Runtime & Sandboxing

L3: Harden agent config: ~/.claude/settings.json is now a persistence target (Mini Shai-Hulud)
L4: Classifier-gated sandboxed execution as default (Cursor 3.6 Run Mode, Claude auto mode)
L5: Local-first runtime as a privacy/latency path (antirez/ds4: DeepSeek V4 on-device via Metal, 1M context)

MCP & Tool Integration

L3: MCP/tool config is a supply-chain attack surface (Shai-Hulud, TrapDoor) - treat installs as pinned, reviewed deps
L4: MCP governance: per-skill/subagent/plugin/MCP cost attribution via /usage
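Treating tool installs as pinned dependencies can be enforced with a small check in review or CI. The sketch below assumes a hypothetical config shape (a `servers` mapping whose entries carry a `version` field); it is not the schema of any real MCP client, and `unpinned_entries` is an illustrative name.

```python
import re

# Accept only exact x.y.z pins; ranges, tags such as "latest", and blanks fail.
EXACT_VERSION = re.compile(r"\d+\.\d+\.\d+")


def unpinned_entries(tool_config: dict) -> list[str]:
    """Return the names of tool entries that lack an exact version pin."""
    bad = []
    for name, entry in tool_config.get("servers", {}).items():
        version = str(entry.get("version", ""))
        if not EXACT_VERSION.fullmatch(version):
            bad.append(name)
    return sorted(bad)


# Hypothetical config: one pinned entry, one floating tag, one with no version.
config = {
    "servers": {
        "issue-tracker": {"version": "1.4.2"},
        "web-search": {"version": "latest"},
        "db-admin": {},
    }
}
print(unpinned_entries(config))
```

A check like this only covers pinning; the "reviewed" half still needs a human or policy step before a new entry is merged.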

Observability & Feedback Loop

L3: Cost attributed per skill/subagent/plugin/MCP (Claude /usage)
L4: ROI/J-curve dashboards (DORA); 'trust the methodology, not the number' (SpecBench/BenchJack)
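Per-component cost attribution reduces to grouping usage records by component. The following is a minimal sketch over a hypothetical event format of `(kind, name, usd)` tuples; it does not reflect the actual output of `/usage`, and `attribute_cost` is an illustrative name.

```python
from collections import defaultdict


def attribute_cost(events):
    """Sum spend per (kind, name), where kind is e.g. skill, subagent, plugin, or mcp."""
    totals = defaultdict(float)
    for kind, name, usd in events:
        totals[(kind, name)] += usd
    return dict(totals)


# Hypothetical usage records for one session.
events = [
    ("subagent", "test-writer", 0.50),
    ("mcp", "issue-tracker", 0.25),
    ("subagent", "test-writer", 0.25),
]
for (kind, name), usd in sorted(attribute_cost(events).items()):
    print(f"{kind}/{name}: ${usd:.2f}")
```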

Unchanged: Build System

What Didn't Change (and Why)

Stripe Minions as L5 north star - Still the cleanest public reference; dynamic workflows are now a buyable version of the pattern.
Lint-as-architecture, Bazel / EngFlow - Vendor-independent infrastructure. No reason to revise.
IPETs + bad-day protocol - Prior-edition org patterns still hold; June adds defending the spend.
Most L1 / L2 baseline - Foundations of AI adoption have not moved; what changed is what L3+ means.
Yegge's 8-stage individual model - Still the best public model for individual progression.

Sources