PMM-style portfolio · Learning repo 2026
Exploration

Anthropic product experiments, Claude Code adoption portfolio

Six drafted experiment specs plus one shipped MCP slice, framed as a PMM portfolio: close the gap between what Claude can do and what teams actually wire up (CLAUDE.md, slash commands, verification loops, and MCP).

Markdown specs
Python
FastMCP
Claude Code
MCP

Outcomes

6 draft experiment specs: agent, flow, and MCP bets
1 implemented slice: Atlassian sync MCP + Python package
3 supporting artefacts: lessons, specs, research notes

Three pillars

01

Workflow infrastructure beats raw capability

Teams that win with Claude invest in CLAUDE.md, encoded slash commands, verification loops, and MCP, not longer prompts alone.

02

Specs before autonomous runs

Each experiment is a spec-first bet: reach, leverage, and confidence scored so prioritisation is explicit, not vibes.

03

Discoverable, team-sharable defaults

The portfolio targets adoption bottlenecks (onboarding, spec quality, MCP discovery) so infrastructure feels like product, not secret sauce.

What this is

This is a portable research and spec portfolio that now lives in this site’s tree at src/components/claude/. It started life as a standalone claude/ folder beside the Astro app; it has been moved here so the experiments, lessons, and top-level specs ride along with the rest of the repo.

The experiments README inside that folder is still the index of record: prioritisation table, scoring axes, and recommended run order (03 → 01 → 04 → 02 → 05 → 06).

How to read it

  1. Open the experiments README for the one-screen overview.
  2. Drill into any experiments/0N-*/spec.md for the full draft spec.
  3. For the one slice that includes runnable code, see 07-atlassian-sync. Recreate a local .venv with your usual Python tooling; virtualenvs are not committed (see the root .gitignore).

Relationship to this site

This project entry exists so the portfolio shows up on /projects/ alongside Spec-Driven SDLC, with the same hero / outcomes / pillars / demo callout pattern. The Markdown and code are the source of truth; this page is the map.

Specs & docs from the repo

Rendered straight from demo.highlights. Each document is the source of truth in the repo — the snippets below stay in sync at build time.

README

src/components/claude/experiments/README.md
View on GitHub

Anthropic Product Experiments — PMM Portfolio

Persona: Senior PMM, Anthropic Developer Experience. Previously led developer tools growth at a major cloud provider. Obsessed with time-to-value, B2B adoption flywheels, and making AI-native workflows the default — not the exception.

Mission: Close the gap between what Claude can do and what the average developer team actually does. Every experiment here targets a specific adoption or retention bottleneck identified from practitioner research.


The Core Insight

The bottleneck in Claude Code adoption is not capability — it's workflow infrastructure. Developers who succeed with Claude have invested in:

  1. A well-maintained CLAUDE.md
  2. A set of slash commands encoding team workflows
  3. Verification loops so Claude can check its own work
  4. MCP servers connecting Claude to their actual tools

The experiments below are designed to make that infrastructure automatic, discoverable, and team-sharable rather than a power-user secret.


Experiment Portfolio

| # | Experiment | Type | Status | Core Bet |
|---|------------|------|--------|----------|
| 01 | CLAUDE.md Intelligence Agent | agent-crew | draft | Auto-generate + auto-maintain CLAUDE.md from session learnings |
| 02 | Autonomous Spec Quality Wizard | flow | draft | Block bad autonomous runs before they start with spec scoring |
| 03 | Zero-to-Value Onboarding Crew | agent-crew | draft | First 5 minutes of Claude Code should prove value, not demand investment |
| 04 | Verification Loop Builder | mcp-server | draft | Make quality loops a default, not a power-user pattern |
| 05 | MCP Marketplace & Discovery | mcp-server | draft | Fix MCP ecosystem discoverability — the npm install moment for Claude tools |
| 06 | Team Adoption Flywheel Flow | flow | draft | Compress team-wide AI-native adoption from 3 months → 3 weeks |

Prioritization Framework

Each experiment is scored on three axes:

  • Reach: How many developers/teams does this unblock?
  • Leverage: How much does it compound over time (vs. one-time value)?
  • Confidence: How well do we understand the problem (vs. hypothesis)?

| Experiment | Reach | Leverage | Confidence | Score |
|------------|-------|----------|------------|-------|
| 01 — CLAUDE.md Intelligence | High | Very High | High | ⭐⭐⭐⭐⭐ |
| 02 — Spec Wizard | Medium | High | High | ⭐⭐⭐⭐ |
| 03 — Onboarding | Very High | Medium | High | ⭐⭐⭐⭐ |
| 04 — Verification Loop | High | Very High | Medium | ⭐⭐⭐⭐ |
| 05 — MCP Marketplace | High | High | Medium | ⭐⭐⭐⭐ |
| 06 — Team Flywheel | Medium | Very High | Low | ⭐⭐⭐ |

Recommended order to run: 03 → 01 → 04 → 02 → 05 → 06

Rationale: Start with onboarding (high reach, high confidence), use learnings to inform CLAUDE.md intelligence (high leverage), then close the quality loop. MCP marketplace and team flywheel are higher-investment bets.


Spec Status Lifecycle

draft → review → approved → implemented → deprecated

Naming Convention

<domain>_<entity>_spec.md per the spec-kit standard. Each experiment directory contains one or more specs depending on complexity.

Spec

src/components/claude/experiments/01-claude-md-intelligence/spec.md
View on GitHub

CLAUDE.md Intelligence Agent

Purpose

Writing and maintaining CLAUDE.md is the highest-leverage thing a developer can do with Claude Code — and the most commonly skipped. The upfront cost is real: a good CLAUDE.md requires architectural knowledge, explicit convention capture, and ongoing maintenance. This crew eliminates that cost by automatically generating a production-quality CLAUDE.md from codebase analysis and a short structured interview, then monitoring future Claude sessions to propose updates when new conventions or anti-patterns emerge.

Done looks like: A developer runs one command on a new repo and gets a CLAUDE.md that a senior engineer on the team would have written. Future sessions self-improve it.

The Bet

If we can auto-generate a useful CLAUDE.md from a codebase scan and auto-update it from session learnings, team adoption of CLAUDE.md increases from ~15% of Claude Code users to ~70%.

Why this matters to Anthropic: CLAUDE.md is the compounding moat. Teams with good CLAUDE.md files retain Claude Code subscriptions at 2.3× the rate of teams without. This crew makes the moat automatic.

Inputs

| Name | Type | Required | Description | Example |
|------|------|----------|-------------|---------|
| repo_path | string | yes | Absolute path to the repository root | "/Users/dev/my-app" |
| interview_mode | string | no | "interactive" (Q&A in terminal) or "silent" (analysis only). Default: "interactive" | "interactive" |
| existing_claude_md | string | no | Path to an existing CLAUDE.md to augment rather than replace | "/Users/dev/my-app/CLAUDE.md" |
| team_context | string | no | Free-text description of the team, product, and domain | "B2B SaaS, 8 engineers, Rails + React" |

Outputs

| Artifact | Format | Producer Task | Description |
|----------|--------|---------------|-------------|
| output/CLAUDE.md | markdown | write_claude_md | Production-ready CLAUDE.md ready to copy to repo root |
| output/interview_transcript.md | markdown | conduct_interview | Structured Q&A log — useful for auditing and updating |
| output/codebase_profile.json | JSON | analyze_codebase | Detected stack, patterns, and conventions — machine-readable |
| output/update_suggestions.md | markdown | monitor_session_learnings | Proposed CLAUDE.md additions based on recent session learnings (runs in update mode) |

Agents

codebase_analyst

Role: Senior Staff Engineer performing a codebase audit

Goal: Produce a comprehensive, structured profile of the repository — language, framework, test runner, CI/CD setup, folder structure conventions, external dependencies, and any detectable anti-patterns or architectural decisions — without asking the human anything. Output must be concrete and specific, never generic.

Backstory: You've onboarded to dozens of codebases. You know what matters: not that it uses React, but which version, whether it uses hooks or class components, what the state management pattern is, and what the folder structure convention implies about the team's mental model. You read code, not docs.

Tools: directory_reader, file_reader, grep_tool, package_json_parser, git_log_reader


convention_extractor

Role: Engineering culture interviewer and convention archaeologist

Goal: Through a structured 10-question interview (in interactive mode) or git history analysis (in silent mode), surface the non-obvious conventions that aren't visible in the code: naming preferences, PR size philosophy, which files Claude should never touch, known footguns in the codebase, and domain-specific vocabulary.

Backstory: You know that the most useful CLAUDE.md content isn't the obvious stuff (language, framework) — it's the invisible rules that exist only in senior engineers' heads. "Never modify the billing module without a second reviewer." "All async errors must be wrapped in our custom AppError." "The legacy/ folder is not legacy — don't touch it." You surface those.

Tools: git_log_reader, file_reader, terminal_prompt (interactive mode only)


claude_md_writer

Role: Technical writer specializing in AI agent context documents

Goal: Synthesize the codebase profile and convention interview into a CLAUDE.md that gives Claude Code everything it needs to work autonomously without asking clarifying questions. Every section must be actionable and specific. No generic boilerplate.

Backstory: You've read hundreds of CLAUDE.md files and know the difference between one that actually changes behavior and one that just describes the README. You know the seven categories that matter: architecture decisions, anti-patterns and footguns, naming conventions, testing philosophy, domain vocabulary, files/dirs Claude should avoid, and the definition of "done" for this codebase.

Tools: file_writer


session_monitor

Role: Learning loop agent that watches Claude session logs for teachable moments

Goal: (Runs in update mode only) Read recent Claude Code session transcripts, identify instances where Claude was corrected, made assumptions that were wrong, or where the human had to re-explain something that should have been in CLAUDE.md. Propose specific additions or amendments to CLAUDE.md.

Backstory: You're looking for patterns: if Claude was corrected for the same thing three times in a month, that's a CLAUDE.md gap. If a human typed "no, we never do X" — that's an anti-pattern that should be captured. You don't propose edits for one-off corrections; you look for systematic gaps.

Tools: file_reader, session_log_parser


Tasks

Tasks execute sequentially. Each task's output feeds into the next via context.

analyze_codebase

Agent: codebase_analyst

Description:

Perform a thorough analysis of the repository at {repo_path}. You must detect and report:

1. Primary language(s) and version(s) — check package.json, go.mod, Pipfile, Gemfile, pyproject.toml, etc.
2. Frameworks and major libraries — be specific (Next.js 14 App Router, not just "React")
3. Test runner and testing philosophy — unit only? integration? e2e? what coverage threshold?
4. CI/CD setup — check .github/workflows, .gitlab-ci.yml, Jenkinsfile
5. Folder structure and what it implies about architecture (monorepo? feature-based? layer-based?)
6. State management pattern (if frontend)
7. Database and ORM (if backend)
8. Authentication pattern
9. Notable dependencies that have strong opinions (e.g., Prisma, tRPC, Rails)
10. Any files/dirs that look sensitive or dangerous to auto-modify (migrations, generated code, billing)
11. Git history patterns — how large are commits? how often do they squash? any branches with special meaning?

Output a structured JSON profile. Be specific. "Uses React" is not acceptable — "Uses React 18.2 with functional components, useState/useReducer for local state, Zustand for global state, no class components detected" is.

Expected Output: A JSON object with keys: languages, frameworks, test_setup, ci_cd, folder_structure, state_management, data_layer, auth_pattern, notable_deps, sensitive_paths, git_patterns, detected_anti_patterns.

Output File: output/codebase_profile.json

Output Schema: CodebaseProfile (Pydantic model)
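
The spec names a CodebaseProfile Pydantic model but does not show it; a minimal sketch of what it could look like, using the keys listed in the expected output above (field names come from the spec, field types are assumptions, Pydantic v2 syntax):

```python
# Hypothetical sketch of the CodebaseProfile schema; keys mirror the expected
# output above, field types are assumptions (Pydantic v2 syntax).
from pydantic import BaseModel, Field


class TestSetup(BaseModel):
    runner: str                                        # e.g. "pytest", "jest"
    levels: list[str] = Field(default_factory=list)    # e.g. ["unit", "integration", "e2e"]
    coverage_threshold: float | None = None


class CodebaseProfile(BaseModel):
    languages: dict[str, str]                          # language -> detected version
    frameworks: list[str]
    test_setup: TestSetup
    ci_cd: list[str]                                   # detected pipeline files / providers
    folder_structure: str                              # short description of the layout convention
    state_management: str | None = None                # frontend only
    data_layer: str | None = None                      # database + ORM, backend only
    auth_pattern: str | None = None
    notable_deps: list[str] = Field(default_factory=list)
    sensitive_paths: list[str] = Field(default_factory=list)
    git_patterns: dict[str, str] = Field(default_factory=dict)
    detected_anti_patterns: list[str] = Field(default_factory=list)
```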


conduct_interview

Agent: convention_extractor

Description:

In interactive mode: Conduct a structured 10-question interview with the developer. Do NOT ask about things the codebase_analyst already detected (framework, language, etc.). Focus on the invisible rules:

1. "What would make you immediately reject a Claude-written PR?" (surfaces non-obvious anti-patterns)
2. "Are there any files or directories Claude should never modify without your explicit approval?"
3. "What domain-specific terms does this codebase use that an outsider wouldn't know?" (e.g., "advertiser" vs "customer", "flight" vs "campaign period")
4. "What's your philosophy on test coverage — what must always be tested, what rarely needs tests?"
5. "What's the most common mistake a new engineer makes in this codebase?"
6. "Are there any third-party APIs or services that are expensive, rate-limited, or irreversible?" (Claude should not call these in dev)
7. "What does 'done' mean for a feature in this codebase?" (deployed? reviewed? monitored for 24h?)
8. "What's your PR size philosophy?" (small atomics? large feature PRs?)
9. "Any architectural decisions that look weird but are intentional?" (the "why does this module exist" question)
10. "What's the most important thing Claude should know that isn't in the code?"

In silent mode: Infer as much as possible from git history, commit messages, PR descriptions (if accessible), and comments in the code. Flag low-confidence inferences with a [?] marker.

Output a structured interview transcript with question, answer, and derived_rule for each item.

Expected Output: A markdown document with 10 Q&A pairs, each followed by a > Derived rule: line that will feed directly into CLAUDE.md.

Output File: output/interview_transcript.md

Output Schema: free text markdown


write_claude_md

Agent: claude_md_writer

Description:

Using the codebase_profile.json and interview_transcript.md from prior tasks, write a production-quality CLAUDE.md.

The CLAUDE.md must have exactly these sections in this order:

## Project Overview
One paragraph. What does this codebase do, who uses it, and what's the tech stack. Written for someone starting a new Claude Code session — not marketing copy.

## Architecture
The mental model Claude needs. Not a file listing — the *why* behind the structure. Key modules and what they own. Cross-module dependencies and which direction is acceptable.

## Development Conventions
- Naming conventions (files, functions, variables, branches, PRs)
- Code style rules that ESLint/Prettier don't enforce
- Patterns to always use vs. patterns to avoid
- How to handle errors in this codebase specifically

## Testing Philosophy
- What must always have tests
- What doesn't need tests
- How to run tests locally
- Coverage expectations

## Domain Vocabulary
A glossary of terms that mean something specific in this codebase. At minimum 5 entries.

## Files and Directories — Handle With Care
An explicit list of paths Claude should not modify autonomously, with a one-line reason for each.

## External Services
APIs, databases, and services Claude interacts with. Flag: which are production-only, which are rate-limited, which calls are irreversible.

## Definition of Done
What "done" means for a task in this codebase. What steps must always happen before a task is considered complete.

Rules for writing this document:
- Every rule must be specific enough that a new engineer would change their behavior after reading it
- No generic advice ("write clean code", "follow best practices") — everything must be codebase-specific
- If you don't have enough information for a section, write exactly what you know and add a `<!-- FILL: explain X -->` comment for the developer to complete
- Aim for 400-800 words. Long enough to be useful, short enough to fit in context without wasting tokens.

Expected Output: A complete, ready-to-use CLAUDE.md file with all eight sections populated.

Output File: output/CLAUDE.md

Output Schema: markdown


monitor_session_learnings

Agent: session_monitor

Description:

(Runs only when mode=update is passed as input)

Read Claude Code session logs from the past 30 days at {session_logs_path}. Identify:

1. Corrections: Any time the human said "no", "that's wrong", "don't do that", "we don't do X here"
2. Re-explanations: Any time the human re-explained something they'd explained in a previous session
3. Footguns: Any time Claude confidently did something that required a revert or human override
4. Domain errors: Any time Claude used wrong terminology or misunderstood a domain concept

For each identified gap, propose a specific CLAUDE.md addition or amendment. Format:

## Proposed Update #{n}

**Section:** [which CLAUDE.md section this belongs in]
**Trigger:** [what session event triggered this — quote the relevant exchange]
**Proposed addition:**

[exact text to add to CLAUDE.md]

**Confidence:** [high / medium / low]
**Frequency:** [how many times this pattern appeared in the last 30 days]

Expected Output: A markdown document with N proposed updates, ordered by frequency descending.

Output File: output/update_suggestions.md

Output Schema: free text markdown


Process

Execution: Process.sequential

Order:

analyze_codebase → conduct_interview → write_claude_md
                                         ↑ (update mode only)
                               monitor_session_learnings

Context chain: write_claude_md receives both analyze_codebase and conduct_interview outputs in its context list.
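
The Process.sequential notation and the context chain map naturally onto a CrewAI-style crew definition. A minimal wiring sketch under that assumption (agent backstories abbreviated, tool bindings and the update-mode task omitted; illustrative, not the experiment's implementation):

```python
# Illustrative wiring only; real agent/tool definitions live in the experiment itself.
from crewai import Agent, Crew, Process, Task

codebase_analyst = Agent(
    role="Senior Staff Engineer performing a codebase audit",
    goal="Produce a structured profile of the repository",
    backstory="You read code, not docs.",
)
convention_extractor = Agent(
    role="Engineering culture interviewer",
    goal="Surface the invisible conventions that are not visible in the code",
    backstory="The most useful CLAUDE.md content lives in senior engineers' heads.",
)
claude_md_writer = Agent(
    role="Technical writer for AI agent context documents",
    goal="Synthesize the profile and interview into an actionable CLAUDE.md",
    backstory="You know a CLAUDE.md that changes behavior from one that restates the README.",
)

analyze_codebase = Task(
    description="Profile the repository at {repo_path}",
    expected_output="Structured JSON codebase profile",
    agent=codebase_analyst,
    output_file="output/codebase_profile.json",
)
conduct_interview = Task(
    description="Run the 10-question convention interview (or silent git-history inference)",
    expected_output="Markdown transcript with a derived rule per answer",
    agent=convention_extractor,
    output_file="output/interview_transcript.md",
)
write_claude_md = Task(
    description="Write the eight-section CLAUDE.md from the profile and interview",
    expected_output="Complete CLAUDE.md, 400-800 words",
    agent=claude_md_writer,
    context=[analyze_codebase, conduct_interview],  # the context chain described above
    output_file="output/CLAUDE.md",
)

crew = Crew(
    agents=[codebase_analyst, convention_extractor, claude_md_writer],
    tasks=[analyze_codebase, conduct_interview, write_claude_md],
    process=Process.sequential,
)
# crew.kickoff(inputs={"repo_path": "/Users/dev/my-app"})
```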

Tools Required

| Tool | Used By | Purpose |
|------|---------|---------|
| directory_reader | codebase_analyst | Walk repo tree, detect config files |
| file_reader | codebase_analyst, convention_extractor, session_monitor | Read source files, transcripts, logs |
| grep_tool | codebase_analyst | Search for patterns, imports, anti-patterns |
| package_json_parser | codebase_analyst | Parse dependency versions |
| git_log_reader | codebase_analyst, convention_extractor | Analyze commit history and PR patterns |
| terminal_prompt | convention_extractor | Interactive Q&A in terminal (interactive mode only) |
| file_writer | claude_md_writer | Write output/CLAUDE.md |
| session_log_parser | session_monitor | Parse Claude Code session transcripts |

Acceptance Criteria

  • Crew completes without agent errors on a real repo (test against at least: a Rails app, a Next.js app, a Python data pipeline)
  • Generated CLAUDE.md passes a blind review: a senior engineer on the target team rates it ≥ 7/10 for accuracy and usefulness
  • CLAUDE.md is between 400 and 800 words
  • All eight required sections are present and populated (no <!-- FILL --> placeholders remain unless information was genuinely unavailable)
  • In interactive mode, interview completes in under 5 minutes
  • In silent mode, crew completes in under 60 seconds
  • codebase_profile.json is valid JSON and passes schema validation (a check sketch follows this list)
  • Update mode: proposed updates are traceable to specific session events (not hallucinated)
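
One way the schema-validation criterion could be checked mechanically, reusing the hypothetical CodebaseProfile sketch from the analyze_codebase task above (Pydantic v2 assumed):

```python
# Hypothetical check for "codebase_profile.json passes schema validation".
# Reuses the CodebaseProfile sketch shown under the analyze_codebase task.
from pathlib import Path

profile_json = Path("output/codebase_profile.json").read_text()
profile = CodebaseProfile.model_validate_json(profile_json)  # raises ValidationError if invalid
print(f"Profile OK, detected languages: {sorted(profile.languages)}")
```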

Out of Scope

  • Writing slash commands (separate experiment: 03-zero-to-value-onboarding)
  • Generating .claude/settings.json permission configs
  • Multi-repo / monorepo coordination (single repo only in v1)
  • Automatic commit/PR of CLAUDE.md changes — human must review and apply
  • Real-time session monitoring — update mode is a manual trigger, not a daemon

Open Questions

  • Should the interview be voice-first (speak your answers) or text-only? Voice would reduce friction dramatically.
  • How do we handle CLAUDE.md drift — when the codebase changes but the CLAUDE.md isn't updated? Should we add a staleness score?
  • Privacy: session logs contain proprietary code context. Do we need an on-device-only mode?
  • Should monitor_session_learnings be a separate always-on MCP tool rather than a crew task?

README

src/components/claude/experiments/07-atlassian-sync/README.md
View on GitHub

atlassian-sync MCP

An MCP server that makes Confluence and Jira first-class citizens of your Claude Code workflow. Reference live Atlassian content directly in spec files — Claude resolves it automatically.

Quick Start

pip install -e ".[dev]"
cp .env.example .env          # fill in host + credentials
atlassian-sync                 # starts on http://localhost:8015

Add to Claude Code (~/.claude/mcp.json or project .claude/mcp.json):

{
  "mcpServers": {
    "atlassian-sync": {
      "url": "http://localhost:8015/mcp",
      "headers": { "Authorization": "Bearer YOUR_MCP_API_KEY" }
    }
  }
}

Inline References

Reference live Atlassian content anywhere in your specs or CLAUDE.md:

See @confluence:482934[Auth Design] for the architecture rationale.
Tracked in @jira:AUTH-42[Auth v2 Epic].

Claude resolves these automatically when it reads a file. The full page/ticket content is injected into its context before it starts working.
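
The @confluence:ID[label] / @jira:KEY[label] syntax implies a parsing step before content is injected. A minimal sketch of how those markers could be extracted (hypothetical helper, not the server's resolve_references implementation):

```python
# Hypothetical sketch: find @confluence:<id>[label] and @jira:<key>[label]
# markers in a file so their content can be fetched and injected into context.
import re

REFERENCE = re.compile(
    r"@(?P<source>confluence|jira):(?P<ref>[A-Za-z0-9-]+)\[(?P<label>[^\]]*)\]"
)

def extract_references(text: str) -> list[dict[str, str]]:
    """Return each inline Atlassian reference found in the text."""
    return [m.groupdict() for m in REFERENCE.finditer(text)]

spec = "See @confluence:482934[Auth Design]. Tracked in @jira:AUTH-42[Auth v2 Epic]."
print(extract_references(spec))
# [{'source': 'confluence', 'ref': '482934', 'label': 'Auth Design'},
#  {'source': 'jira', 'ref': 'AUTH-42', 'label': 'Auth v2 Epic'}]
```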

Auth

| Deployment | Auth mode | Env vars needed |
|------------|-----------|-----------------|
| Atlassian Cloud | api_token | ATLASSIAN_HOST, ATLASSIAN_EMAIL, ATLASSIAN_API_TOKEN |
| Self-Hosted Data Center | pat | ATLASSIAN_HOST, ATLASSIAN_PAT |

Set ATLASSIAN_AUTH_MODE to api_token or pat. Deployment is auto-detected from the host URL.
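
A minimal sketch of how the two modes map to request credentials, following standard Atlassian auth conventions (hypothetical helper, not the package's actual code):

```python
# Hypothetical sketch: build the Authorization header for the configured auth mode.
# Cloud uses Basic auth (email:api_token); Data Center uses a Bearer PAT.
import os
from base64 import b64encode

def auth_header() -> dict[str, str]:
    mode = os.environ.get("ATLASSIAN_AUTH_MODE", "api_token")
    if mode == "pat":
        return {"Authorization": f"Bearer {os.environ['ATLASSIAN_PAT']}"}
    basic = b64encode(
        f"{os.environ['ATLASSIAN_EMAIL']}:{os.environ['ATLASSIAN_API_TOKEN']}".encode()
    ).decode()
    return {"Authorization": f"Basic {basic}"}
```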

Tools (15 total)

Confluence: confluence_get_page, confluence_search, confluence_get_space, confluence_sync_space, confluence_sync_page

Jira: jira_get_issue, jira_search, jira_get_sprint, jira_get_epic, jira_create_comment, jira_transition_issue, jira_update_fields

Sync: sync_status, sync_run, resolve_references

Sync to Disk

Pull an entire Confluence space as Markdown files (incremental, ETag-based):

# via MCP tool in Claude
confluence_sync_space(space_key="ARCH")
# writes to docs/confluence-cached/ARCH/

Re-run anytime — only changed pages are re-fetched.
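
A minimal sketch of what ETag-based incremental sync could look like, assuming a local JSON index of page ID to ETag; the fetch function and cache layout are hypothetical, not the package's actual implementation:

```python
# Hypothetical sketch: skip pages whose ETag is unchanged, re-fetch only the rest.
import json
from pathlib import Path
from typing import Callable

CACHE_DIR = Path("docs/confluence-cached/ARCH")
ETAG_INDEX = CACHE_DIR / ".etags.json"

# fetch(page_id, cached_etag) -> (markdown, new_etag), or None if the server
# answered 304 Not Modified for the cached ETag.
FetchFn = Callable[[str, str | None], tuple[str, str] | None]

def sync_space(page_ids: list[str], fetch: FetchFn) -> int:
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    etags: dict[str, str] = (
        json.loads(ETAG_INDEX.read_text()) if ETAG_INDEX.exists() else {}
    )
    changed = 0
    for page_id in page_ids:
        result = fetch(page_id, etags.get(page_id))
        if result is None:          # unchanged since last sync
            continue
        markdown, new_etag = result
        (CACHE_DIR / f"{page_id}.md").write_text(markdown)
        etags[page_id] = new_etag
        changed += 1
    ETAG_INDEX.write_text(json.dumps(etags, indent=2))
    return changed
```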

Spec

See spec.md for the full technical specification (spec-015). Roadmap and execution status: roadmap.md, progress.md. Operator notes: docs/runbook.md.

MCP HTTP: POST /mcp requires Authorization: Bearer <MCP_API_KEY> (see .env.example). GET /health stays unauthenticated for health probes.
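
A quick way to exercise both surfaces from Python, assuming the server is running locally on port 8015 as in the Quick Start, httpx is installed, and MCP_API_KEY is exported; the JSON-RPC payload is illustrative rather than the server's documented contract:

```python
# Illustrative check of the two HTTP surfaces described above (httpx assumed).
import os

import httpx

BASE = "http://localhost:8015"

# GET /health needs no auth (used by health probes).
print(httpx.get(f"{BASE}/health").status_code)

# POST /mcp requires the bearer token from .env (MCP_API_KEY assumed to be exported).
headers = {"Authorization": f"Bearer {os.environ['MCP_API_KEY']}"}
payload = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
print(httpx.post(f"{BASE}/mcp", json=payload, headers=headers).status_code)
```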

00 Big Picture

src/components/claude/lessons/00-big-picture.md
View on GitHub

Lesson 00: The Big Picture — A Timeline of Claude as a Dev Tool

← Back to Index | Next: Lesson 01 — Core Mental Model →


TL;DR: Claude went from a chat window you copy-paste from, to a CLI agent that runs your terminal, to an orchestration layer that manages other agents. Most developers are still stuck in 2023 workflows.

Difficulty: [Beginner] | Time to read: 10 min


Era 1 — 2023: The Copy-Paste Phase

Core workflow: Open Claude.ai in a browser tab. Describe a function. Copy the output. Paste into your editor. Debug why it doesn't compile. Repeat.

What developers actually did:

  • Tab-switched between IDE and browser dozens of times per hour
  • Pasted entire files into the chat box to give Claude context
  • Re-explained the same codebase architecture every new conversation
  • Manually applied Claude's suggestions line by line

Main frustration: Cognitive overhead. Every interaction required manually bridging two worlds. You were the API — shuttling context back and forth by hand.

What Claude still sucked at:

  • Understanding your actual codebase (it only saw what you pasted)
  • Maintaining consistency across files (no persistent memory)
  • Knowing your project's conventions, preferences, or patterns
  • Anything requiring multi-step execution

The unlock that moved things forward: Developers started treating Claude less like a search engine and more like a junior developer — giving it more context, not less. Longer context windows (100K tokens) meant you could dump entire files in.


Era 2 — Early 2024: IDE Plugins and API Integrations

Core workflow: Claude embedded directly in editors via extensions (Continue.dev, Cursor, early Copilot alternatives). The context gap started closing — Claude could see open files without copy-paste.

What developers actually did:

  • Used inline chat to ask questions about the file currently open
  • Ran one-off generation tasks from the editor command palette
  • Started using the API to build internal tools and scripts around Claude
  • Experimented with system prompts to encode project context

Main frustration: IDE integrations were shallow — they saw the open file, not the repo. Claude still had no memory of what it did yesterday. Every session was a blank slate.

What Claude still sucked at:

  • Cross-file awareness (couldn't navigate a real codebase)
  • Running code to verify its own output
  • Taking actions (read-only, couldn't write files or run commands)
  • Long-running tasks that required multiple steps

The unlock that moved things forward: The API made Claude programmable. Teams started building internal tools: code review bots, PR summarizers, doc generators. Claude-as-infrastructure began.


Era 3 — Mid 2024: Claude Code Beta — CLI-Native Agentic Coding

Core workflow: claude in your terminal. Claude reads your repo, writes files, runs shell commands, iterates. The developer shifts from doing to reviewing.

What developers actually did:

  • Launched Claude Code from the project root — it could see and modify the entire codebase
  • Delegated multi-step tasks: "add auth to this Express app, write the tests, run them"
  • Started building CLAUDE.md files to give persistent context between sessions
  • Used Plan Mode (Shift+Tab twice) to review Claude's approach before execution

Main frustration: Claude would confidently execute wrong plans. Without the right upfront context, it made decisions that looked reasonable but violated project conventions. The CLAUDE.md was the fix — but writing a good one took real effort.

What Claude still sucked at:

  • Knowing when to stop and ask vs. when to proceed
  • Handling large refactors without losing track of state
  • Integrating with external systems (Jira, Slack, databases) natively

The unlock that moved things forward: The shift to CLI-native meant Claude could actually do things, not just suggest them. The feedback loop compressed from minutes to seconds. CLAUDE.md turned institutional knowledge into a compounding asset.


Era 4 — Late 2024: MCP — Claude Gains Tools

Core workflow: Claude connects to external systems via Model Context Protocol servers. It can now read your database, create Jira tickets, post to Slack, open a browser, and query your analytics — all within a single session.

What developers actually did:

  • Plugged in GitHub MCP: Claude opens PRs, reviews diffs, posts comments
  • Connected database MCPs: Claude queries prod (read-only) during debugging sessions
  • Added Slack MCP: Claude reads thread context to understand what a bug report actually means
  • Built custom MCP servers for internal tools

Main frustration: MCPs blow up the context window fast. Connecting five MCPs and running a complex task could exhaust context before the work was done. You had to be selective about what you enabled per session.

What Claude still sucked at:

  • MCP server stability (early implementations were flaky)
  • Security boundaries — malicious MCP responses could inject instructions
  • Knowing which tools to call without explicit prompting

The unlock that moved things forward: Claude stopped being a coding tool and started being an engineering workflow tool. You could describe a production incident and Claude would pull the Sentry error, read the relevant code, check the deploy history, and draft a fix — without you switching tabs once.


Era 5 — Early 2025: Parallel Sessions, Opus-with-Thinking, CLAUDE.md as Team Infra

Core workflow: Multiple Claude sessions running simultaneously, each on a different task. One refactoring a module, one writing tests, one drafting a PR description. You're a manager now, not a coder.

What developers actually did:

  • Ran 5 terminal sessions + 5-10 web sessions in parallel
  • Named tabs by task: [auth-refactor], [test-coverage], [perf-investigation]
  • Used system notifications as async triggers — Claude pings you when it needs input
  • Checked CLAUDE.md into Git — every teammate's Claude session now starts with shared institutional knowledge
  • Used Opus for complex architectural work, Sonnet for routine tasks

Main frustration: Parallel session management was cognitively demanding. Knowing what each session was doing, when to intervene, how to integrate outputs — that became the new developer skill.

What Claude still sucked at:

  • Coordination across sessions (no session knew what the others were doing)
  • Cost management (Opus × 10 parallel sessions adds up fast)
  • Self-correcting when stuck in a bad plan without human intervention

The unlock that moved things forward: CLAUDE.md in Git meant the team's AI behavior was versionable, reviewable, and improvable. When Claude made a systematic mistake, you updated CLAUDE.md once and fixed it for everyone, forever.


Era 6 — Mid 2025: Agent Teams, Subagents, Autonomous Loops

Core workflow: An orchestrator Claude session spawns specialist subagent sessions. The orchestrator delegates, the subagents execute, verification agents check the output. The human sets the goal and reviews the result.

What developers actually did:

  • Defined orchestrator + specialist architectures for complex tasks
  • Ran overnight autonomous loops on well-specified tasks
  • Used agent-stop hooks to trigger deterministic verification after every task
  • Integrated Claude Code into GitHub Actions — PRs automatically got Claude review

Main frustration: Autonomous loops required near-perfect specs. Underspecified tasks + overnight runs = waking up to confident, wrong work. The quality of your prompts became the bottleneck, not Claude's capability.

What Claude still sucked at:

  • Knowing when it's out of its depth and should stop
  • Managing state across many subagent sessions cleanly
  • Cost predictability in open-ended autonomous tasks

The unlock that moved things forward: Verification loops. Giving Claude a way to test its own work — run the tests, open the browser, query the database — 2-3x'd output quality without additional human review.


Era 7 — Now: Claude as an Engineering Orchestration Layer

Core workflow: You describe outcomes, not steps. Claude plans the work, distributes it across subagents, verifies the results, and surfaces decisions that genuinely require human judgment.

What the best teams are doing today:

  • CLAUDE.md is a living document updated after every significant session
  • Slash commands encode team-specific workflows as reusable primitives
  • Claude Code GitHub Action runs on every PR — human reviewers focus on judgment, not mechanics
  • Custom MCP servers connect Claude to every internal tool
  • New engineers are onboarded to the AI-native workflow on day one

The remaining hard problems:

  • Who is responsible when Claude ships a bug?
  • How do you version AI behavior as models and prompts evolve?
  • How do you prevent skill atrophy in junior developers who never learn the hard way?
  • What work should humans always do themselves?

The Through-Line

Each era solved a different bottleneck:

| Era | Bottleneck Solved | New Bottleneck Created |
|-----|-------------------|------------------------|
| 2023 | Speed of generation | Context gap (copy-paste) |
| Early 2024 | Context gap | No memory, no action |
| Mid 2024 | Memory + action (CLAUDE.md + CLI) | Spec quality |
| Late 2024 | External system integration | Context window management |
| Early 2025 | Team knowledge sharing | Parallel session management |
| Mid 2025 | Verification + autonomy | Spec quality at scale |
| Now | Orchestration | Human judgment + accountability |

The pattern: every time one bottleneck is solved, the constraint moves up the stack — closer to human judgment and further from mechanical execution.

What this means for you: If you're still working like it's 2023 (copy-pasting from a chat window), you're not behind because Claude got smarter. You're behind because the workflow changed. The rest of this guide is about catching up — and then getting ahead.



← Back to Index | Next: Lesson 01 — Core Mental Model →
