
Context Management

How the project manages AI agent context: the progressive disclosure model, skills as context capsules, lean context strategies, and performance implications for limited context windows.

AI agents have finite context windows. Every token of context you load is a token the agent cannot use for reasoning or code generation. Load too little and the agent guesses. Load too much and the agent loses focus, ignoring critical rules buried in irrelevant documentation.

SaaS4Builders solves this with progressive disclosure: a six-level system (Levels 0–5) that loads the right context at the right time. An agent starting a session gets a 150-line project map. An agent working on billing gets billing-specific rules and architecture. An agent debugging a currency issue gets the full currency invariant specification. At no point does the agent receive everything at once.

This page documents the progressive disclosure model, how skills work as context capsules, strategies for keeping context lean, and how to extend the system for new features.


The Progressive Disclosure Model

Context is organized in six levels (0–5), from lightest to heaviest. Each level is loaded only when needed:

Level 0 ─── Entry Points (~15 lines)
  │          README pointers, AGENTS.md routing
  │
Level 1 ─── Project Map (~150 lines)
  │          CLAUDE.md / AGENTS.md root files
  │
Level 2 ─── Role-Specific Context (~320–540 lines)
  │          backend/CLAUDE.md, frontend/CLAUDE.md
  │
Level 3 ─── Skills (~50–90 lines each)
  │          On-demand domain expertise
  │
Level 4 ─── Architecture Blueprints (~400–1,130 lines each)
  │          TEMPLATES.md, ARCHITECTURE.md
  │
Level 5 ─── Specialist Documentation (~110–990 lines each)
             BILLING.md, CONTRACTS.md, CURRENCY-RULES.md

Level 0 — Entry Points

The lightest context layer. These files exist to route agents to the right starting point:

| File | Content | Purpose |
|------|---------|---------|
| AGENTS.md (root) | 4 routing rules + global commands | Points Codex to the correct local AGENTS.md |
| docs/AGENTS.md | Documentation-specific context | Points to specs, workplans, migration notes |

An agent reads these first and immediately knows where to look next. No domain knowledge is loaded — just navigation.

Level 1 — Project Map

The project's primary orientation layer. Loaded automatically when a session starts:

| File | Lines | What It Contains |
|------|-------|------------------|
| CLAUDE.md (root) | ~150 | Project structure, Docker commands, domain entity table, critical conventions, "Do NOT" list, skill index |
| AGENTS.md (root) | ~55 | Standard workflow, source of truth references, global rules, main commands |

These files are intentionally concise. The root CLAUDE.md is ~150 lines (~800 tokens), enough to navigate the entire project while using under 1% of a 128K context window. Deep knowledge lives in skills and documentation files, not here.

Key design decisions at this level:

  • Commands as a table — Not full documentation, just the make targets and what they do
  • Domain entities as a table — One-line definitions, not full schema descriptions
  • "Do NOT" list — Six critical rules that apply everywhere, always visible
  • Skill index — Lists available skills so the agent knows what it can load on demand

Level 2 — Role-Specific Context

Loaded when the agent enters a specific directory or is assigned a role:

| File | Lines | Load Trigger |
|------|-------|--------------|
| backend/CLAUDE.md | ~540 | Working in backend/ or assigned backend tasks |
| frontend/CLAUDE.md | ~320 | Working in frontend/ or assigned frontend tasks |
| backend/AGENTS.md | ~50 | Codex backend agent activation |
| frontend/AGENTS.md | ~55 | Codex frontend agent activation |

These files contain all the conventions for their respective stack — architecture rules, naming conventions, test patterns, PR checklists. They are comprehensive because an agent working in the backend needs all backend rules, not just the ones relevant to the current feature.

The backend CLAUDE.md at ~540 lines is the longest role-specific file. It covers:

  • Docker environment and commands
  • Architecture layers with mandatory rules
  • Naming conventions (files, classes, methods)
  • Testing conventions (PHPUnit, factories, isolation)
  • i18n patterns
  • PR review checklist

Level 3 — Skills

On-demand context capsules loaded when the agent works on a specific domain:

Claude Code skills (14 skills in .claude/skills/):

| Skill | Domain | Typical Lines |
|-------|--------|---------------|
| /docs/billing | Billing architecture, Stripe, subscriptions | ~70 |
| /docs/multi-tenancy | Tenant isolation, scoping, testing | ~80 |
| /api-contracts | API format, response conventions | ~60 |
| /domain-model | Entity relationships, glossary | ~50 |
| /new-feature | Full-stack scaffold checklist | ~80 |
| /new-api-endpoint | Single endpoint creation | ~60 |
| /new-migration | Database migration conventions | ~50 |
| /write-tests | PHPUnit + Vitest patterns | ~70 |
| /debug | Systematic debugging workflow | ~60 |
| /refactor | Architecture-aligned refactoring | ~60 |
| /review-code | PR quality checklist | ~50 |
| /troubleshooting | Docker, Laravel, Nuxt common issues | ~70 |
| /create-workplan | Interactive milestone generator | ~80 |
| /implement-wu | Work unit execution (5-phase) | ~90 |

Codex skills (11 skills in .agents/skills/):

| Skill | Domain |
|-------|--------|
| billing-guardrails | Billing invariants (dedicated guardrail skill) |
| backend-implementation | Backend coding conventions |
| frontend-implementation | Frontend coding conventions |
| api-contracts | API format conventions |
| reviewer | Code review checklist |
| repo-orientation | Project navigation |
| workplan-execution | WU-driven development |
| implement-wu | Work unit implementation |
| discover-feature-workplan | Feature discovery process |
| docs-sync | Documentation drift detection |
| review-validation | Validation and quality gates |

Additional frontend-specific skills live in frontend/.agents/skills/ (e.g., nuxt-ui for the Nuxt UI component library).

Skills are the key innovation in the context management system. Each skill is ~50–90 lines — small enough to load multiple skills simultaneously, focused enough to provide actionable guidance.

Level 4 — Architecture Blueprints

Detailed architecture documents loaded when the agent needs to understand patterns in depth:

| File | Lines | Content |
|------|-------|---------|
| docs/architecture/backend/ARCHITECTURE.md | ~400 | Layer separation, request flow, DTO rules, provider pattern |
| docs/architecture/backend/TEMPLATES.md | ~1,130 | Code templates for Actions, Queries, Requests, Resources, Tests |
| docs/architecture/frontend/TEMPLATES.md | ~910 | Templates for schemas, API modules, composables, components |
| docs/architecture/backend/QUERY-BUILDER.md | ~600 | Spatie QueryBuilder filter/sort/include conventions |

These are reference documents — loaded when an agent is creating something new and needs the full template, not just the rules.

Level 5 — Specialist Documentation

The deepest context layer. Loaded when working on complex domain problems:

| File | Lines | Content |
|------|-------|---------|
| docs/billing/BILLING.md | ~500 | Complete billing architecture, philosophy, and invariants |
| docs/api/CONTRACTS.md | ~990 | Full API contract specification with examples |
| docs/billing/CURRENCY-RULES.md | ~270 | Five non-negotiable currency rules with code examples |
| docs/billing/LIFECYCLE.md | ~270 | Subscription state machine with transitions |
| docs/billing/WEBHOOKS.md | ~110 | Webhook handling patterns and event list |

An agent rarely needs all of Level 5 at once. The billing skill (Level 3) provides enough context for most billing work. Level 5 is for debugging, auditing, or implementing complex new billing features.


What Loads Automatically vs On-Demand

| Trigger | What Loads | Level |
|---------|------------|-------|
| Agent starts a session | Root CLAUDE.md or AGENTS.md | 1 |
| Agent enters backend/ | backend/CLAUDE.md (added to Level 1) | 2 |
| Agent enters frontend/ | frontend/CLAUDE.md (added to Level 1) | 2 |
| Agent invokes /docs/billing | Billing skill (~70 lines) | 3 |
| Agent invokes /new-feature | New feature skill (~80 lines) | 3 |
| Skill references @docs/billing/BILLING.md | Agent reads the full document | 5 |
| Agent creating a new Query | Reads TEMPLATES.md for Query template | 4 |
| Agent debugging currency issue | Reads CURRENCY-RULES.md for full invariants | 5 |

The pattern: Levels 0–2 are automatic. Levels 3–5 are on-demand, triggered by skill invocation or explicit file reads.
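
As a rough illustration of this pattern, the sketch below models the triggers from the table as a lookup. The function `context_for` and the two dictionaries are hypothetical names invented for this example, not project code; only the file paths, skill names, and level numbers are taken from the tables on this page.

```python
# Illustrative model of the loading rules; all names here are hypothetical.
ROLE_FILES = {  # Level 2: loaded automatically by working directory
    "backend": "backend/CLAUDE.md",
    "frontend": "frontend/CLAUDE.md",
}
DEEP_DOC_LEVELS = {  # Levels 4-5: read only on explicit request
    "docs/architecture/backend/TEMPLATES.md": 4,
    "docs/billing/BILLING.md": 5,
    "docs/billing/CURRENCY-RULES.md": 5,
}


def context_for(working_dir=None, skills=(), explicit_reads=()):
    """Return (source, level) pairs in the order they would be loaded."""
    loaded = [("CLAUDE.md", 1)]  # automatic at session start
    if working_dir in ROLE_FILES:
        loaded.append((ROLE_FILES[working_dir], 2))  # automatic on entry
    loaded.extend((skill, 3) for skill in skills)  # on demand
    loaded.extend((path, DEEP_DOC_LEVELS[path]) for path in explicit_reads)
    return loaded


if __name__ == "__main__":
    for source, level in context_for(
        "backend",
        skills=["/docs/billing"],
        explicit_reads=["docs/billing/CURRENCY-RULES.md"],
    ):
        print(f"Level {level}: {source}")
```

Nothing from Levels 3–5 appears in the result unless the caller asks for it, which is the property the table describes.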


The Skills Approach vs "Dump All Docs"

There are two ways to give an AI agent project context:

Approach A — Dump Everything

Load all documentation into the context window at the start of every session:

Root CLAUDE.md           150 lines
Backend CLAUDE.md        540 lines
Frontend CLAUDE.md       320 lines
BILLING.md               500 lines
CONTRACTS.md             990 lines
TEMPLATES.md (backend)  1,130 lines
TEMPLATES.md (frontend)   910 lines
QUERY-BUILDER.md          600 lines
CURRENCY-RULES.md         270 lines
LIFECYCLE.md              270 lines
─────────────────────────────────
Total                   5,680 lines (~30,000 tokens)

This consumes roughly 30,000 tokens before the agent writes a single line of code. For agents with 128K context windows, that is 23% of the window used for static context. For agents with smaller windows, it can be crippling.

Worse, the agent has no way to distinguish critical rules from background context. The billing invariants are equally weighted with the frontend template patterns. Important rules get diluted in irrelevant information.

Approach B — Progressive Disclosure via Skills

Load only what the current task needs:

Root CLAUDE.md              150 lines (automatic)
Backend CLAUDE.md           540 lines (automatic if in backend/)
/docs/billing skill               70 lines (loaded on demand)
─────────────────────────────────────
Total for billing work      760 lines (~4,000 tokens)

The same billing task uses 87% less context. The agent's attention is focused on the rules that matter for the current work. If it needs deeper context, the skill's "References" section points to the exact file.
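
The comparison can be reproduced with a few lines of arithmetic. This sketch only sums the line counts listed in the two approaches above; the dictionary and function names are made up for the example.

```python
# Line counts copied from the two listings above.
DUMP_EVERYTHING = {
    "Root CLAUDE.md": 150,
    "Backend CLAUDE.md": 540,
    "Frontend CLAUDE.md": 320,
    "BILLING.md": 500,
    "CONTRACTS.md": 990,
    "TEMPLATES.md (backend)": 1130,
    "TEMPLATES.md (frontend)": 910,
    "QUERY-BUILDER.md": 600,
    "CURRENCY-RULES.md": 270,
    "LIFECYCLE.md": 270,
}
BILLING_TASK = {
    "Root CLAUDE.md": 150,
    "Backend CLAUDE.md": 540,
    "/docs/billing skill": 70,
}


def total_lines(files):
    """Sum the line counts of every file in the mapping."""
    return sum(files.values())


def reduction(lean, full):
    """Fraction of context saved by loading `lean` instead of `full`."""
    return 1 - total_lines(lean) / total_lines(full)


print(total_lines(DUMP_EVERYTHING))  # 5680
print(total_lines(BILLING_TASK))  # 760
print(f"{reduction(BILLING_TASK, DUMP_EVERYTHING):.0%}")  # 87%
```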

When to Use Each

| Situation | Approach |
|-----------|----------|
| Quick fix, single feature | Skills (Level 3): fast, focused |
| New developer onboarding | Start with Level 1, explore skills as needed |
| Complex cross-cutting change | Level 2 + multiple skills (Level 3) |
| Auditing billing compliance | Level 5: load full billing docs |
| Creating a new domain from scratch | Level 4 (templates) + relevant skills |

The general rule: start with the least context possible and add more only when the agent's output shows gaps. If the agent writes code that violates a convention, load the relevant skill. If the skill isn't enough, point the agent to the full documentation file.

How Skills Work as Context Capsules

Every skill follows the same anatomy — a consistent structure that packs maximum useful context into minimum lines:

---
name: <domain>
description: "<when to load this skill>"
---

# <Domain>

## Mental Model (5–10 lines)
The 30-second orientation. What this domain is, how it works at the highest level.

## Critical Rules (10–15 lines)
NEVER/ALWAYS guardrails for this domain.

## Code Architecture (5–10 lines)
Where files live, how components connect.

## Common Mistakes (5–8 lines)
Mistakes agents have actually made — proactive prevention.

## References (3–5 lines)
Links to full documentation for deep dives.

Here is how the billing skill implements this pattern:

Mental Model — Five bullet points that orient the agent: V1 is stripe_managed, Stripe is the invoicing authority, internal engine is shadow only, state changes happen via webhooks, billing uses Domain Contracts.

Critical Rules — Eight NEVER/ALWAYS rules covering invoicing authority, currency immutability, webhook triggers, SDK usage, and money representation.

Code Architecture — The call chain: Actions → Domain Contracts → Infrastructure/Stripe, with file paths for each component.

Common Mistakes — Five specific patterns agents have produced that violate the rules: generating authoritative invoices, calling Stripe SDK from Actions, hardcoding tax rates, assuming platform_managed is active, modifying subscription currency.

References — Seven links to full documentation files for deep dives: BILLING.md, ARCHITECTURE.md, LIFECYCLE.md, CURRENCY-RULES.md, WEBHOOKS.md, TAX.md, LIMITATIONS.md.

Total: ~70 lines. An agent that loads this skill has enough context to work on billing safely. If it encounters an edge case, the references point to the exact file that covers it.
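
Because the anatomy is fixed, it can be checked mechanically. The sketch below is a hypothetical linter, not a tool that ships with the project: `check_skill` and the 90-line ceiling are assumptions made for illustration (the ceiling reflects the ~50–90 line range quoted on this page).

```python
REQUIRED_SECTIONS = [
    "Mental Model",
    "Critical Rules",
    "Code Architecture",
    "Common Mistakes",
    "References",
]


def check_skill(text, max_lines=90):
    """Return a list of problems found in a SKILL.md body (empty if none)."""
    lines = text.splitlines()
    # Collect level-2 headings, e.g. "## Critical Rules" -> "Critical Rules".
    headings = [line[3:].strip() for line in lines if line.startswith("## ")]
    # Keep only headings that name a required section, in document order.
    found = [s for h in headings for s in REQUIRED_SECTIONS if h.startswith(s)]

    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in found]
    if found != [s for s in REQUIRED_SECTIONS if s in found]:
        problems.append("sections out of order")
    if len(lines) > max_lines:
        problems.append(f"too long: {len(lines)} lines (limit {max_lines})")
    return problems
```

A check like this could run in CI so that a new skill cannot silently drop its "Common Mistakes" section or grow past the size that makes it cheap to load.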


Keeping Context Lean

Six strategies the project uses to minimize context size without losing critical information:

1. Reference, Don't Embed

Skills and configuration files use @docs/path/FILE.md syntax to point to full documentation without embedding it:

## References
- @docs/billing/BILLING.md — Architecture & philosophy (source of truth)
- @docs/billing/CURRENCY-RULES.md — Currency invariants

The agent reads the reference only if it needs the full document. Most of the time, the skill's summary is sufficient.
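
A tool that follows references lazily might first extract them from the skill text. A minimal sketch, assuming references always take the `@docs/...md` form shown above; the regular expression and the `referenced_docs` helper are illustrative, not part of the project.

```python
import re

# Matches "@docs/<path>.md" and captures the path without the leading "@".
REFERENCE = re.compile(r"@(docs/[\w./-]+\.md)\b")


def referenced_docs(skill_text):
    """List the documentation paths a skill points to, in order of appearance."""
    return REFERENCE.findall(skill_text)


skill = """## References
- @docs/billing/BILLING.md: architecture and philosophy (source of truth)
- @docs/billing/CURRENCY-RULES.md: currency invariants
"""
print(referenced_docs(skill))
# ['docs/billing/BILLING.md', 'docs/billing/CURRENCY-RULES.md']
```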

2. Tables Over Prose

Compare these two ways to document the same information:

Prose (5 lines):

The billing system supports three pricing types. Flat-rate pricing charges
a fixed amount per billing period. Seat-based pricing charges per team
member. Usage-based pricing charges based on metered consumption. The
pricing type is defined by the PricingType enum which has three values:
flat, seat, and usage.

Table (5 lines):

| Type | Charges | Enum Value |
|------|---------|------------|
| Flat | Fixed per period | `flat` |
| Seat | Per team member | `seat` |
| Usage | Per metered unit | `usage` |

Same information, but the table is scannable in one pass. AI agents process structured data more reliably than prose paragraphs.

3. One Rule Per Line

Each guardrail rule is a single line with a clear verb:

- NEVER store money as floats
- ALWAYS use Money value object

Not:

- When handling monetary values, it's important to remember that floats
  can introduce rounding errors, so you should always use the Money value
  object which stores amounts as integer cents...

The one-line format is unambiguous. The agent either follows the rule or violates it — there is no interpretation required.
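
The format is simple enough to lint. The following sketch assumes every guardrail is a Markdown bullet that opens with NEVER or ALWAYS; `lint_rules` is a hypothetical helper, and a real skill may well allow other verbs.

```python
def lint_rules(lines):
    """Return (line_number, problem) pairs for rules that break the format."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.startswith("- "):
            # Anything that is not a new bullet is a wrapped continuation.
            problems.append((number, "rule spans more than one line"))
        elif not line[2:].startswith(("NEVER ", "ALWAYS ")):
            problems.append((number, "rule does not start with NEVER or ALWAYS"))
    return problems


good = [
    "- NEVER store money as floats",
    "- ALWAYS use Money value object",
]
bad = [
    "- When handling monetary values, it's important to remember that floats",
    "  can introduce rounding errors, so you should always use the Money value",
    "  object which stores amounts as integer cents...",
]
print(lint_rules(good))  # []
print(len(lint_rules(bad)))  # 3
```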

4. Common Mistakes as Proactive Context

Instead of loading documentation about everything that can go wrong, each skill includes a "Common Mistakes" section with the 5 most likely errors. An excerpt from the billing skill:

## Common Mistakes

- Generating internal invoices as if they were authoritative (V1 = shadow only)
- Calling Stripe classes directly from Actions instead of through PaymentGateway contract
- Hardcoding tax rates instead of using TaxProviderInterface

These three lines can prevent more bugs than 100 lines of explanatory prose, because they describe the exact error patterns agents produce.

5. Diagrams Over Flow Descriptions

Architecture flows use compact notation:

Request → FormRequest::toDto() → Action → Domain Contract → Provider → Response

This one line replaces a multi-paragraph description of the same flow. ASCII diagrams in skills and documentation files follow this principle — maximum information density with minimum tokens.

6. Structured TOC in Root Files

The root CLAUDE.md uses tables and short lists instead of paragraphs:

| Entity | Description |
|--------|-------------|
| Tenant | Customer org — all resources scoped by tenant_id (UUID) |
| User   | Individual account, single tenant (int ID, nullable tenant_id) |

An agent scanning this table can locate any entity in seconds. The equivalent prose description would be three times longer and harder to parse.


Claude Code vs Codex Context Strategies

Both Claude Code and Codex can work on this project, but they load context differently:

| Aspect | Claude Code | Codex |
|--------|-------------|-------|
| Entry point | CLAUDE.md (auto-loaded) | AGENTS.md (manual read) |
| Config directory | .claude/ | .codex/agents/ + .agents/skills/ |
| Role-specific context | backend/CLAUDE.md, frontend/CLAUDE.md | backend/AGENTS.md, frontend/AGENTS.md |
| Skills | 14 skills via /skill-name | 11 skills via @.agents/skills/ references |
| Skill invocation | Slash command (/docs/billing) | File reference in agent config |
| MCP servers | Supported (Laravel Boost, Context7) | Not supported; use CLI tools |
| Hooks | Auto-format on Write/Edit (Pint, ESLint) | Git hooks (pre-commit) |
| Agent profiles | Built into skill system | TOML files (.codex/agents/*.toml) |

Key Difference: Skill Loading

Claude Code loads skills on demand when you type /docs/billing. The agent receives the skill content inline with the current conversation — the context is additive.

Codex references skills in agent configuration. The backend-dev.toml agent profile embeds skill references in the developer instructions:

.codex/agents/backend-dev.toml (excerpt)
name = "backend-dev"
description = "Backend-only Laravel agent for API endpoints, Actions, Queries..."
model_reasoning_effort = "medium"
developer_instructions = """
Before editing:
- Read backend/AGENTS.md.
- Read docs/api/CONTRACTS.md and docs/api/ENDPOINTS.md for endpoint or payload work.
- Read docs/architecture/backend/ARCHITECTURE.md.
- Read docs/billing/BILLING.md for any billing-related change.
- Load the project skills backend-implementation, billing-guardrails,
  and api-contracts when relevant.

Execution rules:
- Keep controllers thin.
- Follow Request -> DTO -> Action or Query -> Resource.
- Do not invent undocumented endpoints or response shapes.
- Do not bypass tenancy, billing, or webhook guardrails.
"""

The context loaded is the same — but the mechanism differs. Claude Code is interactive (load skills as needed during the session). Codex is declarative (the agent reads skills listed in its instructions when it starts).
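
To see what such a profile commits the agent to reading, the pre-load list can be pulled out of the instructions text. This is a sketch under the assumption that documents are named on lines starting with `- Read`, as in the excerpt; `preloaded_docs` is an illustrative helper and not a Codex feature.

```python
import re

# Any token that looks like a Markdown file path, e.g. docs/api/CONTRACTS.md.
DOC_PATH = re.compile(r"[\w./-]+\.md\b")


def preloaded_docs(developer_instructions):
    """Collect every .md path mentioned on a '- Read ...' line."""
    docs = []
    for line in developer_instructions.splitlines():
        if line.strip().startswith("- Read "):
            docs.extend(DOC_PATH.findall(line))
    return docs


instructions = """
Before editing:
- Read backend/AGENTS.md.
- Read docs/api/CONTRACTS.md and docs/api/ENDPOINTS.md for endpoint or payload work.
- Read docs/architecture/backend/ARCHITECTURE.md.
- Read docs/billing/BILLING.md for any billing-related change.
"""
for path in preloaded_docs(instructions):
    print(path)
```

Summing the line counts of the listed files gives a quick estimate of how much context the profile loads before the agent starts work.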

Bridging Both Agents

The project maintains parallel context systems because Claude Code and Codex have different strengths:

  • Claude Code excels at interactive development — exploring code, making changes, running tests, iterating
  • Codex excels at planned execution — following workplans, implementing features from specs, generating code in a sandbox

The convention documents (Levels 4–5) are shared by both. The entry points and role files (Levels 0–2) and the skill packaging (Level 3) differ.


Extending Context for New Features

When you add a new domain to the project, extend the context system at the appropriate levels:

1. Add Terms to the Domain Glossary

Add canonical terminology to ai-context/DOMAIN-GLOSSARY.md:

**Notification**
A message sent to a user via push, email, or SMS — tracked with delivery status.

**Channel**
The delivery mechanism for a notification (push, email, SMS).

This ensures agents use the correct terms from the first prompt.

2. Create a Domain Skill

Create a skill in both .claude/skills/ and .agents/skills/:

.claude/skills/docs/notifications/SKILL.md    # Claude Code
.agents/skills/docs/notifications/SKILL.md    # Codex

Follow the standard skill anatomy: Mental Model → Critical Rules → Code Architecture → Common Mistakes → References.
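
Since the same skill must exist in two places, a small check can catch a missing counterpart. A sketch only: `missing_skill_files` is a hypothetical helper, and it assumes both agents use the `<skills dir>/<name>/SKILL.md` layout shown above.

```python
from pathlib import Path

SKILL_ROOTS = (".claude/skills", ".agents/skills")  # Claude Code, Codex


def missing_skill_files(repo_root, skill_name):
    """Return the SKILL.md paths (relative to the repo) that do not exist yet."""
    repo = Path(repo_root)
    expected = [f"{root}/{skill_name}/SKILL.md" for root in SKILL_ROOTS]
    return [path for path in expected if not (repo / path).is_file()]


if __name__ == "__main__":
    # Run from the repository root; an empty list means both copies exist.
    print(missing_skill_files(".", "docs/notifications"))
```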

3. Update CLAUDE.md If Needed

If the new domain has critical rules that apply globally (like "never send notifications without idempotency keys"), add them to the "Do NOT" section of root CLAUDE.md:

## Do NOT
- ...existing rules...
- Send notifications without idempotency keys

Only add rules here if they are truly global. Domain-specific rules belong in the skill.

4. Create Architecture Documentation

If the domain has complex patterns (like billing), create a documentation file:

docs/notifications/NOTIFICATIONS.md

Reference this from the skill's "References" section. This file becomes the source of truth for the domain's architecture — the skill is the summary, the doc is the full specification.

5. Register the Skill

Add the new skill to the skill index in root CLAUDE.md:

**Available skills:**
`/new-feature` · `/docs/billing` · `/docs/multi-tenancy` · ... · `/docs/notifications`

And reference it in the relevant Codex agent TOML if the agent should pre-load it.


Performance Implications

Context management directly affects AI agent performance. Here are the trade-offs:

| Strategy | Token Cost | Agent Quality | When to Use |
|----------|------------|---------------|-------------|
| Level 1 only (root CLAUDE.md) | ~800 tokens | Broad orientation, may miss domain rules | Quick questions, navigation |
| Level 1 + 2 (root + role-specific) | ~3,500 tokens | Full stack conventions, no domain depth | General development |
| Level 1 + 2 + 1 skill | ~4,000 tokens | Focused domain expertise | Feature work in one domain |
| Level 1 + 2 + 3 skills | ~5,000 tokens | Multi-domain awareness | Cross-cutting changes |
| All levels loaded | ~30,000 tokens | Comprehensive but diluted | Audits, architecture reviews |

Rules of Thumb

  1. Start lean, add as needed — Begin with automatic context (Levels 0–2). Add skills only when the agent's output shows gaps in domain knowledge.
  2. One skill per domain — If you're working on billing, load /docs/billing. Don't also load /api-contracts and /docs/multi-tenancy unless the task touches those domains.
  3. Skills before full docs — Always try the skill first (~70 lines). Only load the full documentation (~500+ lines) if the skill's summary isn't sufficient for the specific task.
  4. Monitor context pressure — If an agent starts forgetting earlier instructions or producing inconsistent output, you have loaded too much context. Remove the least relevant skill or document.
  5. Codex pre-loads, Claude Code adds — Codex agents should pre-load the skills they always need (via TOML config). Claude Code users should add skills interactively as the task evolves.

The progressive disclosure model works because most tasks only need 2–3 levels of context. A billing feature needs Level 1 (project map) + Level 2 (backend rules) + Level 3 (billing skill) = ~4,000 tokens. That leaves roughly 97% of a 128K context window for reasoning and code generation.
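
The share of the window each strategy consumes follows directly from the trade-off table. The sketch below assumes the 128K window used in the examples on this page and the approximate token costs listed in that table; the names are illustrative.

```python
WINDOW_TOKENS = 128_000  # assumed window size, as in the examples on this page

# Approximate token costs copied from the trade-off table.
STRATEGY_TOKENS = {
    "Level 1 only": 800,
    "Level 1 + 2": 3_500,
    "Level 1 + 2 + 1 skill": 4_000,
    "Level 1 + 2 + 3 skills": 5_000,
    "All levels loaded": 30_000,
}


def window_share(tokens, window=WINDOW_TOKENS):
    """Fraction of the context window taken up by static context."""
    return tokens / window


for name, tokens in STRATEGY_TOKENS.items():
    used = window_share(tokens)
    print(f"{name:<24} {used:6.1%} used, {1 - used:6.1%} free")
```

Passing a smaller `window` shows how quickly the "all levels" strategy becomes impractical on models with less context.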

What's Next