Context Management

How the project manages AI agent context: the progressive disclosure model, skills as context capsules, lean context strategies, and performance implications for limited context windows.

AI agents have finite context windows. Every token of context you load is a token the agent cannot use for reasoning or code generation. Load too little and the agent guesses. Load too much and the agent loses focus, ignoring critical rules buried in irrelevant documentation.

SaaS4Builders solves this with progressive disclosure: a six-level system (Levels 0–5) that loads the right context at the right time. An agent starting a session gets a 150-line project map. An agent working on billing gets billing-specific rules and architecture. An agent debugging a currency issue gets the full currency invariant specification. At no point does the agent receive everything at once.

This page documents the progressive disclosure model, how skills work as context capsules, strategies for keeping context lean, and how to extend the system for new features.

The Progressive Disclosure Model

Context is organized in six levels (0–5), from lightest to heaviest. Each level is loaded only when needed:

Level 0 ─── Entry Points (~15 lines)
  │          README pointers, AGENTS.md routing
  │
Level 1 ─── Project Map (~150 lines)
  │          CLAUDE.md / AGENTS.md root files
  │
Level 2 ─── Role-Specific Context (~320–540 lines)
  │          backend/CLAUDE.md, frontend/CLAUDE.md
  │
Level 3 ─── Skills (~50–90 lines each)
  │          On-demand domain expertise
  │
Level 4 ─── Architecture Blueprints (~400–1,130 lines each)
  │          TEMPLATES.md, ARCHITECTURE.md
  │
Level 5 ─── Specialist Documentation (~110–990 lines each)
             BILLING.md, CONTRACTS.md, CURRENCY-RULES.md

Level 0 — Entry Points

The lightest context layer. These files exist to route agents to the right starting point:

File	Content	Purpose
`AGENTS.md` (root)	4 routing rules + global commands	Points Codex to the correct local `AGENTS.md`
`docs/AGENTS.md`	Documentation-specific context	Points to specs, workplans, migration notes

An agent reads these first and immediately knows where to look next. No domain knowledge is loaded — just navigation.

Level 1 — Project Map

The project's primary orientation layer. Loaded automatically when a session starts:

File	Lines	What It Contains
`CLAUDE.md` (root)	~150	Project structure, Docker commands, domain entity table, critical conventions, "Do NOT" list, skill index
`AGENTS.md` (root)	~55	Standard workflow, source of truth references, global rules, main commands

These files are intentionally concise. The root CLAUDE.md is ~150 lines (roughly 800 tokens), enough to navigate the entire project while using under 1% of a 128K context window. Deep knowledge lives in skills and documentation files, not here.

Key design decisions at this level:

Commands as a table — Not full documentation, just the make targets and what they do
Domain entities as a table — One-line definitions, not full schema descriptions
"Do NOT" list — Six critical rules that apply everywhere, always visible
Skill index — Lists available skills so the agent knows what it can load on demand

Level 2 — Role-Specific Context

Loaded when the agent enters a specific directory or is assigned a role:

File	Lines	Load Trigger
`backend/CLAUDE.md`	~540	Working in `backend/` or assigned backend tasks
`frontend/CLAUDE.md`	~320	Working in `frontend/` or assigned frontend tasks
`backend/AGENTS.md`	~50	Codex backend agent activation
`frontend/AGENTS.md`	~55	Codex frontend agent activation

These files contain all the conventions for their respective stack — architecture rules, naming conventions, test patterns, PR checklists. They are comprehensive because an agent working in the backend needs all backend rules, not just the ones relevant to the current feature.

The backend CLAUDE.md at ~540 lines is the longest role-specific file. It covers:

Docker environment and commands
Architecture layers with mandatory rules
Naming conventions (files, classes, methods)
Testing conventions (PHPUnit, factories, isolation)
i18n patterns
PR review checklist

Level 3 — Skills

On-demand context capsules loaded when the agent works on a specific domain:

Claude Code skills (14 skills in .claude/skills/):

Skill	Domain	Typical Lines
`/docs/billing`	Billing architecture, Stripe, subscriptions	~70
`/docs/multi-tenancy`	Tenant isolation, scoping, testing	~80
`/api-contracts`	API format, response conventions	~60
`/domain-model`	Entity relationships, glossary	~50
`/new-feature`	Full-stack scaffold checklist	~80
`/new-api-endpoint`	Single endpoint creation	~60
`/new-migration`	Database migration conventions	~50
`/write-tests`	PHPUnit + Vitest patterns	~70
`/debug`	Systematic debugging workflow	~60
`/refactor`	Architecture-aligned refactoring	~60
`/review-code`	PR quality checklist	~50
`/troubleshooting`	Docker, Laravel, Nuxt common issues	~70
`/create-workplan`	Interactive milestone generator	~80
`/implement-wu`	Work unit execution (5-phase)	~90

Codex skills (11 skills in .agents/skills/):

Skill	Domain
`billing-guardrails`	Billing invariants (dedicated guardrail skill)
`backend-implementation`	Backend coding conventions
`frontend-implementation`	Frontend coding conventions
`api-contracts`	API format conventions
`reviewer`	Code review checklist
`repo-orientation`	Project navigation
`workplan-execution`	WU-driven development
`implement-wu`	Work unit implementation
`discover-feature-workplan`	Feature discovery process
`docs-sync`	Documentation drift detection
`review-validation`	Validation and quality gates

Additional frontend-specific skills live in frontend/.agents/skills/ (e.g., nuxt-ui for the Nuxt UI component library).

Skills are the key innovation in the context management system. Each skill is ~50–90 lines — small enough to load multiple skills simultaneously, focused enough to provide actionable guidance.

Level 4 — Architecture Blueprints

Detailed architecture documents loaded when the agent needs to understand patterns in depth:

File	Lines	Content
`docs/architecture/backend/ARCHITECTURE.md`	~400	Layer separation, request flow, DTO rules, provider pattern
`docs/architecture/backend/TEMPLATES.md`	~1,130	Code templates for Actions, Queries, Requests, Resources, Tests
`docs/architecture/frontend/TEMPLATES.md`	~910	Templates for schemas, API modules, composables, components
`docs/architecture/backend/QUERY-BUILDER.md`	~600	Spatie QueryBuilder filter/sort/include conventions

These are reference documents — loaded when an agent is creating something new and needs the full template, not just the rules.

Level 5 — Specialist Documentation

The deepest context layer. Loaded when working on complex domain problems:

File	Lines	Content
`docs/billing/BILLING.md`	~500	Complete billing architecture, philosophy, and invariants
`docs/api/CONTRACTS.md`	~990	Full API contract specification with examples
`docs/billing/CURRENCY-RULES.md`	~270	Five non-negotiable currency rules with code examples
`docs/billing/LIFECYCLE.md`	~270	Subscription state machine with transitions
`docs/billing/WEBHOOKS.md`	~110	Webhook handling patterns and event list

An agent rarely needs all of Level 5 at once. The billing skill (Level 3) provides enough context for most billing work. Level 5 is for debugging, auditing, or implementing complex new billing features.

What Loads Automatically vs On-Demand

Trigger	What Loads	Level
Agent starts a session	Root `CLAUDE.md` or `AGENTS.md`	1
Agent enters `backend/`	`backend/CLAUDE.md` (added to Level 1)	2
Agent enters `frontend/`	`frontend/CLAUDE.md` (added to Level 1)	2
Agent invokes `/docs/billing`	Billing skill (~70 lines)	3
Agent invokes `/new-feature`	New feature skill (~80 lines)	3
Skill references `@docs/billing/BILLING.md`	Agent reads the full document	5
Agent creating a new Query	Reads `TEMPLATES.md` for Query template	4
Agent debugging currency issue	Reads `CURRENCY-RULES.md` for full invariants	5

The pattern: Levels 0–2 are automatic. Levels 3–5 are on-demand, triggered by skill invocation or explicit file reads.
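
The trigger table can be read as a small lookup. The sketch below is illustrative only: `resolve_context`, its parameters, and the sample path `backend/app/Billing` are hypothetical names, not part of the project or of either agent's tooling. The file names are taken from the tables above.

```python
from pathlib import PurePosixPath

# Level 1 and Level 2 files from the tables above. The function name,
# parameters, and data layout are hypothetical, for illustration only.
ROOT_FILES = {"claude": "CLAUDE.md", "codex": "AGENTS.md"}
ROLE_DIRS = ("backend", "frontend")


def resolve_context(agent, working_path, skills=()):
    """Return (level, item) pairs in the order they would be loaded."""
    root_file = ROOT_FILES[agent]
    loaded = [(1, root_file)]  # automatic at session start

    # Level 2: automatic when the agent works inside a role directory.
    parts = PurePosixPath(working_path).parts
    if parts and parts[0] in ROLE_DIRS:
        loaded.append((2, f"{parts[0]}/{root_file}"))

    # Level 3: on demand, only for skills that were explicitly invoked.
    loaded.extend((3, skill) for skill in skills)
    return loaded


plan = resolve_context("claude", "backend/app/Billing", skills=("/docs/billing",))
for level, item in plan:
    print(f"Level {level}: {item}")
```

Levels 4 and 5 are left out on purpose: they are explicit file reads, so nothing in this lookup would trigger them.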

The Skills Approach vs "Dump All Docs"

There are two ways to give an AI agent project context:

Approach A — Dump Everything

Load all documentation into the context window at the start of every session:

Root CLAUDE.md              150 lines
Backend CLAUDE.md           540 lines
Frontend CLAUDE.md          320 lines
BILLING.md                  500 lines
CONTRACTS.md                990 lines
TEMPLATES.md (backend)    1,130 lines
TEMPLATES.md (frontend)     910 lines
QUERY-BUILDER.md            600 lines
CURRENCY-RULES.md           270 lines
LIFECYCLE.md                270 lines
─────────────────────────────────────
Total                     5,680 lines (~30,000 tokens)

This consumes roughly 30,000 tokens before the agent writes a single line of code. For agents with 128K context windows, that is 23% of the window used for static context. For agents with smaller windows, it can be crippling.

Worse, the agent has no way to distinguish critical rules from background context. The billing invariants are equally weighted with the frontend template patterns. Important rules get diluted in irrelevant information.
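
The arithmetic behind these figures can be checked with a short script. This is a sketch under two assumptions that are not stated on this page: a ratio of about 5.3 tokens per line (implied by the estimates above, and dependent on tokenizer and content) and a 128K window counted as 128,000 tokens.

```python
# Line counts for the "dump everything" approach, copied from the listing above.
DOC_LINES = {
    "Root CLAUDE.md": 150,
    "Backend CLAUDE.md": 540,
    "Frontend CLAUDE.md": 320,
    "BILLING.md": 500,
    "CONTRACTS.md": 990,
    "TEMPLATES.md (backend)": 1130,
    "TEMPLATES.md (frontend)": 910,
    "QUERY-BUILDER.md": 600,
    "CURRENCY-RULES.md": 270,
    "LIFECYCLE.md": 270,
}

# Assumption: ~5.3 tokens per line. Real ratios vary by tokenizer and content.
TOKENS_PER_LINE = 5.3
CONTEXT_WINDOW = 128_000  # assumption: "128K" treated as 128,000 tokens


def estimate_tokens(line_count):
    """Rough token estimate for a given number of documentation lines."""
    return round(line_count * TOKENS_PER_LINE)


total_lines = sum(DOC_LINES.values())
total_tokens = estimate_tokens(total_lines)
share = total_tokens / CONTEXT_WINDOW

print(f"{total_lines:,} lines, about {total_tokens:,} tokens, {share:.1%} of the window")
```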

Approach B — Progressive Disclosure via Skills

Load only what the current task needs:

Root CLAUDE.md              150 lines (automatic)
Backend CLAUDE.md           540 lines (automatic if in backend/)
/docs/billing skill          70 lines (loaded on demand)
─────────────────────────────────────
Total for billing work      760 lines (~4,000 tokens)

The same billing task uses 87% less context. The agent's attention is focused on the rules that matter for the current work. If it needs deeper context, the skill's "References" section points to the exact file.
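
The reduction can be reproduced from the line counts in the two listings. A minimal sketch; `context_reduction` is a hypothetical helper name, and the comparison is by lines rather than tokens.

```python
def context_reduction(full_lines, lean_lines):
    """Fraction of context saved by loading lean_lines instead of full_lines."""
    return 1 - lean_lines / full_lines


# Line counts copied from the Approach A and Approach B listings.
dump_everything = [150, 540, 320, 500, 990, 1130, 910, 600, 270, 270]
billing_task = [150, 540, 70]  # root CLAUDE.md, backend CLAUDE.md, billing skill

saved = context_reduction(sum(dump_everything), sum(billing_task))
print(f"{sum(billing_task)} of {sum(dump_everything)} lines loaded, {saved:.0%} saved")
```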

When to Use Each

Situation	Approach
Quick fix, single feature	Skills (Level 3) — fast, focused
New developer onboarding	Start with Level 1, explore skills as needed
Complex cross-cutting change	Level 2 + multiple skills (Level 3)
Auditing billing compliance	Level 5 — load full billing docs
Creating a new domain from scratch	Level 4 (templates) + relevant skills

The general rule: start with the least context possible and add more only when the agent's output shows gaps. If the agent writes code that violates a convention, load the relevant skill. If the skill isn't enough, point the agent to the full documentation file.
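
The general rule amounts to a three-step escalation ladder. The sketch below encodes it as a lookup; the function name, the step labels, and the idea of counting loaded steps are assumptions made for illustration, not something the project ships.

```python
from typing import Optional

# Escalation order taken from the general rule above. Labels are hypothetical.
ESCALATION = (
    "automatic context (Levels 0-2)",
    "domain skill (Level 3)",
    "full documentation file (Levels 4-5)",
)


def next_context_step(steps_loaded: int, output_shows_gaps: bool) -> Optional[str]:
    """Return the next context step to load, or None if nothing should be added."""
    if not output_shows_gaps:
        return None  # output is fine: adding context would only dilute it
    if steps_loaded >= len(ESCALATION):
        return None  # nothing heavier left to load
    return ESCALATION[steps_loaded]


# Automatic context is loaded and the agent violated a convention.
print(next_context_step(1, output_shows_gaps=True))
```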

How Skills Work as Context Capsules

Every skill follows the same anatomy — a consistent structure that packs maximum useful context into minimum lines:

---
name: <domain>
description: "<when to load this skill>"
---

# <Domain>

## Mental Model (5–10 lines)
The 30-second orientation. What this domain is, how it works at the highest level.

## Critical Rules (10–15 lines)
NEVER/ALWAYS guardrails for this domain.

## Code Architecture (5–10 lines)
Where files live, how components connect.

## Common Mistakes (5–8 lines)
Mistakes agents have actually made — proactive prevention.

## References (3–5 lines)
Links to full documentation for deep dives.

Here is how the billing skill implements this pattern:

Mental Model — Five bullet points that orient the agent: V1 is stripe_managed, Stripe is the invoicing authority, internal engine is shadow only, state changes happen via webhooks, billing uses Domain Contracts.

Critical Rules — Eight NEVER/ALWAYS rules covering invoicing authority, currency immutability, webhook triggers, SDK usage, and money representation.

Code Architecture — The call chain: Actions → Domain Contracts → Infrastructure/Stripe, with file paths for each component.

Common Mistakes — Five specific patterns agents have produced that violate the rules: generating authoritative invoices, calling Stripe SDK from Actions, hardcoding tax rates, assuming platform_managed is active, modifying subscription currency.

References — Seven links to full documentation files for deep dives: BILLING.md, ARCHITECTURE.md, LIFECYCLE.md, CURRENCY-RULES.md, WEBHOOKS.md, TAX.md, LIMITATIONS.md.

Total: ~70 lines. An agent that loads this skill has enough context to work on billing safely. If it encounters an edge case, the references point to the exact file that covers it.
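
Because every skill shares the same anatomy, a skill file can be checked mechanically. The following is a sketch of such a check, not a tool the project provides: `check_skill`, the 90-line ceiling, and the sample notifications skill are all assumptions for illustration.

```python
import re

REQUIRED_SECTIONS = [
    "Mental Model", "Critical Rules", "Code Architecture",
    "Common Mistakes", "References",
]
MAX_LINES = 90  # assumption: upper end of the skill sizes listed on this page


def check_skill(text):
    """Return a list of problems found in a SKILL.md body (empty list = OK)."""
    names = []
    for line in text.splitlines():
        if line.startswith("## "):
            # Drop a trailing annotation such as "(5-10 lines)" if present.
            names.append(re.sub(r"\s*\(.*\)\s*$", "", line[3:]).strip())

    problems = []
    missing = [s for s in REQUIRED_SECTIONS if s not in names]
    if missing:
        problems.append("missing sections: " + ", ".join(missing))
    present = [n for n in names if n in REQUIRED_SECTIONS]
    expected = [s for s in REQUIRED_SECTIONS if s in present]
    if present != expected:
        problems.append("sections out of order")
    if len(text.splitlines()) > MAX_LINES:
        problems.append(f"longer than {MAX_LINES} lines")
    return problems


# Hypothetical placeholder skill, used only to exercise the check.
SAMPLE_SKILL = """\
---
name: notifications
description: "Load when working on notification delivery"
---

# Notifications

## Mental Model
Notifications are messages sent to users and tracked with delivery status.

## Critical Rules
- NEVER send notifications without idempotency keys

## Code Architecture
Actions -> Domain Contracts -> Providers

## Common Mistakes
- Sending without an idempotency key

## References
- @docs/notifications/NOTIFICATIONS.md
"""

print(check_skill(SAMPLE_SKILL))
```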

Keeping Context Lean

Six strategies the project uses to minimize context size without losing critical information:

1. Reference, Don't Embed

Skills and configuration files use @docs/path/FILE.md syntax to point to full documentation without embedding it:

## References
- @docs/billing/BILLING.md — Architecture & philosophy (source of truth)
- @docs/billing/CURRENCY-RULES.md — Currency invariants

The agent reads the reference only if it needs the full document. Most of the time, the skill's summary is sufficient.
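
To see which documents a skill can pull in, the references can be extracted with a regular expression. This is a sketch: the pattern is inferred from the examples on this page and is an assumption, not a documented grammar for the `@docs/...` syntax.

```python
import re

# Matches references such as "@docs/billing/BILLING.md".
# Assumption: references always start with "@docs/" and end in ".md".
REF_PATTERN = re.compile(r"@(docs/[\w./-]+\.md)")


def extract_references(skill_text):
    """Return the documentation paths a skill points to, in order of appearance."""
    return REF_PATTERN.findall(skill_text)


references_section = """\
## References
- @docs/billing/BILLING.md (architecture and philosophy, source of truth)
- @docs/billing/CURRENCY-RULES.md (currency invariants)
"""

for path in extract_references(references_section):
    print(path)
```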

2. Tables Over Prose

Compare these two ways to document the same information:

Prose (5 lines):

The billing system supports three pricing types. Flat-rate pricing charges
a fixed amount per billing period. Seat-based pricing charges per team
member. Usage-based pricing charges based on metered consumption. The
pricing type is defined by the PricingType enum which has three values:
flat, seat, and usage.

Table (5 lines):

| Type | Charges | Enum Value |
|------|---------|------------|
| Flat | Fixed per period | `flat` |
| Seat | Per team member | `seat` |
| Usage | Per metered unit | `usage` |

Same information, but the table is scannable in one pass. AI agents tend to follow structured data more reliably than prose paragraphs.

3. One Rule Per Line

Each guardrail rule is a single line with a clear verb:

- NEVER store money as floats
- ALWAYS use Money value object

Not:

- When handling monetary values, it's important to remember that floats
  can introduce rounding errors, so you should always use the Money value
  object which stores amounts as integer cents...

The one-line format is unambiguous. The agent either follows the rule or violates it — there is no interpretation required.
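
The one-rule-per-line convention is simple enough to lint. A minimal sketch, assuming rules are markdown bullets that begin with NEVER or ALWAYS; `lint_rules` is a hypothetical helper, not an existing project script.

```python
RULE_VERBS = ("NEVER", "ALWAYS")


def lint_rules(markdown):
    """Return (line_number, problem) pairs for rules that break the convention."""
    problems = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are fine
        if not line.startswith("- "):
            problems.append((number, "continuation line: rule spans more than one line"))
        elif not line[2:].startswith(RULE_VERBS):
            problems.append((number, "rule does not start with NEVER or ALWAYS"))
    return problems


good = "- NEVER store money as floats\n- ALWAYS use Money value object"
bad = "- When handling monetary values, remember that floats\n  can introduce rounding errors"

print(lint_rules(good))
print(lint_rules(bad))
```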

4. Common Mistakes as Proactive Context

Instead of loading documentation about everything that can go wrong, each skill includes a "Common Mistakes" section listing the most likely errors (five in the billing skill). An excerpt:

## Common Mistakes

- Generating internal invoices as if they were authoritative (V1 = shadow only)
- Calling Stripe classes directly from Actions instead of through PaymentGateway contract
- Hardcoding tax rates instead of using TaxProviderInterface

Three lines like these can prevent more bugs than a hundred lines of explanatory prose, because they describe the exact error patterns agents produce.

5. Diagrams Over Flow Descriptions

Architecture flows use compact notation:

Request → FormRequest::toDto() → Action → Domain Contract → Provider → Response

This one line replaces a multi-paragraph description of the same flow. ASCII diagrams in skills and documentation files follow this principle — maximum information density with minimum tokens.

6. Structured TOC in Root Files

The root CLAUDE.md uses tables and short lists instead of paragraphs:

| Entity | Description |
|--------|-------------|
| Tenant | Customer org — all resources scoped by tenant_id (UUID) |
| User   | Individual account, single tenant (int ID, nullable tenant_id) |

An agent scanning this table can locate any entity in seconds. The equivalent prose description would be three times longer and harder to parse.

Claude Code vs Codex Context Strategies

Both Claude Code and Codex can work on this project, but they load context differently:

Aspect	Claude Code	Codex
Entry point	`CLAUDE.md` (auto-loaded)	`AGENTS.md` (manual read)
Config directory	`.claude/`	`.codex/agents/` + `.agents/skills/`
Role-specific context	`backend/CLAUDE.md`, `frontend/CLAUDE.md`	`backend/AGENTS.md`, `frontend/AGENTS.md`
Skills	14 skills via `/skill-name`	11 skills via `@.agents/skills/` references
Skill invocation	Slash command (`/docs/billing`)	File reference in agent config
MCP servers	Supported (Laravel Boost, Context7)	Not supported — use CLI tools
Hooks	Auto-format on Write/Edit (Pint, ESLint)	Git hooks (`pre-commit`)
Agent profiles	Built into skill system	TOML files (`.codex/agents/*.toml`)

Key Difference: Skill Loading

Claude Code loads skills on demand when you type /docs/billing. The agent receives the skill content inline with the current conversation — the context is additive.

Codex references skills in agent configuration. The backend-dev.toml agent profile embeds skill references in the developer instructions:

.codex/agents/backend-dev.toml (excerpt)

name = "backend-dev"
description = "Backend-only Laravel agent for API endpoints, Actions, Queries..."
model_reasoning_effort = "medium"
developer_instructions = """
Before editing:
- Read backend/AGENTS.md.
- Read docs/api/CONTRACTS.md and docs/api/ENDPOINTS.md for endpoint or payload work.
- Read docs/architecture/backend/ARCHITECTURE.md.
- Read docs/billing/BILLING.md for any billing-related change.
- Load the project skills backend-implementation, billing-guardrails,
  and api-contracts when relevant.

Execution rules:
- Keep controllers thin.
- Follow Request -> DTO -> Action or Query -> Resource.
- Do not invent undocumented endpoints or response shapes.
- Do not bypass tenancy, billing, or webhook guardrails.
"""

The context loaded is the same — but the mechanism differs. Claude Code is interactive (load skills as needed during the session). Codex is declarative (the agent reads skills listed in its instructions when it starts).

Bridging Both Agents

The project maintains parallel context systems because Claude Code and Codex have different strengths:

Claude Code excels at interactive development — exploring code, making changes, running tests, iterating
Codex excels at planned execution — following workplans, implementing features from specs, generating code in a sandbox

The convention documents (Levels 4–5) are shared by both. Only the entry points and role-specific files (Levels 0–2) and the skill packaging (Level 3) differ.

Extending Context for New Features

When you add a new domain to the project, extend the context system at the appropriate levels:

1. Add Terms to the Domain Glossary

Add canonical terminology to ai-context/DOMAIN-GLOSSARY.md:

**Notification**
A message sent to a user via push, email, or SMS — tracked with delivery status.

**Channel**
The delivery mechanism for a notification (push, email, SMS).

This ensures agents use the correct terms from the first prompt.

2. Create a Domain Skill

Create a skill in both .claude/skills/ and .agents/skills/:

.claude/skills/docs/notifications/SKILL.md    # Claude Code
.agents/skills/docs/notifications/SKILL.md    # Codex

Follow the standard skill anatomy: Mental Model → Critical Rules → Code Architecture → Common Mistakes → References.
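
A skeleton for the new skill can be generated rather than typed by hand. The sketch below only builds strings and writes nothing to disk. `skill_paths` and `skill_skeleton` are hypothetical helpers, and the `TODO` placeholders are an assumption about how you might stub the sections.

```python
SECTIONS = (
    "Mental Model", "Critical Rules", "Code Architecture",
    "Common Mistakes", "References",
)
SKILL_ROOTS = (".claude/skills", ".agents/skills")  # Claude Code, Codex


def skill_paths(domain):
    """Paths where the skill file would live for each agent."""
    return [f"{root}/docs/{domain}/SKILL.md" for root in SKILL_ROOTS]


def skill_skeleton(domain, description):
    """Build an empty SKILL.md body that follows the standard anatomy."""
    title = domain.replace("-", " ").title()
    lines = [
        "---",
        f"name: {domain}",
        f'description: "{description}"',
        "---",
        "",
        f"# {title}",
    ]
    for section in SECTIONS:
        lines += ["", f"## {section}", "TODO"]
    return "\n".join(lines) + "\n"


for path in skill_paths("notifications"):
    print(path)
print(skill_skeleton("notifications", "Load when working on notification delivery"))
```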

3. Update CLAUDE.md If Needed

If the new domain has critical rules that apply globally (like "never send notifications without idempotency keys"), add them to the "Do NOT" section of root CLAUDE.md:

## Do NOT
- ...existing rules...
- Send notifications without idempotency keys

Only add rules here if they are truly global. Domain-specific rules belong in the skill.

4. Create Architecture Documentation

If the domain has complex patterns (like billing), create a documentation file:

docs/notifications/NOTIFICATIONS.md

Reference this from the skill's "References" section. This file becomes the source of truth for the domain's architecture — the skill is the summary, the doc is the full specification.

5. Register the Skill

Add the new skill to the skill index in root CLAUDE.md:

**Available skills:**
`/new-feature` · `/docs/billing` · `/docs/multi-tenancy` · ... · `/docs/notifications`

And reference it in the relevant Codex agent TOML if the agent should pre-load it.

Performance Implications

Context management directly affects AI agent performance. Here are the trade-offs:

Strategy	Token Cost	Agent Quality	When to Use
Level 1 only (root CLAUDE.md)	~800 tokens	Broad orientation, may miss domain rules	Quick questions, navigation
Level 1 + 2 (root + role-specific)	~3,500 tokens	Full stack conventions, no domain depth	General development
Level 1 + 2 + 1 skill	~4,000 tokens	Focused domain expertise	Feature work in one domain
Level 1 + 2 + 3 skills	~5,000 tokens	Multi-domain awareness	Cross-cutting changes
All levels loaded	~30,000 tokens	Comprehensive but diluted	Audits, architecture reviews

Rules of Thumb

Start lean, add as needed — Begin with automatic context (Levels 0–2). Add skills only when the agent's output shows gaps in domain knowledge.
One skill per domain — If you're working on billing, load /docs/billing. Don't also load /api-contracts and /docs/multi-tenancy unless the task touches those domains.
Skills before full docs — Always try the skill first (~70 lines). Only load the full documentation (~500+ lines) if the skill's summary isn't sufficient for the specific task.
Monitor context pressure — If an agent starts forgetting earlier instructions or producing inconsistent output, you have loaded too much context. Remove the least relevant skill or document.
Codex pre-loads, Claude Code adds — Codex agents should pre-load the skills they always need (via TOML config). Claude Code users should add skills interactively as the task evolves.

The progressive disclosure model works because most tasks only need 2–3 levels of context. A billing feature needs Level 1 (project map) + Level 2 (backend rules) + Level 3 (billing skill) = ~4,000 tokens. That is about 3% of a 128K context window, leaving roughly 97% for reasoning and code generation.

What's Next

AI-Assisted Development Overview — The three pillars framework that this context system implements
CLAUDE.md Configuration — How the Level 1–2 configuration files are structured
Skills System — Deep dive into building and using Level 3 skills
AI Guardrails — The rules that skills load as context
Convention Files — The Level 4 architecture documents
