Chapter 10. The skills layer: Dynamic capability loading

The architectural pattern this chapter develops is dynamic capability loading: extending a running agent with new procedural capability (instructions, the tools that capability needs, and the data scopes it requires) without redeploying the system. The unit it loads is a runtime capability payload, a self-contained bundle of three things: procedural instructions, a declaration of the capabilities (tools) it requires, and the data scopes it needs. The payload is injected into the agent’s context when a task calls for it. Whether that payload arrives as a SKILL.md file on disk, a row in an enterprise database, or a response from a Model Context Protocol (MCP) server, the architectural implications are identical. The pattern and the payload are the durable concepts; any particular file format is an implementation of them.

The concrete standard the ecosystem converged on by late 2025 is the Agent Skill: a folder with a SKILL.md manifest plus optional resources (scripts, references, templates), loaded on demand when a task matches the skill’s description. Developed by Anthropic and released as an open standard at agentskills.io, it had been adopted across roughly 40 agent products by mid-2026, including Claude Code, OpenAI Codex, Gemini CLI, Cursor, GitHub Copilot, Goose, OpenHands, and Letta. That convergence is real but point-in-time; this chapter treats the Agent Skill as the current dominant implementation of dynamic capability loading, not as the pattern itself. It uses SKILL.md as its working example because a concrete format aids the discussion, but every architectural claim is about the pattern (discovery, on-demand activation, scoped execution) and survives the format. If the manifest standard is superseded, the runtime capability payload remains, and so does this chapter.

This chapter places the Skills layer in the architecture. It defines what a skill is, what it is not, and what changes (and what does not) in a system that adopts the standard. The book’s position is that Skills are a memory pattern realized at runtime, and that they are architecturally significant precisely because they make the design-time / runtime boundary explicit. They are not a replacement for the architectural patterns of the previous chapters; they are an additional layer that the architecture must accommodate.

What a skill is

A skill is a packaged, self-contained capability that an agent loads when needed. Concretely, in the agentskills.io standard:

my-skill/
├── SKILL.md          # Required: name, description, instructions
├── scripts/          # Optional: executable code the skill may invoke
├── references/       # Optional: reference material to read on demand
└── assets/           # Optional: templates, examples, fixtures

The manifest declares the skill’s name and a short description; the body of SKILL.md contains instructions the agent reads when the skill is activated. Supporting directories carry scripts and references that the skill can use during execution.

The defining mechanism is progressive disclosure:

  1. Discovery. At session start, the agent has access only to each skill’s name and description: a few lines of metadata per skill. The context cost is small and bounded.

  2. Activation. When a task matches a skill’s description, the agent reads the full SKILL.md into context.

  3. Execution. The agent follows the instructions, optionally invoking the skill’s bundled scripts and reading its references on demand.

The architectural payoff is that the total context an agent can have available is much larger than what fits in a single context window, because most of it is loaded only when needed.
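The three stages above can be sketched in a few lines. This is a minimal illustration, not the agentskills.io reference loader: `SkillCatalog`, `split_manifest`, and the in-memory manifest store are hypothetical names introduced here, and the front-matter reader handles only single-line `key: value` pairs rather than full YAML.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillMetadata:
    """What the agent sees at discovery: name and description only."""
    name: str
    description: str


def split_manifest(text: str) -> tuple[SkillMetadata, str]:
    """Split SKILL.md text into front-matter metadata and the instruction body.

    Deliberately simplified: reads only top-level, single-line `key: value`
    pairs (continuation lines are ignored). It is not a YAML parser.
    """
    _, front, body = text.split("---", 2)
    fields = {}
    for line in front.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep and not line.startswith((" ", "-")):
            fields[key.strip()] = value.strip()
    return SkillMetadata(fields["name"], fields["description"]), body.strip()


class SkillCatalog:
    """Hypothetical in-memory catalog: skill name -> full SKILL.md text."""

    def __init__(self, manifests: dict[str, str]):
        self._manifests = dict(manifests)

    def discovery_context(self) -> str:
        """Discovery: the only skill text placed in context at session start."""
        metas = (split_manifest(t)[0] for t in self._manifests.values())
        return "\n".join(f"{m.name}: {m.description}" for m in metas)

    def activate(self, name: str) -> str:
        """Activation: return the full instruction body of one matched skill."""
        return split_manifest(self._manifests[name])[1]
```

The point of the split is that `discovery_context()` grows by one line per installed skill, while the cost of `activate()` is paid only for the skill a task actually matches.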

A minimal but realistic SKILL.md makes the format concrete:

---
name: project-conventions
description: Code conventions, style guide, and structural rules for this codebase.
  Load whenever the agent makes any code change in this project.
version: 2.1.0
requires_tools:
  - read_file
  - search_repo
---

# Project conventions

## Language and tooling
- TypeScript with strict mode. No `any` types in new code.
- Code style is enforced by the project's linter. If lint fails, fix the issues;
  do not disable the rule.

## Module boundaries
- `src/core/` does not import from `src/api/`, `src/db/`, or `src/ui/`.
- Cross-layer imports flag the change for review.

## Forbidden patterns
- Direct database calls outside `src/db/`. Use the repository layer.
- `console.log` in production paths. Use the structured logger.

Three architectural facts about this format:

The scripts/ directory raises a question that must be answered explicitly: can the agent run that code? The answer is no, not directly. An agent cannot execute a script it finds in a skill directory simply because the skill is loaded; allowing that would let any loaded document bypass the bounding and governance layers wholesale. If a skill ships executable code, that code is admitted the way any capability is: it is registered with the bounding layer (Chapter 5) as a tool, with a schema, an authorization scope, and a place in the action surface, and every invocation passes through the standard schema and policy checks (Chapter 6) before it runs. A skill’s script is a candidate tool subject to admission, never an escape hatch around the action surface.
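A minimal sketch of that admission path, under stated assumptions: `ToolSpec` and `ActionSurface` are hypothetical names, the "schema" is reduced to a set of required argument names, and the policy check is reduced to a single scope lookup. A real bounding layer would do considerably more; the sketch only shows that a bundled script is unreachable until it is registered, and that every call is checked.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolSpec:
    name: str
    required_args: frozenset[str]  # stand-in for a real argument schema
    scope: str                     # authorization scope needed to call the tool
    run: Callable[..., object]     # the script's entry point


class ActionSurface:
    """Hypothetical bounding layer: the only path by which code gets executed."""

    def __init__(self, granted_scopes: set[str]):
        self._granted = set(granted_scopes)
        self._tools: dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        """Admit a tool. A skill's bundled script enters here like any other."""
        self._tools[spec.name] = spec

    def invoke(self, name: str, **args):
        """Run a tool only after registration, scope, and schema checks pass."""
        spec = self._tools.get(name)
        if spec is None:
            raise PermissionError(f"{name!r} is not a registered tool")
        if spec.scope not in self._granted:
            raise PermissionError(f"scope {spec.scope!r} is not granted")
        if set(args) != spec.required_args:
            raise ValueError(f"arguments do not match the schema for {name!r}")
        return spec.run(**args)
```

Loading a skill never calls `register`; only the deployment does. That is the sense in which a script is a candidate tool rather than an escape hatch.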

Chapter 17 shows additional skill examples in the context of a complete worked system (Concord).

What a skill is not

The taxonomy matters because skills are easily confused with other constructs.

Skills are not patterns. A pattern is a design-time architectural decision: how to structure the loop, where governance sits, what the tool surface is. A skill is a runtime artifact: a packaged capability the agent loads. The book’s pattern language and the Skills standard sit in different planes.

Skills are not tools. A tool is a function the agent can invoke; a skill is a body of procedural knowledge that may use tools. A skill might describe a workflow that invokes several tools in sequence; the tools themselves remain the action surface that the bounding layer governs.

Skills are not MCP servers, but they compose with MCP. The Model Context Protocol (MCP) is the transport: it carries tools, resources, and prompts between an agent and external services. A skill is the payload. The two map together cleanly: an MCP server can be the channel that delivers a skill’s instructions and exposes the tools its manifest declares, and a skill’s manifest can dictate what the agent requests from an MCP server on connection. The distinction is layering, not rivalry: MCP is how the payload travels; the skill is what the payload contains. As of mid-2026, MCP is the dominant transport for exactly this kind of dynamic capability loading, which is a reason to map skills onto it rather than hold them apart.

Skills are not agents. A skill is data the agent loads. The skill does not itself reason; it tells the agent how to reason about a task. The agents in a multi-agent system can each load distinct skills; one agent does not become two by loading two skills.

Skills are not prompts. A prompt is conversation-level instruction for a single interaction. A skill is a portable, version-controlled artifact loaded across many sessions and (where the standard is adopted) across many products.

This taxonomy clarifies the architectural placement. Skills are a memory pattern (Chapter 7): specifically, a form of curated semantic memory delivered with progressive disclosure, combined with a runtime extension mechanism that adds capability without redeployment.

The architectural reframe

Before skills, the agent’s capabilities were determined at deployment: which prompts, which tools, which retrieval indexes, which validators. Extending capability meant changing the deployment. After skills, the deployment still defines the envelope (what tools exist, what governance applies, what bounds hold), but the content of capability can be extended by adding skills.

This produces a useful three-layer model:

Figure 7. The architectural reframe

The architectural commitments developed earlier are not loosened by adopting skills. A skill cannot grant itself access to tools the bounding layer disallows. A skill cannot bypass the governance layer’s validators or policy gates. A skill cannot exempt itself from the cost or iteration budget. Skills are subordinate to the architecture; the architecture is not subordinate to skills.

The architectural value of skills is that they make this layering explicit. The architecture, once stable, does not need to be re-deployed every time the team wants to teach the agent a new procedure. The new procedure is a skill. The bounds, the governance, and the memory architecture remain as designed.

Architectural commitments for skill-aware systems

A system that admits skills must make the following commitments. They are the architectural questions the design must answer before the first skill is loaded.

Skill provenance

Where can skills come from? Three plausible answers:

The architectural commitment is that the source of a skill is part of its identity, recorded in the trace whenever the skill is loaded.

Skill capability declaration

A skill must declare what it requires: tools it expects to be available, data scopes it expects access to, governance levels it expects to operate within, approval semantics it may invoke. The bounding layer (Chapter 5) and the governance layer (Chapter 6) evaluate the declaration before admitting the skill. A skill that requests tools the agent is not authorized to invoke is not admitted; a skill that requests data scopes outside the session’s identity is not admitted; a skill that bypasses required approval gates is not admitted.

This is the architectural answer to “what if the skill instructs the agent to do something dangerous?” The instruction has no effect, because the action surface available to the agent is determined by the architecture, not by the skill.
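The admission rule reduces to set containment: a skill is admitted only if everything it declares is already inside the session’s bounds. The sketch below assumes hypothetical `SkillDeclaration` and `SessionBounds` types and covers only tools and data scopes; governance levels and approval semantics are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillDeclaration:
    name: str
    requires_tools: frozenset[str]
    requires_scopes: frozenset[str]


@dataclass(frozen=True)
class SessionBounds:
    authorized_tools: frozenset[str]  # tools the agent may invoke
    data_scopes: frozenset[str]       # scopes tied to the session's identity


def admit(decl: SkillDeclaration, bounds: SessionBounds) -> list[str]:
    """Return the reasons for refusal; an empty list means the skill is admitted."""
    reasons = []
    extra_tools = decl.requires_tools - bounds.authorized_tools
    if extra_tools:
        reasons.append(f"unauthorized tools: {sorted(extra_tools)}")
    extra_scopes = decl.requires_scopes - bounds.data_scopes
    if extra_scopes:
        reasons.append(f"out-of-scope data: {sorted(extra_scopes)}")
    return reasons
```

Note the direction of the check: the declaration is compared against the bounds and never merged into them, so a skill can narrow what it uses but cannot widen what the agent may do.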

Skill loading observability

Skill loading is an event in the trace (Chapter 12). The trace records which skill was loaded, when, by whose request, with what identity. Activation is itself a governance event subject to policy: a skill loaded outside business hours, or by an account with a recent permission change, may be treated as higher-risk.
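One possible shape for such a trace record, as a sketch only: the field names, the JSON-lines encoding, and the toy risk policy (business hours taken as 09:00 to 17:00 UTC) are assumptions made for illustration, not part of any standard.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SkillLoadEvent:
    skill: str
    version: str
    source: str        # provenance: where the skill came from
    requested_by: str  # what triggered the load
    identity: str      # the session identity the load ran under
    loaded_at: str     # ISO 8601 timestamp, UTC


def record_load(trace: list, *, skill: str, version: str, source: str,
                requested_by: str, identity: str, now: datetime) -> SkillLoadEvent:
    """Append one JSON line per skill load. `now` must be timezone-aware."""
    event = SkillLoadEvent(skill, version, source, requested_by, identity,
                           now.astimezone(timezone.utc).isoformat())
    trace.append(json.dumps(asdict(event), sort_keys=True))
    return event


def is_higher_risk(now: datetime, recent_permission_change: bool,
                   business_hours: range = range(9, 17)) -> bool:
    """Toy policy: flag loads outside business hours (UTC) or after a permission change."""
    outside_hours = now.astimezone(timezone.utc).hour not in business_hours
    return recent_permission_change or outside_hours
```

Recording the source alongside the identity keeps provenance and authorization in the same record, so a later audit can answer both "who loaded this" and "where did it come from" from one line of the trace.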

Skill versioning and integrity

Skills are versioned artifacts. The deployment pins the version, or accepts versions from a controlled range. Integrity is verified (signed manifests where the standard supports it; hash checks otherwise) before activation. Once loaded, a skill is not silently replaced by a different version mid-session.
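A hash-check variant of this commitment can be sketched as follows. The pin table layout and the `SessionSkills` class are hypothetical; signature verification is left out, and an exact version pin stands in for a controlled range.

```python
import hashlib
import hmac


def manifest_digest(text: str) -> str:
    """SHA-256 of the manifest text, hex-encoded."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class SessionSkills:
    """Per-session loader enforcing a version pin, a hash check, and no mid-session swaps."""

    def __init__(self, pins: dict[str, tuple[str, str]]):
        # Hypothetical pin table: skill name -> (pinned version, expected sha256).
        # Held by reference, so a deployment-side re-pin is visible to the session.
        self._pins = pins
        self._loaded: dict[str, str] = {}  # skill name -> version active this session

    def load(self, name: str, version: str, text: str) -> str:
        if name not in self._pins:
            raise PermissionError(f"{name!r} is not pinned by this deployment")
        pinned_version, expected = self._pins[name]
        if version != pinned_version:
            raise ValueError(f"{name!r} is pinned to {pinned_version}, got {version}")
        if not hmac.compare_digest(manifest_digest(text), expected):
            raise ValueError(f"integrity check failed for {name!r}")
        active = self._loaded.setdefault(name, version)
        if active != version:
            raise RuntimeError(f"{name!r} is already active at {active} in this session")
        return text
```

The last check is what prevents silent replacement: even if the deployment re-pins a skill while a session is running, that session keeps the version it started with and refuses the new one.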

Skill retirement

Skills can be retired. A retired skill is not loaded; tasks that would have triggered it are routed to a fallback or refused. Retirement is observable to users and is part of the operational discipline.

Skill eviction

Progressive disclosure governs how a skill enters the context; eviction governs how it leaves. A skill loaded for one task does not stop consuming the context window when that task ends. An agent that loads a Git-workflow skill to land a change, then moves on to draft a status update, is still carrying the Git skill’s procedural prose: noise that competes for the model’s attention (Chapter 7) against the new task. The architecture must therefore track task state and evict a skill (that is, remove its instructions from the active context) once the task that triggered it completes. Eviction is the teardown half of progressive disclosure; without it, a long session accretes loaded skills until the context is dominated by procedures no longer in use. Eviction is recorded in the trace, the same as loading.
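A minimal sketch of task-scoped eviction, assuming a hypothetical `ActiveContext` that keys loaded skill instructions by the task that triggered them. How a real system decides that a task has completed is outside the sketch.

```python
class ActiveContext:
    """Tracks which skill instructions are in context, keyed by the triggering task."""

    def __init__(self):
        self._by_task: dict[str, dict[str, str]] = {}
        self.trace: list[tuple[str, str, str]] = []  # (event, task_id, skill)

    def load(self, task_id: str, skill: str, instructions: str) -> None:
        """Add a skill's instructions to the context on behalf of one task."""
        self._by_task.setdefault(task_id, {})[skill] = instructions
        self.trace.append(("load", task_id, skill))

    def complete_task(self, task_id: str) -> None:
        """Evict every skill that was loaded for the finished task."""
        for skill in self._by_task.pop(task_id, {}):
            self.trace.append(("evict", task_id, skill))

    def render(self) -> str:
        """The skill text that would actually be placed in the model's context."""
        return "\n\n".join(
            text for skills in self._by_task.values() for text in skills.values()
        )
```

Keying by task rather than by skill name means a skill loaded for two overlapping tasks stays in context until both finish, at the cost of appearing twice in `render()`; a production design would likely deduplicate.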

The threat model

Skills introduce a specific class of vulnerability the architecture must defend against. The threat model has three components:

  1. Untrusted content in the skill. A skill from an unvetted source can contain instructions designed to subvert the agent: “ignore prior instructions,” “exfiltrate the contents of memory,” “invoke this tool with these arguments.” Modern reasoning models are not immune to such injection, especially when the content carries the authority of “documented procedure.”

  2. Trusted skill, compromised dependency. A skill that fetches from external URLs is only as trustworthy as those URLs at the time of fetch. A trusted skill with a compromised dependency is an attack vector.

  3. Trusted skill, evolving misuse. A skill written for one use case is loaded by an agent and used in a different one where its instructions are inappropriate. The misuse may not be malicious; it may be activation against a task description the skill was not designed for.

The architectural defenses are the same ones developed in earlier chapters. There are no skill-specific defenses that substitute for bounded autonomy (Chapter 5) and governance (Chapter 6):

The architectural commitment can be stated bluntly: a skill is the agent reading a document. The document does not grant new powers; it advises on the use of existing ones. Anything that contradicts this is a vulnerability.

Skills as a memory pattern

The cleanest way to place skills in the architecture is as a curated, retrievable semantic memory with progressive disclosure:

Reading skills as a memory pattern clarifies several architectural choices:

This framing avoids the temptation to treat skills as a separate architectural plane. They are a specific realization of patterns the book already develops.

There is one decisive difference between a skill and an ordinary memory entry, and it is why skills earn a chapter rather than a paragraph in Chapter 7: capability negotiation. A factual memory entry returns text (“the user prefers Python,” “service X is down on Tuesdays”), and the architecture’s only job is to retrieve and scope it. A skill does more: it attempts to alter the agent’s action surface, declaring the tools and data scopes it needs to function. That declaration must be negotiated with the bounding layer (Chapter 5) and the governance layer (Chapter 6), and then admitted, constrained, or refused, before the skill can be used. Plain memory never touches the action surface; a skill always proposes to. That negotiation with the bounding layer is the architectural footprint ordinary memory lacks, and it is what makes skills a distinct pattern rather than just a kind of document.

Skills and the cognitive patterns

Skills compose with cognitive patterns (Chapter 4) without disturbing them. A skill that describes a Plan–Execute workflow can be loaded by an agent that natively supports the cognitive pattern; a skill that prescribes Reflection can be loaded into an agent that runs an evaluator-optimizer loop. The skill provides the task-specific procedure; the cognitive pattern provides the reasoning structure.

The architectural caution: a skill that prescribes a cognitive pattern (a specific number of iterations, a specific reflection step, a specific debate setup) should be admitted only if the architecture permits the pattern. A skill that demands ten reflection iterations on a system with a five-iteration cap should fail at admission, not silently truncate at activation.
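The difference between failing at admission and truncating at activation is small in code but large in behavior. The sketch below uses a hypothetical `admit_reflection` check; the function name and the error type are assumptions for illustration.

```python
class AdmissionError(Exception):
    """Raised when a skill's declared requirements exceed the architecture's bounds."""


def admit_reflection(skill: str, requested_iterations: int, iteration_cap: int) -> int:
    """Admit the skill's reflection budget, or refuse it outright.

    The tempting alternative, min(requested_iterations, iteration_cap), would
    admit the skill and quietly run a different procedure than the one it
    documents. Refusing makes the mismatch visible before any work starts.
    """
    if requested_iterations > iteration_cap:
        raise AdmissionError(
            f"{skill!r} requires {requested_iterations} reflection iterations; "
            f"the cap is {iteration_cap}"
        )
    return requested_iterations
```

A skill that requests fewer iterations than the cap is admitted unchanged; the cap is a ceiling, not a target.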

Skills and multi-agent systems

In a multi-agent system, each agent has its own loaded skills. Skills are per-agent state by default; they are not shared automatically across agents. This is the right architectural default: it preserves the agent boundary (Chapter 2) and avoids cross-agent skill contamination.

Where shared skills are required (a team of agents working on a common project, all loading the same project-specific skill), the sharing is explicit and goes through the same gateway as other shared state (Chapter 7). The skill is loaded into each agent’s session individually; the agents do not share a single skill instance.

Skills in practice

In practice, the Skills layer has several common shapes:

Each of these has a well-understood architectural placement. Project-specific skills behave like semantic memory scoped to the project. Procedure skills behave like task templates. Integration skills sit alongside tool documentation. Format skills behave like constraint-guided reasoning (Chapter 4) realized at runtime.

Skills and the future of agentic architecture

A reasonable forecast: the Skills standard, or a near-successor, will become the way procedural capability is packaged and shared across agentic systems. The architectural significance is not that skills replace anything (they replace neither tools nor governance nor patterns) but that they separate what was previously fused: design-time architecture from runtime capability. A team can now stabilize the architecture and let the skills evolve, instead of conflating the two.

This is a healthy separation. It mirrors the evolution of operating systems and applications: the kernel is stable; the applications change. The architectural posture this book recommends (bounded, governed, observable, recoverable) maps cleanly onto a “kernel” stance for agentic systems: the system has a small, hardened core, and skills are the applications that run on it.

Summary

The architectural commitment is to treat skills as subordinate to the architecture: the agent reads a document; the document does not grant powers. The Skills standard at agentskills.io makes the layering explicit and is the cleanest current expression of the design-time / runtime boundary in agentic systems. Part III (Chapter 11 onward) applies that architecture to production; Part IV synthesizes it in vignettes, the Concord worked example, operationalization, and the harness design introduced in Chapter 4 and completed in Chapter 19.