Chapter 14Enterprise SaaS integration

The preceding chapters treat the agentic system as a standalone architecture, designed from the ground up. In the enterprise, standalone greenfield agents are the exception. The reality for most staff and principal engineers is integration: grafting an agentic workflow into a large, legacy, multi-tenant software-as-a-service (SaaS) platform, or opening that platform so external agents can act on it. Most agentic value in the enterprise will be captured not by building new agents but by embedding them in, and exposing, the platforms the business already runs on.

Adding an agent to an established platform tests the architecture of Chapters 510 at a scale the earlier chapters only gestured at. The blast radius of an unbounded agent in a shared multi-tenant database is catastrophic in a way a single-tenant prototype never is. A poorly integrated tool wrapper violates latency budgets the platform has committed to in contracts. A duplicated mutation in a financial ledger is not a bug report; it is an incident with a dollar figure.

This chapter divides the problem into two vectors. The inbound vector embeds an agent into the platform to act on behalf of a logged-in user. The outbound vector exposes the platform’s capabilities so that external agents can automate it. Both are governed by a single principle, stated here because everything else in the chapter follows from it.

The integration principle Do not build a parallel security model for the agent. An internal agent is a synthetic user; an external agent is a synthetic API client. The platform’s existing deterministic infrastructure, its API gateway, its role-based access control, its rate limiters, and its audit logs, is the bounding layer.

Any design that lets the agent bypass the existing business-logic layer to be faster or to give the model more context reintroduces, inside the agent, every problem that layer was built to solve. The agent must be made to traverse the exact pathways that humans and traditional integrations already traverse.

The inbound vector: Backend-for-agent

Embedding an agent into a platform means exposing the platform’s data and mutations to the agent as tools. The common anti-pattern does this through a god-mode microservice: a team deploys an agent service holding a highly privileged service account, and when a user asks the agent to summarize their recent invoices, the agent queries the database directly, filtering by what it infers to be the user’s identity from the prompt.

This is the confused-deputy failure (Chapter 11) in its purest form. An attacker crafts a prompt, or plants one in a document the agent will retrieve, instructing the agent to drop the identity filter and summarize every tenant’s invoices. Because the agent’s database connection holds system-level privilege, the database complies. The multi-tenant boundary, enforced everywhere else by application code the agent skipped, is breached.

The architectural alternative is the backend-for-agent (BFA) pattern. Tools do not wrap the database; they wrap the platform’s existing internal APIs, and they carry the user’s identity on every call.

Figure 11. The inbound vector: Backend-for-agent

When a session begins, the user’s standard authentication token enters the backend-for-agent, which enforces the agent’s bounds (Chapter 5) and injects the token into every tool the agent invokes. A tool call to fetch an invoice becomes an ordinary authenticated request to the platform’s internal API gateway. If the agent is prompt-injected and tries to fetch another tenant’s invoice, the gateway evaluates the token, returns a 403, and logs the violation, exactly as it would for a malicious human request. The agent observes the error and adapts, and the blast radius is whatever the user could already have done by hand, no more. The agent inherits the platform’s existing deterministic shell rather than standing up a weaker one beside it.

A request through the boundary

To see why the indirection earns its cost, follow a single hostile request through it. A support agent is asked, in plain language, to summarize a customer’s recent tickets. One ticket it retrieves carries a planted instruction, a prompt injection (Chapter 11), telling the agent to also pull the billing records for every account on the platform.

  1. The agent, having read the injected text as if it were data, forms a tool call: fetch billing records, all tenants.

  2. The tool adapter does not open a database connection. It issues the same request the web frontend would, to the existing billing endpoint, carrying the downscoped token the backend-for-agent minted for the invoking user.

  3. The API gateway evaluates that token against the request. The user is a support agent scoped to one tenant; the request asks for all of them. The gateway returns a 403 and writes an access-violation entry to the same audit log it keeps for human requests.

  4. The agent receives the 403 as a tool observation, cannot proceed down that path, adapts, and reports that it was unable to reach the broader records, then finishes the legitimate task it was actually given.

The injection succeeded at the model and failed at the boundary. Had the agent run with a god-mode service account, step 3 would have returned the data and the breach would have completed silently, indistinguishable in the logs from ordinary work. The backend-for-agent does not make the model harder to fool; it makes fooling the model stop mattering.

Architectural decision records

Turning the backend-for-agent pattern into a real integration requires explicit commitments that an architecture review board can ratify and a security team can audit. The following decision records capture the recommended baseline. Each maps an agentic concern onto standard enterprise infrastructure, which is precisely the point: the agent should require no new security primitives.

ADR 1: Authentication by token impersonation

Context. The agent must call internal APIs that require strict authorization. Static API keys issued to the agent invite privilege escalation, and compliance regimes require that every mutation be attributable to an accountable actor.

Decision. The agent subsystem authenticates by token impersonation and downscoping. The backend-for-agent obtains a downscoped token for the session, stripping high-risk scopes the agent should never exercise, such as deleting the workspace or changing the user’s password. The agent never holds a system-level service account.

Consequence. The agent is structurally incapable of actions the user could not perform, and the platform’s existing audit log records mutations as the user acting through an agent rather than as an anonymous system. Compliance is satisfied without rebuilding the logging pipeline.

ADR 2: Hard identity scoping in semantic memory

Context. The platform uses retrieval over its own records to give the agent semantic memory. Vector similarity has no notion of tenancy; the nearest chunk to a query may belong to another customer.

Decision. Cross-tenant isolation is never delegated to a prompt instruction or to the model’s judgment. Every chunk carries hard tenant_id and user_id tags applied at ingestion (Chapter 8), and the memory gateway (Chapter 7) forces a deterministic pre-filter on tenancy before similarity scoring runs.

Consequence. Cross-tenant leakage through semantic memory is eliminated at the query layer, where it can be tested with a deterministic assertion, rather than hoped for at the prompt layer, where it cannot.

ADR 3: An idempotency key on every mutating tool

Context. Agents run in loops, time out mid-call, and retry. A retried mutation against a platform, charging a card, issuing a refund, produces a financial incident, not a duplicate log line.

Decision. Every mutating tool exposed to the agent requires an idempotency key, generated by the bounding layer as a deterministic hash of the proposed action’s arguments and passed to the platform API.

Consequence. A timeout-and-retry of the same action is recognized downstream by its key; the side effect runs once and the cached result is returned. The cascading-failure mode of agentic retries (Chapter 11) is mitigated structurally, by the same mechanism payment systems already use for human-initiated requests.

ADR 4: Reuse the existing synchronous workflow

Context. A tool is needed for the agent to create a user account.

Decision. The tool adapter wraps the exact endpoint the web frontend already calls to create an account. No custom database-insert logic is written inside the tool definition.

Consequence. Every existing side effect, the welcome email, the downstream notification, the audit write, the input validation, fires identically whether a human clicks the button or the agent invokes the tool. The agent cannot drift from the application’s behavior because it is the application’s behavior.

ADR 5: Suspend-and-resume for asynchronous APIs

Context. Many platform operations are asynchronous. A tool that requests a compliance-report export returns an acknowledgment immediately and fires a callback minutes later when the report is ready. The agent cannot hold a compute container open for the wait.

Decision. The backend-for-agent implements state hydration. When a tool returns a pending state, the agent’s working memory is persisted to durable storage and the process terminates. When the platform’s callback fires, the backend rehydrates the context, injects the payload as a tool observation, and resumes the loop.

Consequence. The agent participates in long-running enterprise workflows without violating the economics of serverless or container compute, reusing the same suspension mechanism that working memory already requires (Chapter 7).

ADR 6: A semantic layer instead of text-to-SQL

Context. An analytical agent needs to answer questions over enterprise business data. The default implementation dumps the raw database schema into the prompt and asks the model to write the query.

Decision. The agent’s action surface does not include raw queries. The agent is given a typed tool, query_metric(metric, dimension, timeframe), and a deterministic semantic layer (a metrics layer) holds the governed definitions and compiles the exact SQL.

Consequence. Raw text-to-SQL is prompt-based business logic in disguise. Asked to write a query directly, the model must guess deterministic enterprise rules, whether an active customer means a status flag or a recent login, whether revenue is gross or net of tax, and a probabilistic component guesses differently on different days. It will produce a syntactically valid query that passes the schema validator (Chapter 6) and returns a wrong number: a structural hallucination (Chapter 11) that is invisible precisely because the query executed cleanly. The semantic layer applies this book’s thesis to data access. The probabilistic agent decides intent, which metric, sliced how, and the deterministic layer owns execution. The metric is correct by construction because the agent was never allowed to define it.

Adopting the pattern in a brownfield platform

The decision records describe a target state; the platforms that need them are almost never greenfield. The realistic setting is a system with years of accumulated endpoints, uneven test coverage, and no appetite for a risky cutover. Adoption has to be incremental, and the backend-for-agent makes incrementalism natural precisely because it rides the existing API surface instead of replacing it.

The progression resembles a strangler-fig migration. Begin with read-only tools wrapping a handful of existing endpoints behind the backend-for-agent, and prove the token-downscope-and-gateway path end to end on traffic that cannot mutate anything. Add mutations one endpoint at a time, each behind an idempotency key (ADR 3) and, where the reversibility envelope demands, a human approval gate (Chapter 6). The agent’s capability grows alongside the existing interface, sharing its API and its access controls, never as a parallel privileged path. The pressure that derails this is the temptation to hand the new agent a broad fresh service account to move faster, which is the god-mode anti-pattern the chapter opened with, arriving through the side door.

A migration also exposes endpoints that were never built for an untrusted caller: an internal API that skips authorization because the frontend was assumed to have checked already, or one that trusts a client-supplied tenant identifier. Wrapping such an endpoint as an agent tool surfaces the latent weakness immediately. The discipline is to fix the endpoint rather than special-case the agent. The agent is a synthetic user, and an endpoint that cannot safely serve a synthetic user could not safely serve a malicious human either; the integration has done the platform a favor by finding the gap.

The outbound vector: Opening the platform

The second vector reverses the direction. External agents, coding assistants, departmental automations, a customer’s own orchestrators, increasingly want to drive your platform. The question is how to let them without handing them the failure modes of the inbound god-mode service in reverse.

Historically a platform exposed REST or GraphQL APIs built for deterministic clients: deeply nested response graphs, cursor-based pagination, and brittle failure on a malformed enum. Models handle these poorly. Deep JSON exhausts the context window with structural boilerplate, pagination demands a multi-step reasoning loop that agents thrash on, and an opaque 400 gives the agent nothing to adapt to, so it burns its iteration budget retrying variations. Supporting external agents well means building agent interfaces alongside the traditional API, not pointing agents at the API meant for code.

The Model Context Protocol

As of mid-2026, the emerging standard for agent interfaces is the Model Context Protocol (MCP). Rather than expecting each external developer to hand-write a tool wrapper around the platform’s specification, the platform hosts an MCP server. When an external agent connects with the user’s delegated credentials, the server projects a curated set of capabilities into the agent’s context: resources that read state in a model-friendly form, tools that perform mutations, and templates for how to use them.

The architectural advantage is contract inversion. By hosting the server, the platform, not the consumer, owns the schema, the tool descriptions, and the data density the agent sees. When the platform changes its internal data model, it updates its own server, and the external agent does not break, because the interface it depends on is semantic rather than structural. The platform sets the contract that integrators previously reverse-engineered.

Endpoints designed for probabilistic consumers

Whether reached over the protocol or directly, agent-facing endpoints carry commitments that ordinary APIs do not. Where a conventional response returns a nested object, an agent endpoint returns flattened, Markdown-shaped text that is token-efficient and natively legible to the model. Where a conventional API returns a machine code on error, an agent endpoint returns an actionable message, not an invalid-transition code but a sentence explaining that the ticket needs a resolution field before it can move to done, so the error feeds directly into the agent’s adaptation loop. And where a conventional API hands back a page cursor, an agent endpoint performs the summarization or server-side search itself, sparing the agent the pagination loop it handles badly.

Publishing official skills

As developed in Chapter 10, progressive disclosure is how agents acquire procedural knowledge on demand. A platform should publish an official skill manifest at a well-known location on its domain, carrying the platform vendor’s own instructions for automating it, the sequence of tools to call to generate a monthly report, the fields a transition requires, the order operations must follow. This moves the burden of procedural prompt engineering off every integrator and onto the vendor, who knows the platform best and can update the manifest in one place when the platform changes.

The agent interface is a versioned contract

An MCP server, the tools it projects, and a published skill manifest are a public contract that external agents build on, and like any API contract it must be versioned and deprecated with discipline. The twist is that the consumer is probabilistic, which widens what counts as a breaking change. Renaming a tool or removing a field breaks integrations in the familiar way. But changing a tool’s description, reordering its parameters, or rewording an error message can silently degrade agent behavior with no error raised anywhere, because the agent’s choice of which tool to call and how to recover from a failure is driven by exactly that natural-language text.

Agent-interface versioning therefore has a semantic dimension that ordinary API versioning lacks: the descriptions and the error messages are part of the contract, not documentation wrapped around it. Rewording them is a behavioral change to every agent that reads them, and it deserves the same canary-and-measure treatment a model upgrade gets (Chapter 12), shipped to a fraction of traffic, evaluated against held-out tasks, promoted only when the behavior holds. A platform that publishes a curated interface inherits responsibility for the agent behavior its wording induces.

Exposing governance outward

Opening a platform to external agents subjects it to the failure modes of architectures the platform does not control. An external agent missing an iteration bound (Chapter 5) will hammer an endpoint a thousand times a minute when it hallucinates a parameter. The platform must project its own governance outward to survive the interaction.

Two commitments do most of the work. The first is agent-aware rate limiting: the gateway distinguishes agent traffic, by its protocol connection or its declared client identity, from human traffic and applies distinct limits, granting the burst capacity a reasoning loop needs while enforcing a hard cost or token quota so a runaway external agent drains its own quota rather than the platform’s capacity. The second is a reversibility envelope exposed as an API. Every mutating endpoint accepts a dry-run parameter; when set, the platform performs full schema and business-logic validation and computes the outcome, then rolls back the transaction before returning. An external agent can propose a complex change, confirm it will succeed, and present the planned outcome to a human for approval, all without a side effect. The platform’s API becomes an active participant in the external agent’s own governance pipeline (Chapter 6).

The agent substrate: identity and communication

The inbound and outbound vectors above treat the agent as a single synthetic user or a single synthetic API client. The orchestrator-worker shape the book endorses as the dominant production multi-agent form (Chapter 9) requires a shared substrate beneath that single-agent picture: identity (who each agent is, what authority it carries, and how delegation attenuates across hops) and communication (how agents exchange typed, governed messages reliably). The section below develops five identity facets and four communication commitments; both concerns share the same architectural location, and a builder adopting orchestrator-worker will need both at once.

The unifying frame parallels Chapter 8. The ingestion pipeline is governed ETL for data into memory; the inter-agent substrate is governed ETL for control between agents — a typed, replayable channel with the same redaction, validation, and lineage discipline applied to messages rather than documents. Identity threads through the backend-for-agent (user-delegated authority), identity-tagged memory (Chapter 7, Chapter 8), trace attribution (Chapter 12), and bill attribution (Chapter 15, Chapter 18); communication carries the attack-surface and injection defenses of Chapter 6 and Chapter 11 into an enforceable contract.

Agent identity

Identity splits into two cases the inbound treatment above conflates. The user-delegated case, the backend-for-agent, assumes a logged-in human whose token gets downscoped, and the agent’s authority is bounded by that human’s. It is well covered. The agent-as-principal case — an event-driven operations controller (Chapter 16, Vignette 4) firing on an alert with no user to impersonate — requires its own identity model. This is among the most dangerous agent classes in production precisely because there is no human session to downscope from; the agent has to carry some identity to authenticate to the tools it invokes, and that identity cannot be a god-mode service account (the confused-deputy failure of the inbound case, reappearing). The answer is a dedicated, narrowly-scoped workload identity for the agent class, issued by the same identity provider that mints user tokens, scoped to exactly the read and mutation capabilities the agent’s declared action surface permits, and rotated and revoked the way a human service account is. The operations controller authenticates to a tool not as a user and not as the platform, but as itself, with an identity whose blast radius matches its declared surface. The architectural test is the same one the inbound case applies: if this identity were compromised, what could it do, and is that set exactly the agent’s permitted action surface?

Inter-agent authentication in a fleet is the second identity facet. Prior chapters establish the principles — Chapter 6 treats inter-agent channels as an attack surface and requires that agents not trust each other by default; Chapter 16 Vignette 5 attaches the original user’s identity context to every inter-agent message. The mechanism that makes those principles enforceable is the one microservices settled on: agents authenticate to each other with short-lived, signed tokens, minted by a workload-identity issuer, and the channel between them is mutually authenticated (mTLS or its equivalent). A message carries the originating user’s delegated authority and the sending agent’s own principal identity, so the receiving agent can evaluate the request against both: the user’s authority bounds what may be done, and the sender’s identity bounds who may ask. Revocation is what a compromised-agent scenario tests: when one agent in a fleet is compromised, its workload identity is revoked, its short-lived tokens expire within minutes, and the rest of the fleet refuses its messages on the next call. A trust model that cannot revoke a compromised peer in bounded time is not a trust model for a fleet.

Tool-credential lifecycle is the third facet — the concern between authorization and execution. The bounding layer decides whether the agent may call a tool; the tool still needs a credential to authenticate to the downstream service. Where that credential lives, how it is scoped per agent and per tenant, how it is rotated, and how its use is audited is a first-class operational concern that sits between the bounding layer and the platform’s existing secret management. Treat the tool’s credential the way the backend-for-agent treats the user’s token: scoped to the minimum the tool’s function requires, owned by the platform’s secret store rather than the agent’s configuration, and its use logged against the agent and tenant identity so a credential misuse is attributable. Authorization is the policy gate; the secret-management substrate is what makes the authorized call actually execute, and the two must be designed together.

Delegation depth extends the cost-delegation pattern of Chapter 5 to identity. The bounding layer already carries a remaining budget across an orchestrator-to-worker hop; identity propagates the same way, but with a constraint the budget does not have. The user’s authority must attenuate at each hop, not merely pass through. An orchestrator acting for a user scoped to Tenant A may delegate to a worker, but the worker must receive an authority scoped to Tenant A and to the subtask, never wider. The architectural commitment is an explicit delegation-depth limit: the user’s authority propagates a bounded number of hops, each hop narrows the scope, and a worker that attempts to act outside its delegated scope is refused at the gateway exactly as the inbound case refuses a cross-tenant read. A delegation chain that does not narrow is a confused-deputy chain waiting for an injection.

The fifth facet is dual accountability, and it is the one regulated environments will ask about first. An action taken by an agent on behalf of a user is simultaneously by the agent’s owning team and on behalf of the delegating user, and the audit record must hold both. The trace already attributes an action to the agent and the session (Chapter 12); make the delegation chain explicit in that same record, so a regulator asking “who is on the hook when the agent errs” receives an answer that names both the owning team (whose harness, whose bounds, whose policy) and the delegating user (whose authority was exercised), rather than one or the other. The architectural commitment is that accountability is a property of the trace, not a legal afterthought, and the trace schema carries the full delegation chain on every consequential action.

Agent-to-agent communication

Identity answers who; communication answers how agents exchange messages reliably. Orchestrator–worker requires a governed channel with an explicit message contract, delivery semantics, backpressure, and failure handling — the engineering commitments the attack-surface and injection defenses of Chapter 6 and Chapter 11 imply once agents no longer operate behind a single envelope (Chapter 9).

The first commitment is the message contract. An inter-agent message is a typed, schema-validated envelope, not a free-text string the receiving agent parses. The envelope carries the payload, the delegation chain from the identity section above, an idempotency key, and a correlation identifier back to the originating session. The same governance pipeline that validates tool calls validates inter-agent payloads: a message whose schema fails is refused before the receiving agent sees it, and a message whose payload would trip a policy gate is denied, not passed through and hoped for. This is the structural defense against the inter-agent injection failure mode Chapter 11 names but cannot prevent with prose. Treating another agent’s output as untrusted input is the principle; the typed, governed envelope is the mechanism that makes the principle enforceable. Who owns the schema is the question that follows: the orchestrator owns the contract with its workers, versioned and reviewed the way any API contract is, because a worker that expects version 2 of an envelope and receives version 3 is the multi-agent equivalent of a breaking API change.

The second commitment is delivery semantics. An orchestrator dispatching to a worker needs at-least-once delivery with idempotency keys, matching the discipline Chapter 17 and Chapter 18 apply to tool calls. The reasoning the idempotency key buys is subtle in the multi-agent case, because the consumer is itself a probabilistic agent that may act differently on a duplicate. The key does not make a duplicate message invisible to the receiving agent; it makes the effect of a duplicate a no-op at the gateway, by recognizing the key and returning the cached result of the first execution rather than re-running the worker. The receiving agent may well produce a different trace on the duplicate, but the side effect it is authorized to cause runs once. Exactly-once is the wrong target, because it is unachievable without distributed consensus and unnecessary once the gateway deduplicates on the key; at-least-once with idempotent effect is the right target, and it is the same target the tool-call discipline already meets.

The third commitment is failure handling, and it is where the multi-agent case exceeds the single-agent one. A message whose processing fails needs a dead-letter queue, a bounded retry policy, and a compensation, and the compensation is the saga pattern Chapter 9 gives to actions but not to inter-agent messages. The unification is direct: a worker that has begun a multi-step effect when its subtask fails must compensate in reverse order, and the compensating steps are defined at the same time as the subtask’s contract, not added after the first incident. A message that exhausts its retries lands in a dead-letter queue that a human reviews, exactly as an approval queue holds an irreversible action; both are the same architectural shape, a bounded buffer of items a human must clear for the system to make progress.

The fourth commitment is backpressure and the channel’s timing model. A fast orchestrator feeding a slow worker produces unbounded in-flight messages unless the channel is bounded, and an unbounded buffer is how a multi-agent system turns one slow worker into a fleet-wide memory incident. The channel is a bounded queue with explicit backpressure: when the worker is saturated, the orchestrator blocks or sheds, and the saturation is observable as a metric alongside the approval-queue depth of Chapter 18. Whether the channel is synchronous, the orchestrator blocks on the worker’s response, or asynchronous, fire-and-track, is a decision that interacts directly with the durable-execution substrate of Chapter 18. A synchronous channel couples the orchestrator’s lifetime to the worker’s and is right for short subtasks; an asynchronous channel decouples them and is right for the long-running, suspend-and-resume subtasks that already need state hydration. The durable-execution substrate carries the asynchronous case naturally, because a worker’s pending response is a durable signal the orchestrator’s workflow blocks on, exactly as it blocks on a human approval, and the channel’s backpressure is the workflow’s own queue management rather than a separate mechanism.

The substrate as governed ETL

The two faces together, identity and communication, are the substrate the orchestrator-worker shape cannot function without. The same governed-ETL discipline that applies to data applies to control: the ingestion pipeline is governed ETL for data into memory — redaction, identity tagging, lineage, invalidation, structural extraction, before anything is stored. The inter-agent substrate is governed ETL for control between agents: typed, schema-validated envelopes with delegation chains; at-least-once delivery with idempotent effects; bounded channels with backpressure; sagas for the messages that fail; workload identity and revocation for the agents that send them. A fleet built on orchestrator-worker has the transport, identity model, and rollout discipline the shape requires — the same architectural rigor applied to control that the ingestion chapter applied to data.

Anti-patterns

The god-mode service account. An agent microservice holding system-level database privilege and filtering results by an inferred user identity. The confused-deputy breach is one prompt injection away. The agent must carry the user’s downscoped token through the existing gateway instead.

Prompt-enforced tenancy. Instructing the model to return only the current tenant’s data rather than pre-filtering on a hard tenant tag. It fails the first time a retrieved chunk does not announce its own tenancy.

The raw-SQL tool. A tool that accepts a model-generated query string against the production schema. It collapses schema validation, tenancy, and business-metric definitions into the model’s guesswork. Mutations go through existing endpoints; analytics go through the semantic layer.

The unbounded MCP server. Exposing platform capabilities to external agents with human rate limits and no dry-run path. The first looping integrator becomes a denial-of-service incident.

Summary

Embedding an agent in an enterprise platform, or opening that platform to external agents, is an exercise in mapping agentic concerns onto infrastructure the platform already owns. Inbound, the backend-for-agent pattern routes every tool through the existing API gateway with a downscoped user token, and the decision records turn that pattern into commitments: token impersonation, hard-filtered memory, mandatory idempotency keys, reuse of synchronous workflows, suspend-and-resume for asynchronous ones, and a semantic layer that keeps the agent out of raw SQL. Outbound, an MCP server, model-friendly endpoints, and a published skill manifest let external agents automate the platform efficiently, while agent-aware rate limits and a dry-run API keep the platform standing while they do. The recurring move is the same throughout: the agent is made to traverse the deterministic pathways the platform already trusts, never a new one built to go around them. Chapter 15 follows the agent’s requests further down the stack, to the network boundary between the deterministic infrastructure and the probabilistic models it depends on.