Proxy vs SDK: Why Architecture Matters for AI Agent Governance
SDK wrappers are door locks. Proxies are walls. Here is a deep technical comparison of both governance architectures for AI agents in production, and when each one is the right choice.
Two models, one problem
Every team running AI agents in production eventually asks the same question: how do we control what the agent does?
Two architectures have emerged. The SDK model wraps governance around the client library inside the agent process. The proxy model places governance at the network layer between the agent and the LLM provider. Both enforce policies. Both track costs. Both log requests. But they differ in a way that matters: where the enforcement boundary lives.
That difference determines whether your governance can be bypassed.
The SDK model
An SDK wrapper intercepts LLM calls inside the agent’s process. You import a governance library, wrap your OpenAI or Anthropic client, and the wrapper checks policies before forwarding each call.
┌─────────────────────────────────────────────┐
│ Agent Process │
│ │
│ ┌─────────────┐ ┌───────────────────┐ │
│ │ Agent Code │───▶│ SDK Wrapper │ │
│ │ │ │ (governance lib) │ │
│ └─────────────┘ └────────┬──────────┘ │
│ │ │
│ ┌─────────▼──────────┐ │
│ │ OpenAI/Anthropic │ │
│ │ Client Library │ │
│ └─────────┬──────────┘ │
│ │ │
└──────────────────────────────┼──────────────┘
│
┌──────────▼──────────┐
│ LLM Provider API │
└─────────────────────┘
The wrapper sits inside the agent process. It has access to the same environment variables, the same file system, the same network. The real API key is in the process memory. The wrapper is enforcing policy from inside the house.
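To make the model concrete, here is a minimal sketch of what an SDK-style wrapper does. The class names, the flat per-call cost, and the stub client are hypothetical illustrations, not any real governance library's API:

```python
class GovernedClient:
    """Sketch of an in-process SDK wrapper: checks a daily budget
    before forwarding each call to the wrapped provider client."""

    def __init__(self, client, daily_budget_usd: float):
        self.client = client
        self.daily_budget_usd = daily_budget_usd
        self.spent_usd = 0.0

    def create(self, **kwargs):
        if self.spent_usd >= self.daily_budget_usd:
            raise RuntimeError("budget exceeded: request blocked by wrapper")
        response = self.client.create(**kwargs)
        # Naive flat cost per call; a real wrapper would price tokens per model.
        self.spent_usd += 0.01
        return response


# Stub standing in for a real provider client, so the sketch is self-contained.
class StubClient:
    def create(self, **kwargs):
        return {"ok": True}


governed = GovernedClient(StubClient(), daily_budget_usd=0.02)
governed.create(model="gpt-4o-mini")
governed.create(model="gpt-4o-mini")
try:
    governed.create(model="gpt-4o-mini")  # third call exceeds the budget
except RuntimeError as e:
    print(e)
```

Note that the budget check, the counter, and the real client all live in the same process as the agent code — which is exactly the weakness the rest of this article examines.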
This works for certain scenarios. We will get to those. But first, here is the alternative.
The proxy model
A proxy sits between the agent and the LLM provider at the network layer. The agent sends requests to the proxy URL. The proxy evaluates policies, then forwards to the real API. The agent never has the real API key.
┌──────────────────────────┐
│ Agent Process │
│ │
│ ┌─────────────────────┐ │
│ │ Agent Code │ │
│ │ (unmodified) │ │
│ └──────────┬──────────┘ │
│ │ │
└─────────────┼────────────┘
│ proxy token only
┌──────────▼──────────┐
│ Governance Proxy │
│ │
│ ┌───────────────┐ │
│ │ Policy Engine │ │
│ │ (YAML rules) │ │
│ └───────┬───────┘ │
│ │ │
│ ┌───────▼───────┐ │
│ │ Real API Key │ │
│ └───────┬───────┘ │
└──────────┼──────────┘
│
┌──────────▼──────────┐
│ LLM Provider API │
└─────────────────────┘
The enforcement boundary is outside the agent process. The agent cannot access the real API key because it does not exist in the agent’s environment. The proxy token only works against the proxy. Even if the agent reads its own environment variables, inspects its own process memory, or scans the file system, it finds nothing that lets it reach the LLM provider directly.
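In code, the agent's side of the proxy model looks like an ordinary provider request aimed at a different host. The proxy URL, token variable, and endpoint path below are illustrative assumptions, not Govyn specifics:

```python
import json
import os
import urllib.request

# Assumption: the agent's environment holds only a proxy-scoped token.
PROXY_URL = "https://govyn-proxy.internal/v1/chat/completions"  # hypothetical host
proxy_token = os.environ.get("PROXY_TOKEN", "gv_example_token")


def build_request(model: str, messages: list) -> urllib.request.Request:
    """Builds a provider-shaped request addressed to the proxy.
    The Authorization header carries the proxy token; the real
    provider API key never appears in this process."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        PROXY_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {proxy_token}",
            "Content-Type": "application/json",
        },
    )


req = build_request("gpt-4o-mini", [{"role": "user", "content": "hello"}])
print(req.full_url)
```

From the agent's perspective nothing changes except the base URL and the credential, which is why unmodified agent code works behind a proxy.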
Every bypass scenario for SDK wrappers
SDK governance fails when the agent can reach the LLM API without going through the wrapper. Here is every way that happens:
1. Direct HTTP calls
The agent has the API key in its environment. It can construct an HTTP request directly to api.openai.com without using the wrapped client. Any agent with shell access (OpenClaw, Claude Code, Cursor) can do this with curl.
# Agent bypasses SDK wrapper with a direct HTTP call
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"model": "gpt-4o", "messages": [{"role": "user", "content": "..."}]},
)
A proxy stops this because the agent does not have the real API key. It has a proxy token. That token does not authenticate against OpenAI.
2. Library reimport
The agent imports the unwrapped client library directly. Most Python and Node.js agents can dynamically import modules at runtime.
# Agent imports the raw client, bypassing the wrapper
from openai import OpenAI
raw_client = OpenAI() # uses OPENAI_API_KEY from env
response = raw_client.chat.completions.create(...)
A proxy stops this because even the raw client would need a valid API key. The environment only has the proxy token.
3. Subagent spawning
The agent spawns a child process — a subagent, a tool call, a shell command. The child inherits environment variables, including the API key, but does not inherit the SDK wrapper. The child process makes ungoverned LLM calls.
A proxy stops this because the child process inherits the proxy token, not the real key. All its requests still go through the proxy.
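The inheritance behavior is easy to demonstrate. This sketch simulates a spawned subagent with a plain child process; the token name and value are made up:

```python
import os
import subprocess
import sys

# Simulate the parent agent's environment. In the proxy model this is a
# proxy-scoped token (made-up value); in the SDK model it would be the
# real provider key.
os.environ["PROXY_TOKEN"] = "gv_example_token"

# Child processes inherit the parent's environment by default. Whatever
# credential the parent holds, the child holds too — but any in-process
# wrapper does not come along for the ride.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('PROXY_TOKEN'))"],
    capture_output=True,
    text=True,
)
print(child.stdout.strip())
```

With a proxy, this inherited credential is harmless: it only authenticates against the proxy, so the child's calls are still governed.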
4. Environment variable extraction
The agent reads its own environment variables and extracts the API key. Frontier models are capable of this. Claude and GPT-4 can both write and execute code that reads os.environ.
A proxy stops this because there is no API key to extract. The proxy token is the only credential in the environment.
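A sketch of what that extraction looks like — one line of agent-written code, with a made-up key value standing in for a real credential:

```python
import os

# Simulated environment (made-up value). In the SDK model this would be
# the real provider key; in the proxy model it simply is not present.
os.environ["OPENAI_API_KEY"] = "sk-example-not-real"

# One line is enough to pull every credential-shaped variable
# out of the process environment.
leaked = {k: v for k, v in os.environ.items() if "API_KEY" in k or "TOKEN" in k}
print(sorted(leaked))
```

The defense is not to make this line harder to write — it is to ensure the environment contains nothing worth extracting.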
5. Configuration file access
API keys are often stored in .env files, config.json, or credential stores on disk. The agent reads them directly, bypassing any in-process wrapper.
A proxy stops this because the real key lives in the proxy’s configuration, on a separate process or machine. The agent’s file system contains no provider credentials.
6. SDK version mismatch
The governance wrapper is built for version 1.x of the OpenAI library. The agent or a dependency upgrades to version 2.x. The wrapper’s monkey-patching breaks silently. Calls go through unwrapped.
A proxy is unaffected by client library version changes. It operates at the HTTP layer. The request format is the same regardless of which SDK version generated it.
7. Multi-language agents
The governance wrapper is a Python library. The agent spawns a Node.js tool. The Node.js process uses its own OpenAI client with the shared API key. No Python wrapper applies.
A proxy is language-agnostic. Every process, in every language, sends HTTP requests. They all go through the proxy.
The wall vs the door lock
A door lock controls access at the entry point. It works when everyone agrees to use the door. An SDK wrapper is a door lock — it controls the governed entry point (the wrapped client). But the house has other doors (direct HTTP, reimport, subagents, env vars, config files).
A wall removes the possibility of entry entirely. A proxy is a wall. There is no API key in the agent’s environment. There is no “other door.” Every path to the LLM provider goes through the proxy because that is the only path that works.
This is not a theoretical distinction. It is the difference between:
- “The agent should check the budget before each call” (SDK — a should)
- “The agent cannot make a call unless the budget allows it” (Proxy — a cannot)
Should vs cannot. Suggestion vs architecture. Door lock vs wall.
When SDK governance is fine
SDK wrappers are not wrong. They are appropriate for specific scenarios:
Solo developer, low stakes. You are the only person running the agent. You wrote the code. You control the environment. The risk of the agent bypassing governance is low because you can see everything it does.
Prototype and development. You are iterating quickly. You want to add budget tracking without changing your infrastructure. An SDK wrapper is faster to set up than a proxy.
Trusted environment, single language. Your agent runs in a controlled environment — a container with no shell access, no file-system writes, no subprocess spawning. The only way to reach the LLM is through the wrapped client. The bypass scenarios do not apply.
Observability, not enforcement. You want to log and monitor, not block. SDK wrappers are excellent for tracking costs, analyzing token usage, and debugging agent behavior. If the consequence of a policy violation is a log entry rather than a blocked request, the enforcement boundary matters less.
When proxy governance is needed
The proxy model is necessary when any of these conditions are true:
Production deployment. The agent runs unattended. There is no human watching. A bypass is not an inconvenience — it is a financial risk, a security risk, or a compliance risk.
Team environment. Multiple developers deploy agents against the same LLM accounts. You need unified cost tracking, consistent policy enforcement, and an audit trail that no individual developer can modify.
Multi-agent systems. You run CrewAI crews, LangChain chains, OpenClaw agents, and custom code. Each uses different client libraries. Some spawn subagents. An SDK wrapper in one framework does not cover the others. A proxy covers all of them.
Autonomous agents. The agent has shell access, file access, or tool-calling capabilities. It can write and execute code. It can read environment variables. It can install packages. Any of these capabilities create bypass paths for SDK wrappers.
Compliance requirements. You need tamper-evident audit logs, policy versioning, and provable enforcement. An SDK wrapper’s logs are generated by the same process that could bypass the wrapper. A proxy’s logs are generated by an independent process with no agent influence.
Budget enforcement that matters. If exceeding the budget means a surprise $3,000 bill, you need enforcement that cannot be circumvented. An SDK’s budget check runs inside the process that holds the API key. A proxy’s budget check runs outside the process, and the process has no key.
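A proxy-side budget gate can be sketched in a few lines. This is illustrative logic running in the proxy process, not Govyn's implementation:

```python
from dataclasses import dataclass


@dataclass
class BudgetGate:
    """Sketch of a budget check that lives in the proxy process.
    The agent never sees this code, this counter, or the real key."""

    daily_limit_usd: float
    spent_today_usd: float = 0.0

    def admit(self, estimated_cost_usd: float) -> bool:
        """Record the spend and forward the request only if it fits the
        budget; otherwise the proxy rejects the request outright."""
        if self.spent_today_usd + estimated_cost_usd > self.daily_limit_usd:
            return False
        self.spent_today_usd += estimated_cost_usd
        return True


gate = BudgetGate(daily_limit_usd=10.0)
print(gate.admit(9.5))  # True
print(gate.admit(1.0))  # False: would push spend to $10.50
```

Because the gate and the key sit together in the proxy, a denied request leaves the agent with no fallback path — there is nothing to bypass.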
The hybrid approach
The best production setup uses both. SDK wrappers provide rich observability inside the agent — token-level tracking, reasoning chain analysis, tool call logging. The proxy provides hard enforcement at the boundary — budget limits, rate limits, model restrictions, loop detection.
The SDK wrapper is your monitoring system. The proxy is your firewall. You would not run a web application with only application-level security and no network firewall. The same principle applies to AI agents.
# Govyn proxy policy (hard enforcement)
agents:
  production_agent:
    budget:
      daily: $10.00
      monthly: $200.00
    rate_limit:
      requests_per_minute: 30
    models:
      allow: [gpt-4o-mini, claude-sonnet-4-6]
    loop_detection:
      enabled: true
      window: 60s
      max_identical_requests: 5
      action: block
The proxy enforces the walls. The SDK provides the visibility. Defense in depth, applied to AI agent infrastructure.
Choosing your architecture
| Scenario | SDK | Proxy |
|---|---|---|
| Solo dev, prototyping | Sufficient | Overkill |
| Single agent, trusted env | Sufficient | Optional |
| Production, unattended | Insufficient | Required |
| Team, shared accounts | Insufficient | Required |
| Multi-agent, multi-framework | Insufficient | Required |
| Autonomous with shell access | Insufficient | Required |
| Compliance, audit requirements | Insufficient | Required |
| Cost monitoring only | Sufficient | Optional |
| Cost enforcement | Insufficient | Required |
If your agents are toys, SDK wrappers are fine. If your agents are tools, you need a proxy.
Govyn is an open-source API proxy for AI agent governance. MIT licensed. Self-host or cloud-hosted. Five-minute setup.