Blog

Insights on AI agent governance and proxy architecture.

June 30, 2026 38 min read

AI Agent Observability: Metrics, Traces, and Logs That Actually Matter

Most AI agent monitoring captures the wrong things. Here is what to instrument across metrics, traces, and logs, and the signals that actually tell you something.

observability, monitoring, agent-governance, proxy, opentelemetry

June 23, 2026 35 min read

Multi-Provider AI: Failover, Routing, and Cost Control

A single LLM provider is a single point of failure and a single price you cannot negotiate. Running multiple providers buys resilience and cost control, if you handle the routing right.

multi-provider, failover, cost-reduction, routing, proxy

June 16, 2026 32 min read

Prompt Firewalling: App, Gateway, or Model? A Decision Matrix

Three places to put injection detection: in the app, in the gateway, or behind the model. Each layer trades latency, coverage, and false positives differently. Here is how to pick.

prompt-injection, ai-gateway, semantic-firewall, security, architecture

June 16, 2026 38 min read

Prompt Injection Defense at the Proxy Layer: A Practical Taxonomy

Prompt injection is not one attack; it is a family. This is a practical taxonomy of injection types and the proxy-layer defenses that actually stop each one.

security, prompt-injection, agent-governance, proxy, taxonomy

June 9, 2026 34 min read

What is an AI Gateway? Definition, Architecture, and How It Differs from an API Gateway

An AI gateway is a control layer that sits between applications and LLM providers, adding routing, caching, cost control, and security that a traditional API gateway was never built for.

ai-gateway, architecture, agent-governance, proxy, infrastructure

June 2, 2026 25 min read

AI Agent Observability: What to Trace, Measure, and Alert On

Datadog will tell you an agent's HTTP calls are healthy even as it burns $800 in tokens on a 47-step retry loop. Classic APM has no model for non-determinism, semantic errors, or prompt-driven flow control. This post maps the signal model that actually works.

agent-governance, observability, monitoring, OpenTelemetry, AI agents, SRE

June 2, 2026 31 min read

The 2026 Mid-Year State of AI Agent Security

Six months of agent security data: incident catalog by category, severity distribution, defense maturity by org type, and projections for H2 2026.

security, agent-security, annual-report, agent-governance, incidents

May 26, 2026 45 min read

Why AI Agent Networks Need an Identity Layer (SPIFFE, mTLS, Zero-Trust)

API keys break the moment one agent calls another. SPIFFE, mTLS, and zero-trust patterns give every agent a verifiable identity that propagates through tool calls and sub-agent spawns.

agent-identity, spiffe, mtls, zero-trust, agent-governance

May 19, 2026 40 min read

What is Semantic Firewalling? Definition, Architecture, and How It Differs from Keyword Filtering

Keyword filters catch the words you predicted. A semantic firewall catches the meaning you did not. Here is what semantic firewalling is, how it works, and where it fits.

security, semantic-firewall, agent-governance, proxy, content-filtering

May 12, 2026 33 min read

AI Agent Cost Attribution: Per-User, Per-Workflow, Per-Tenant Patterns

AI cost dashboards usually show one line per provider. The real cost story sits underneath, in the long-tail distribution across users, workflows, and tenants. This post shows how to capture it.

cost-attribution, finops, agent-governance, observability, proxy

May 5, 2026 36 min read

How to Audit AI Agent Activity for SOC 2 and EU AI Act Compliance

Most AI agent logs miss the fields auditors actually ask for. Map SOC 2 CC7.2 and EU AI Act Article 12 to specific telemetry fields, with retention policies and a minimum viable audit schema.

agent-governance, audit, compliance, SOC 2, EU AI Act, observability

April 28, 2026 32 min read

What is an AI Agent Policy Engine? Definition, Architecture, and How It Differs from Guardrails

An AI agent policy engine enforces governance rules at the infrastructure layer, intercepting and authorizing every agent action. This is what separates real governance from prompt-level guardrails.

agent-governance, policy-engine, guardrails, architecture, proxy

April 22, 2026 29 min read

MCP Security: Why Tool-Use Agents Are Your Biggest Attack Surface

Every MCP tool call is an unaudited API request. Model Context Protocol agents create the largest unmonitored attack surface in enterprise AI stacks. Here is how proxy-layer interception governs them at scale.

security, MCP, agent-governance, tool-use, proxy

April 17, 2026 19 min read

88% of Enterprises Had an AI Agent Security Incident Last Year. Most Never Saw It Coming.

82% of executives think their AI policies protect them. Only 14.4% of agents go live with full security approval. The gap is where breaches happen.

shadow-ai, agent-security, enterprise, proxy, governance, incident-response, CISO

April 15, 2026 19 min read

The EU AI Act Takes Effect in August. Here's What Your AI Infrastructure Needs to Do.

The EU AI Act's high-risk provisions take effect August 2, 2026. Penalties hit 35M EUR or 7% of global turnover. Here is what compliance looks like at the infrastructure layer.

EU AI Act, compliance, governance, proxy, audit-logging, enterprise, regulation

April 10, 2026 16 min read

Prompt Caching vs Semantic Caching: Which One Do You Actually Need?

Prompt caching saves input tokens. Semantic caching eliminates the call entirely. Here's when to use each, with real pricing and a decision framework.

prompt-caching, semantic-caching, cost-reduction, llm-caching, ai-agents

April 8, 2026 20 min read

Semantic Caching for AI Agents: What Nobody Tells You About Production

What breaks when you add semantic caching to AI agent workloads. Production data, failure modes, a decision framework, and the checklist we use.

semantic-caching, ai-agents, cost-reduction, llm-caching, production

March 22, 2026 25 min read

Trust but Verify: How to Detect Token Count Manipulation in AI API Pipelines

How to independently verify provider-reported token counts using BPE estimation, catch discrepancies before they inflate your AI bill, and build cost integrity into your pipeline.

finops, token-counting, cost-integrity, observability, ai-operations

March 22, 2026 20 min read

Why Shared Secrets Are the Biggest Security Risk in Multi-Tenant AI Infrastructure (And How to Eliminate Them)

Shared secrets in multi-tenant AI infrastructure create cascading breach risk. Learn how per-org auth, AES-256-GCM transit encryption, and zero-downtime key rotation eliminate them.

security, multi-tenant, encryption, key-management, zero-trust

March 22, 2026 23 min read

Defense in Depth: How We Protect AI Proxy Infrastructure from SSRF, DNS Rebinding, and Injection Attacks

A technical deep dive into the six security hardening layers shipping in Govyn v1.2: IPv6 SSRF protection, DNS rebinding defense, MCP header injection prevention, content filter scoping, ReDoS mitigation, and Content-Type enforcement on error responses.

ssrf, security, injection, dns-rebinding, defense-in-depth, content-filter

March 22, 2026 25 min read

How We Made AI Response Caching Tamper-Resistant: Lessons from Defending Against Cache Poisoning

Five defense layers that prevent cache poisoning in semantic AI caches: key hardening, args hash pre-filters, Zod-based response validation, granular invalidation, and observe mode for safe rollout.

caching, security, cache-poisoning, data-integrity, semantic-cache

March 4, 2026 10 min read

We Cut Our AI API Bill by 73% Without Changing a Single Line of Agent Code

How smart model routing through a proxy cut our OpenAI and Anthropic bill from $2,140/mo to $578/mo. Zero code changes. Just YAML.

cost-reduction, model-routing, proxy, openai, anthropic

March 4, 2026 11 min read

Proxy vs SDK: Why Architecture Matters for AI Agent Governance

SDK wrappers are door locks. Proxies are walls. A deep technical comparison of both governance architectures for AI agents in production.

proxy, sdk, architecture, agent-governance, security

March 4, 2026 9 min read

How Replit's Database Deletion Could Have Been Prevented in 3 Lines of YAML

The Replit AI agent deleted a production database, fabricated 4,000 fake records, then lied about it. Three lines of policy YAML would have stopped it.

ai-agent-safety, production-safety, proxy, incident-analysis, governance

February 15, 2026 13 min read

Your OpenClaw Agent Runs at 3am. What Stops It?

How the Meta email deletion incident could have been prevented with 4 lines of YAML, and why OpenClaw's built-in limits are not enough.

agent-governance, openclaw, proxy, cost-control, security