Blog
Insights on AI agent governance and proxy architecture.
AI Agent Observability: What to Trace, Measure, and Alert On
Datadog will tell you an agent's HTTP calls are healthy even as it burns $800 in tokens on a 47-step retry loop. Classic APM has no model for non-determinism, semantic errors, or prompt-driven flow control. This post maps the signal model that actually works.
Why AI Agent Networks Need an Identity Layer (SPIFFE, mTLS, Zero-Trust)
API keys break the moment one agent calls another. SPIFFE, mTLS, and zero-trust patterns give every agent a verifiable identity that propagates through tool calls and sub-agent spawns.
What is Semantic Firewalling? Definition, Architecture, and How It Differs from Keyword Filtering
Keyword filters catch the words you predicted. A semantic firewall catches the meaning you did not. Here is what semantic firewalling is, how it works, and where it fits.
AI Agent Cost Attribution: Per-User, Per-Workflow, Per-Tenant Patterns
AI cost dashboards usually show one line per provider. The real cost story sits underneath, in the long-tail distribution across users, workflows, and tenants. This post shows how to capture it.
How to Audit AI Agent Activity for SOC 2 and EU AI Act Compliance
Most AI agent logs miss the fields auditors actually ask for. Map SOC 2 CC7.2 and EU AI Act Article 12 to specific telemetry fields, with retention policies and a minimum viable audit schema.
What is an AI Agent Policy Engine? Definition, Architecture, and How It Differs from Guardrails
An AI agent policy engine enforces governance rules at the infrastructure layer, intercepting and authorizing every agent action. This is what separates real governance from prompt-level guardrails.
MCP Security: Why Tool-Use Agents Are Your Biggest Attack Surface
Every MCP tool call is an unaudited API request. Model Context Protocol agents create the largest unmonitored attack surface in enterprise AI stacks. Here is how proxy-layer interception governs them at scale.
88% of Enterprises Had an AI Agent Security Incident Last Year. Most Never Saw It Coming.
82% of executives think their AI policies protect them. Only 14.4% of agents go live with full security approval. The gap is where breaches happen.
The EU AI Act Takes Effect in August. Here's What Your AI Infrastructure Needs to Do.
The EU AI Act's high-risk provisions take effect August 2, 2026. Penalties hit 35M EUR or 7% of global turnover. Here is what compliance looks like at the infrastructure layer.
Prompt Caching vs Semantic Caching: Which One Do You Actually Need?
Prompt caching saves input tokens. Semantic caching eliminates the call entirely. Here's when to use each, with real pricing and a decision framework.
Semantic Caching for AI Agents: What Nobody Tells You About Production
What breaks when you add semantic caching to AI agent workloads. Production data, failure modes, a decision framework, and the checklist we use.
Trust but Verify: How to Detect Token Count Manipulation in AI API Pipelines
How to independently verify provider-reported token counts using BPE estimation, catch discrepancies before they inflate your AI bill, and build cost integrity into your pipeline.
Defense in Depth: How We Protect AI Proxy Infrastructure from SSRF, DNS Rebinding, and Injection Attacks
A technical deep dive into the six security hardening layers shipping in Govyn v1.2: IPv6 SSRF protection, DNS rebinding defense, MCP header injection prevention, content filter scoping, ReDoS mitigation, and Content-Type enforcement on error responses.
Why Shared Secrets Are the Biggest Security Risk in Multi-Tenant AI Infrastructure (And How to Eliminate Them)
Shared secrets in multi-tenant AI infrastructure create cascading breach risk. Learn how per-org auth, AES-256-GCM transit encryption, and zero-downtime key rotation eliminate them.
How We Made AI Response Caching Tamper-Resistant: Lessons from Defending Against Cache Poisoning
Five defense layers that prevent cache poisoning in semantic AI caches: key hardening, args hash pre-filters, Zod-based response validation, granular invalidation, and observe mode for safe rollout.
How Replit's Database Deletion Could Have Been Prevented in 3 Lines of YAML
The Replit AI agent deleted a production database, fabricated 4,000 records, then lied about it. Three lines of policy YAML would have stopped it.
We Cut Our AI API Bill by 73% Without Changing a Single Line of Agent Code
How smart model routing through a proxy cut our OpenAI and Anthropic bill from $2,140/mo to $578/mo. Zero code changes. Just YAML.
Proxy vs SDK: Why Architecture Matters for AI Agent Governance
SDK wrappers are door locks. Proxies are walls. A deep technical comparison of both governance architectures for AI agents in production.
Your OpenClaw Agent Runs at 3am. What Stops It?
How the Meta email deletion incident could have been prevented with 4 lines of YAML, and why OpenClaw's built-in limits are not enough.