Production Safety Policy Template

Lock down AI agents in production with strict model allowlists, rate limits, and approval requirements for sensitive operations. This policy ensures agents use only approved models, stay within safe request rates, and obtain human approval before performing high-risk actions.

What this prevents

An AI agent deployed to a customer-facing support system was accidentally configured to use GPT-4-32K instead of GPT-4o-mini. The agent began processing support tickets with the expensive model, generating needlessly long completions for simple FAQ responses. In four hours, it burned through $800 and introduced 3-second response latencies that degraded the customer experience. A production safety policy would have blocked the expensive model outright, capped the tokens allowed per request, and limited daily spend.

Policy template

Copy this into your govyn.yaml and adjust the values to match your requirements.

govyn.yaml
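agents:
  prod_agent:
    models:
      # Models this agent is allowed to use.
      allow: [gpt-4o-mini, claude-haiku-4-5-20251001]
      # Models that are explicitly blocked.
      deny: [gpt-4-32k, claude-opus-4-20250514]
    rate_limit:
      requests_per_minute: 20
      # Maximum simultaneous in-flight requests.
      concurrent: 3
    budget:
      daily: $10.00
    context:
      # Per-request token ceilings.
      max_input_tokens: 16000
      max_output_tokens: 4000
    approval:
      # Any matching condition triggers human approval.
      require_for:
        # gpt-4o is not in models.allow above; add it there if
        # approved gpt-4o requests are meant to proceed.
        - model: gpt-4o
        - estimated_cost_above: $0.50
    logging:
      replay: true
      retention_days: 90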

How it works

1. Agent request arrives at the proxy

The production agent sends its LLM request to Govyn. The proxy identifies the agent by its API key and loads the production safety policy.
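
As a rough illustration of this step, the lookup from API key to agent to policy might look like the sketch below. It is not Govyn's implementation: the API_KEYS and POLICIES tables, the key "key-prod-123", and the load_policy function are all hypothetical names invented for this example.

```python
from typing import Optional

# Hypothetical in-memory tables. A real proxy would populate these from
# govyn.yaml and a secret store; the key below is made up for illustration.
API_KEYS = {"key-prod-123": "prod_agent"}
POLICIES = {
    "prod_agent": {
        "models": {"allow": ["gpt-4o-mini", "claude-haiku-4-5-20251001"]},
        "rate_limit": {"requests_per_minute": 20, "concurrent": 3},
    }
}


def load_policy(api_key: str) -> Optional[dict]:
    """Return the policy for the agent that owns api_key, or None if unknown."""
    agent = API_KEYS.get(api_key)
    if agent is None:
        return None
    return POLICIES.get(agent)
```

Returning None for an unknown key lets the caller decide how to reject the request, rather than falling back to a default policy.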

2. Model allowlist check

Govyn verifies the requested model is in the allow list. If the agent requests a model that is denied or missing from the allow list (like gpt-4-32k), the request is rejected immediately with a clear error.
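
The check in this step can be sketched as follows. This is a minimal illustration under the assumption that a model must appear in allow and must not appear in deny; the check_model function and its error messages are hypothetical, not Govyn's actual API or output.

```python
from typing import List, Tuple

ALLOW = ["gpt-4o-mini", "claude-haiku-4-5-20251001"]
DENY = ["gpt-4-32k", "claude-opus-4-20250514"]


def check_model(model: str, allow: List[str], deny: List[str]) -> Tuple[bool, str]:
    """Return (allowed, reason). Deny entries are checked first (an assumption)."""
    if model in deny:
        return False, f"model '{model}' is explicitly denied"
    if model not in allow:
        return False, f"model '{model}' is not in the allow list"
    return True, "ok"
```

Under these assumptions, a model such as gpt-4o that appears in neither list is rejected, because the allow list is treated as exhaustive.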

3. Rate limit and concurrency check

Govyn checks whether the agent is within its requests-per-minute and concurrent request limits. Excess requests are queued or rejected to prevent overload.
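
One common way to implement a check like this is a sliding 60-second window combined with an in-flight counter. The sketch below assumes that approach and simply rejects excess requests (it does not queue them); the RateLimiter class is a hypothetical illustration, not Govyn's internal limiter. The clock is passed in explicitly so the behavior is easy to test.

```python
from collections import deque


class RateLimiter:
    """Sliding-window requests-per-minute limit plus a concurrency cap."""

    def __init__(self, requests_per_minute: int, concurrent: int):
        self.rpm = requests_per_minute
        self.concurrent = concurrent
        self.starts = deque()  # start times of requests in the last 60 s
        self.in_flight = 0

    def try_acquire(self, now: float) -> bool:
        # Drop timestamps that have left the 60-second window.
        while self.starts and now - self.starts[0] >= 60.0:
            self.starts.popleft()
        if len(self.starts) >= self.rpm or self.in_flight >= self.concurrent:
            return False
        self.starts.append(now)
        self.in_flight += 1
        return True

    def release(self) -> None:
        # Call when a request finishes so its concurrency slot frees up.
        self.in_flight = max(0, self.in_flight - 1)
```

With the template's values (20 requests per minute, 3 concurrent), a fourth simultaneous request is refused until one of the first three calls release().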

4. Approval gate for high-risk requests

If the request matches an approval trigger (e.g. estimated cost above $0.50), Govyn holds the request and notifies a human approver. The request only proceeds after explicit approval.
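
The trigger matching in this step might be sketched as below. The needs_approval function is hypothetical, and the sketch assumes the cost estimate is computed elsewhere and that "above" means strictly greater than the threshold; how Govyn estimates cost is not described here.

```python
from decimal import Decimal
from typing import Dict, List

# Mirrors approval.require_for from the template above.
TRIGGERS = [{"model": "gpt-4o"}, {"estimated_cost_above": "$0.50"}]


def needs_approval(model: str, estimated_cost: Decimal,
                   triggers: List[Dict[str, str]]) -> bool:
    """Return True if any approval trigger matches the request."""
    for trigger in triggers:
        if trigger.get("model") == model:
            return True
        if "estimated_cost_above" in trigger:
            # Strip the leading dollar sign used in the YAML value.
            threshold = Decimal(trigger["estimated_cost_above"].lstrip("$"))
            if estimated_cost > threshold:
                return True
    return False
```

Decimal is used instead of float so that a request estimated at exactly $0.50 compares cleanly against the threshold.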

5. Token limits enforced

Govyn validates that the input doesn't exceed the max_input_tokens limit, preventing accidentally expensive long-context calls.
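
A minimal sketch of this validation follows. It assumes token counts are computed by a tokenizer elsewhere and passed in as integers. The step above only describes the input check; rejecting requests whose requested output exceeds max_output_tokens is an assumption about how that setting is applied, and check_token_limits is a hypothetical name.

```python
from typing import Tuple


def check_token_limits(input_tokens: int, requested_output_tokens: int,
                       max_input_tokens: int = 16000,
                       max_output_tokens: int = 4000) -> Tuple[bool, str]:
    """Return (ok, reason) for a request's token counts."""
    if input_tokens > max_input_tokens:
        return False, f"input of {input_tokens} tokens exceeds {max_input_tokens}"
    # Assumption: the output ceiling is enforced by rejecting the request.
    if requested_output_tokens > max_output_tokens:
        return False, f"output of {requested_output_tokens} tokens exceeds {max_output_tokens}"
    return True, "ok"
```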

Configuration options

Option                   | Description                             | Example
models.allow             | Allowlist of models this agent can use  | [gpt-4o-mini, claude-haiku-4-5-20251001]
models.deny              | Denylist of models explicitly blocked   | [gpt-4-32k]
rate_limit.concurrent    | Maximum simultaneous in-flight requests | 3
context.max_input_tokens | Maximum input tokens per request        | 16000
approval.require_for     | Conditions that trigger human approval  | estimated_cost_above: $0.50

Add this policy to your config

Start Govyn with this policy in under 5 minutes. No code changes needed.
