How We Made AI Response Caching Tamper-Resistant: Lessons from Defending Against Cache Poisoning
Five defense layers that prevent cache poisoning in semantic AI caches: key hardening, args hash pre-filters, Zod-based response validation, granular invalidation, and observe mode for safe rollout.
The cached answer that came from the wrong model
A team runs two agents through the same proxy. One agent calls GPT-4o for complex multi-step reasoning. The other calls Claude Haiku for fast, cheap classification. Both agents use the same tool — search_docs — with the same query: “What is our refund policy?”
The Haiku agent runs first. Its response gets cached. Five minutes later, the GPT-4o agent sends the same query. The cache returns the Haiku response. The GPT-4o agent receives a terse, two-sentence classification answer when it expected a detailed analysis with reasoning steps. It passes this downstream as its own output.
The user sees a stripped-down answer that looks nothing like GPT-4o output. The telemetry says “cache hit.” The cost dashboard says “$0.00.” Everything looks efficient. The answer is wrong.
This is cache poisoning. Not the kind where an attacker deliberately injects malicious content — the kind where naive cache key design causes cross-contamination between legitimate requests. The cache was working exactly as designed. The design was the problem.
We built Govyn’s semantic cache in v1.0. It cut API costs by up to 73% for teams with repetitive workloads. In v1.2, we hardened it against five distinct attack vectors. This post documents what we found, what we built, and why every team running a semantic cache for AI responses needs to think about this.
The problem: why naive AI caching is dangerous
Traditional HTTP caching is well-understood. The cache key is the URL. The response is a static resource. Cache invalidation is hard, but cache correctness is simple: same URL, same response.
AI response caching breaks every assumption that makes traditional caching safe.
The responses are non-deterministic. The same prompt can produce different outputs. Two identical requests to GPT-4o may return different completions. Caching one and serving it for the other is a deliberate choice to sacrifice freshness for cost savings. That tradeoff is fine when you understand it. It becomes dangerous when the “same prompt” is not actually the same prompt.
The cache key space is ambiguous. What makes two AI requests “the same”? The model? The messages? The system prompt? The temperature? The tool definitions? The conversation history? Every field you omit from the cache key is a dimension along which two different requests can collide.
Semantic similarity adds a second attack surface. Exact-match caching is relatively safe — if the hash matches, the inputs matched. Semantic caching uses embedding similarity to match “close enough” requests. This is powerful for cost reduction but opens a new class of attacks: if you can craft a request that is semantically similar to a cached entry but structurally different, you can retrieve a response that was not generated for your request.
Multi-tenant isolation is non-obvious. In a proxy that serves multiple organizations, a cache poisoning vulnerability does not just affect one team. If cache keys do not include tenant identifiers at every layer, one org’s cached response could leak to another.
We identified three specific attack vectors in our v1.0 implementation and built five defenses to close them.
Attack vector 1: cross-model contamination
The scenario from the introduction is the simplest form of cache poisoning. Two requests with identical content but different model targets produce the same cache key. The response from Model A gets served for a request intended for Model B.
Our v1.0 extractHashInput() function computed the cache key from the last user message and the tool definitions:
```javascript
// v1.0 -- VULNERABLE
function extractHashInput(body) {
  const messages = body.messages;
  const tools = body.tools || body.functions || [];
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user") {
      return { lastUserContent: messages[i].content, tools };
    }
  }
  return { messages, tools };
}
```
The model name is not in the hash input. A GPT-4o request and a Claude Haiku request with the same last user message and the same tools produce the same cache key. The first response to be cached wins; every subsequent request, for any model, gets that same response.
This is not a theoretical concern. Any team using smart model routing — where a proxy transparently routes requests to different models based on complexity — generates exactly this pattern. Short requests go to a mini model. Long requests go to a premium model. If both share a cache, cross-contamination is guaranteed.
Impact: Wrong model’s output served silently. No error. No warning. Quality degrades unpredictably. Downstream agents make decisions based on responses that came from a model with different capabilities.
Attack vector 2: context-stripping
Two conversations about the same topic, but with different histories, produce different responses. A customer support agent that has been discussing a billing dispute for ten turns will give a different answer to “What should I do next?” than an agent that just started a conversation.
Our v1.0 implementation extracted only the last user message for the cache key. The preceding conversation history — system prompts, assistant responses, user clarifications — was discarded.
Conversation A:

```
system: "You are a billing support agent"
user: "I was charged twice"
assistant: "I see the duplicate charge. Let me process a refund."
user: "What should I do next?"          <-- only this goes into the cache key
```

Conversation B:

```
system: "You are a fraud detection agent"
user: "Flag suspicious transactions"
assistant: "I'll analyze recent activity."
user: "What should I do next?"          <-- same last message = same cache key
```
Both conversations produce the same cache key. The billing agent’s response gets served for the fraud detection query. The fraud agent receives “Please wait 3-5 business days for your refund to process” instead of a fraud analysis.
This vector is more subtle than cross-model contamination because it requires understanding why conversation history matters for correctness. A naive analysis might conclude that “What should I do next?” has the same semantic meaning in both contexts. It does not. The meaning is entirely dependent on what came before.
Impact: Responses served out of context. Especially dangerous in multi-turn conversations where the meaning of the last message depends entirely on prior context. Agents receive instructions that are contextually wrong but syntactically plausible.
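A hedged sketch of the same failure: under v1.0-style extraction, the two conversations above collapse to an identical hash input, because only the last user message survives:

```javascript
// Illustration only: v1.0-style extraction keeps just the last user message
function lastUserContent(messages) {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user") return messages[i].content;
  }
  return null;
}

const billing = [
  { role: "system", content: "You are a billing support agent" },
  { role: "user", content: "I was charged twice" },
  { role: "assistant", content: "I see the duplicate charge. Let me process a refund." },
  { role: "user", content: "What should I do next?" },
];

const fraud = [
  { role: "system", content: "You are a fraud detection agent" },
  { role: "user", content: "Flag suspicious transactions" },
  { role: "assistant", content: "I'll analyze recent activity." },
  { role: "user", content: "What should I do next?" },
];

// Different conversations, identical hash input -> identical cache key
console.log(lastUserContent(billing) === lastUserContent(fraud)); // true
```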
Attack vector 3: similarity manipulation
This is the vector unique to semantic caching. Exact-match caching is immune — if the hash does not match byte-for-byte, there is no hit. Semantic caching uses embedding vectors to find “close enough” matches. If two requests have a cosine similarity above the threshold (typically 0.92-0.98), the cached response is served.
An attacker — or an innocent user with a slightly different tool call — can craft a request that is semantically similar to a cached entry but structurally different:
```javascript
// Cached request (legit):
{ "tool": "transfer_funds", "args": { "to": "alice", "amount": 100 } }

// Attacker's request:
{ "tool": "transfer_funds", "args": { "to": "eve", "amount": 100 } }
```
The tool name is identical. The argument structure is identical. The embedding of “transfer 100 to alice” is close to “transfer 100 to eve” because the semantic content is similar — both describe a fund transfer of the same amount. If the similarity threshold is not strict enough, the cached response for Alice’s transfer gets served for Eve’s transfer.
In the context of AI agents that execute tool calls, this means one agent could receive a cached confirmation of a tool invocation that was never executed with its specific arguments. The downstream effect depends on what the agent does with that response, but the worst case is that the agent believes an action was completed when it was not, or believes it was completed with different parameters.
Impact: Semantic similarity matching returns responses for requests with different structured arguments. Especially dangerous for tool calls where argument values (not just structure) determine correctness.
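To see why a similarity threshold alone cannot distinguish these requests, here is a toy sketch. The vectors are hand-made stand-ins for real embeddings, not output from any actual embedding model:

```javascript
// Toy illustration of a similarity-threshold check.
// Real embeddings come from an embedding model; these vectors are invented.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const THRESHOLD = 0.95;

// Two "embeddings" that are nearly parallel, like the alice vs eve transfers
const cachedVec = [0.9, 0.1, 0.4];
const incomingVec = [0.89, 0.12, 0.41];

const sim = cosineSimilarity(cachedVec, incomingVec);
console.log(sim > THRESHOLD); // true -- similarity alone would serve the wrong entry
```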
Defense 1: cache key composition — model name and full message history
The fix for attack vectors 1 and 2 is straightforward. Include the model name and the full message history in the cache key hash input.
```typescript
// v1.2 -- HARDENED
function extractHashInput(body: Record<string, unknown>): Record<string, unknown> {
  const model = typeof body["model"] === "string" ? body["model"] : "";
  const messages = Array.isArray(body["messages"]) ? body["messages"] : [];
  const tools = body["tools"] ?? body["functions"] ?? [];
  return { model, messages, tools };
}
```
Three changes from v1.0:
- **Model name included.** `model` is now part of the hash input. A GPT-4o request and a Claude Haiku request with identical messages produce different cache keys. Cross-model contamination eliminated.
- **Full message history included.** The entire `messages` array goes into the hash, not just the last user message. Two conversations with different histories but the same last message produce different cache keys. Context-stripping eliminated.
- **No `lastUserContent` extraction.** The v1.0 loop that walked backward through messages looking for the last user message is removed entirely. No partial extraction. The full conversation is the cache key.
The cache key format stays the same: cache:{orgId}:{sha256hex}. The SHA-256 input changes. Every field that affects the LLM’s response is now part of the hash.
Tradeoff acknowledged: Including the full message history reduces exact-match hit rates for multi-turn conversations. Two conversations that diverge by a single system prompt message will never produce an exact cache hit. This is intentional. The semantic search path compensates — requests that are similar but not identical can still match through embedding similarity, subject to the args hash pre-filter described below.
This change is deployed as a clean slate. Old cache entries computed with the v1.0 hash formula will never match the v1.2 hash for the same request. We flush all caches on deploy. No dual-read. No backward compatibility. The old entries are dead weight.
Both the Cloudflare Worker (plain JavaScript, Web Crypto API) and the Express proxy (Node.js, node:crypto) implementations are updated in lockstep. Both must produce identical cache keys for identical inputs. Divergence between the two codebases would create its own class of cache bugs — entries written by one path invisible to the other.
Defense 2: args hash pre-filter for semantic similarity
Defense 1 hardens the exact-match cache path. Defense 2 hardens the semantic search path against attack vector 3 (similarity manipulation).
The idea: before comparing embedding similarity, require an exact match on a hash of the tool arguments. This filters out semantically similar requests that have different structured arguments.
```javascript
// Compute a truncated SHA-256 of the tool arguments
async function computeArgsHash(args) {
  const canonical = stableStringify(args);
  const encoder = new TextEncoder();
  const hashBuffer = await crypto.subtle.digest("SHA-256", encoder.encode(canonical));
  const hashArray = Array.from(new Uint8Array(hashBuffer));
  return hashArray.map(b => b.toString(16).padStart(2, "0")).join("").slice(0, 16);
}
```
The args hash is stored as Vectorize metadata alongside each cached vector:
```javascript
// When indexing a new cache entry in Vectorize
metadata: {
  toolName: toolName || "unknown",
  argsHash: argsHash, // NEW in v1.2
  policyId: policyId,
  cacheKey: cacheKey,
  expiresAt: Date.now() + (ttlSeconds * 1000),
  createdAt: Date.now(),
}
```
On semantic search, the args hash is added to the Vectorize filter:
```javascript
// When searching for semantic matches
const results = await env.GOVYN_VECTORS.query(embeddingVector, {
  topK: 3,
  namespace: orgId,
  filter: {
    toolName: toolName,
    argsHash: argsHash, // NEW in v1.2
  },
  returnMetadata: "all",
});
```
This is a pre-filter, not a replacement for similarity matching. The query still computes cosine similarity. But candidates that have a different args hash are excluded before similarity is evaluated. “Transfer 100 to Alice” and “transfer 100 to Eve” have different args hashes. Even if their embeddings are 0.99 similar, the Vectorize query will never return one as a match for the other.
Why a hash and not the raw arguments? Three reasons:
- **Vectorize metadata values have size limits.** A hash is 16 characters. Tool arguments can be arbitrarily large JSON objects.
- **Metadata filtering is equality-based.** Vectorize filters support exact match, not substring or partial match. A hash is the right data type for equality comparison.
- **Canonical serialization is already solved.** `stableStringify()` produces deterministic JSON regardless of key ordering. The hash of canonicalized arguments is deterministic across requests.
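`stableStringify()` itself is not shown in this post. A minimal sketch, assuming plain JSON values with no cycles and no `undefined`, could look like:

```javascript
// Minimal stableStringify sketch: sort object keys recursively so that
// semantically identical arguments always serialize to the same string.
// Assumes plain JSON values (no cycles, no undefined).
function stableStringify(value) {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map(stableStringify).join(",") + "]";
  }
  const keys = Object.keys(value).sort();
  const entries = keys.map(k => JSON.stringify(k) + ":" + stableStringify(value[k]));
  return "{" + entries.join(",") + "}";
}

// Key order no longer matters:
console.log(
  stableStringify({ to: "alice", amount: 100 }) ===
  stableStringify({ amount: 100, to: "alice" })
); // true
```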
The two-gate architecture: Every semantic cache lookup now passes through two gates. Gate 1: Vectorize metadata filter (toolName + argsHash). Gate 2: embedding cosine similarity threshold. A cache hit requires passing both. This makes similarity manipulation attacks require matching both the exact tool arguments AND the semantic content — which means the requests are genuinely identical, not just similar.
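A sketch of the two gates over an in-memory candidate list. Vectorize applies gate 1 server-side; similarity is precomputed per candidate here purely for brevity:

```javascript
// Sketch of the two-gate lookup.
// Gate 1: exact metadata match (toolName + argsHash).
// Gate 2: cosine-similarity threshold.
function findSemanticHit(candidates, query, threshold) {
  return candidates
    .filter(c => c.toolName === query.toolName && c.argsHash === query.argsHash) // gate 1
    .find(c => c.similarity >= threshold) || null;                               // gate 2
}

const candidates = [
  { toolName: "transfer_funds", argsHash: "aaaa", similarity: 0.99, cacheKey: "k1" },
];

// Same tool, different args hash: gate 1 rejects despite 0.99 similarity
console.log(findSemanticHit(candidates, { toolName: "transfer_funds", argsHash: "bbbb" }, 0.95)); // null

// Matching args hash passes both gates
console.log(findSemanticHit(candidates, { toolName: "transfer_funds", argsHash: "aaaa" }, 0.95).cacheKey); // k1
```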
Defense 3: response validation with self-healing eviction
Defenses 1 and 2 prevent incorrect cache writes. Defense 3 handles the case where a corrupted or malformed response already exists in the cache. This could happen through a bug, a partial write, a storage corruption, or a deliberate tampering attempt.
Before v1.2, the cache read path was: fetch entry, parse JSON, serve to client. If the cached JSON was malformed — missing required fields, wrong structure, truncated — the client received garbage. No validation. No fallback.
v1.2 adds Zod-based structural validation on every cache hit, before serving:
```typescript
import { z } from "zod";

const openAiResponseSchema = z.object({
  id: z.string(),
  object: z.string(),
  model: z.string(),
  choices: z.array(z.object({
    index: z.number(),
    message: z.object({
      role: z.string(),
      content: z.union([z.string(), z.null()]),
    }).passthrough(),
    finish_reason: z.union([z.string(), z.null()]),
  }).passthrough()).min(1),
}).passthrough();

const anthropicResponseSchema = z.object({
  id: z.string(),
  type: z.literal("message"),
  role: z.string(),
  model: z.string(),
  content: z.array(z.object({
    type: z.string(),
  }).passthrough()),
  stop_reason: z.union([z.string(), z.null()]),
}).passthrough();

export function validateCachedResponse(
  responseBody: string,
  format: ApiFormat,
): { valid: true } | { valid: false; error: string } {
  try {
    const parsed = JSON.parse(responseBody);
    const schema = format === "openai"
      ? openAiResponseSchema
      : anthropicResponseSchema;
    schema.parse(parsed);
    return { valid: true };
  } catch (e) {
    const message = e instanceof Error ? e.message : "Unknown validation error";
    return { valid: false, error: message };
  }
}
```
Design choices
.passthrough() on every schema. Provider APIs evolve. Responses include optional fields, extensions, beta features. A strict schema that rejects unknown fields would break every time OpenAI or Anthropic adds a new response field. The validation checks structural correctness — required fields exist, types are correct, arrays are non-empty — and ignores everything else.
content: z.union([z.string(), z.null()]) for OpenAI. Tool call responses have null content and a tool_calls array instead. Rejecting null content would invalidate legitimate cached tool call responses.
finish_reason: z.union([z.string(), z.null()]) for both providers. Some streaming responses have null finish reasons in intermediate chunks that get cached.
Self-healing behavior
When validation fails, the cache does not return an error to the client. It silently evicts the corrupted entry, logs the event at high severity, and falls through to the upstream provider as a normal cache miss:
```typescript
if (cached) {
  const validation = validateCachedResponse(cached.responseBody, cached.format);
  if (!validation.valid) {
    // Self-healing: evict, log, treat as cache miss
    console.error(
      `[CACHE-SECURITY] Corrupted cache entry evicted: ` +
      `key=${cacheKey} format=${cached.format} error=${validation.error}`
    );
    await cacheStore.delete(cacheKey);
    cacheStatus = "MISS";
    // Fall through to upstream -- user gets a fresh response
  } else {
    // Serve cached response normally
    res.status(200).json(JSON.parse(cached.responseBody));
    return;
  }
}
```
The end user never sees a cache validation error. They get a slightly slower response (cache miss instead of hit) and a fresh result from the provider. The operator gets a [CACHE-SECURITY] log entry that indicates potential cache poisoning, which can trigger alerting.
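The eviction flow can be illustrated against an in-memory store; `validate()` below is a deliberately simplified stand-in for the real Zod/type-guard validator:

```javascript
// Sketch of self-healing eviction against an in-memory store.
const cacheStore = new Map([
  ["cache:org_1:good", '{"id":"r1","choices":[{"message":{"role":"assistant","content":"hi"}}]}'],
  ["cache:org_1:bad", '{"id":"r2"'], // truncated write -- invalid JSON
]);

// Stand-in validator: parseable JSON object with a choices array
function validate(body) {
  try {
    const parsed = JSON.parse(body);
    return typeof parsed === "object" && parsed !== null && Array.isArray(parsed.choices);
  } catch {
    return false;
  }
}

function readCache(key) {
  const cached = cacheStore.get(key);
  if (cached === undefined) return { status: "MISS" };
  if (!validate(cached)) {
    cacheStore.delete(key); // self-healing: evict and fall through to upstream
    return { status: "MISS" };
  }
  return { status: "HIT", body: cached };
}

console.log(readCache("cache:org_1:bad").status);  // MISS -- evicted
console.log(cacheStore.has("cache:org_1:bad"));    // false
console.log(readCache("cache:org_1:good").status); // HIT
```

One corrupted entry costs exactly one cache miss; subsequent reads of that key go straight to upstream.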
Cloudflare Worker implementation note: The Worker runs in the Cloudflare runtime and cannot use Zod (no npm dependencies). The Worker’s validateCachedResponse uses equivalent manual type guards:
```javascript
function validateCachedResponse(responseBody, format) {
  try {
    const parsed = JSON.parse(responseBody);
    if (typeof parsed !== "object" || parsed === null) {
      return { valid: false, error: "Response is not an object" };
    }
    if (format === "openai") {
      if (typeof parsed.id !== "string") return { valid: false, error: "Missing id" };
      if (!Array.isArray(parsed.choices) || parsed.choices.length === 0) {
        return { valid: false, error: "Missing or empty choices" };
      }
      if (!parsed.choices[0].message || typeof parsed.choices[0].message !== "object") {
        return { valid: false, error: "Missing message in first choice" };
      }
    } else {
      if (typeof parsed.id !== "string") return { valid: false, error: "Missing id" };
      if (parsed.type !== "message") return { valid: false, error: "Invalid type" };
      if (!Array.isArray(parsed.content)) return { valid: false, error: "Missing content" };
    }
    return { valid: true };
  } catch (e) {
    return { valid: false, error: e.message || "JSON parse error" };
  }
}
```
Both implementations validate the same structural envelope. Both evict on failure. Both fall through to upstream. The user experience is identical regardless of whether the request came through the Express proxy or the Cloudflare Worker.
Defense 4: granular cache invalidation
Defenses 1-3 are preventive. Defense 4 is reactive: when something goes wrong, operators need the ability to surgically remove compromised entries without flushing the entire cache.
v1.2 adds four invalidation endpoints:
| Endpoint | Scope | Use Case |
|---|---|---|
| `DELETE /api/v1/cache` | Entire org | Incident response, deploy-time flush |
| `DELETE /api/v1/cache/keys/:cacheKey` | Single entry | Known poisoned entry |
| `DELETE /api/v1/cache/tools/:toolName` | All entries for a tool | Compromised tool definition |
| `DELETE /api/v1/cache/agents/:agentId` | All entries for an agent | Agent-specific contamination |
All endpoints require MANAGER role. Zod validates path parameters with length limits.
Cascade architecture
Cache entries exist in three stores: Prisma (PostgreSQL), Cloudflare KV (key-value), and Cloudflare Vectorize (vector embeddings). Invalidation must cascade to all three. Deleting the Prisma entry but leaving the KV entry means the Worker still serves the old response. Deleting KV but leaving the vector means semantic search still returns the old cache key.
The cascade works through a Worker internal endpoint:
```
API (cache.service.ts)
  |
  |--> Prisma deleteMany (immediate)
  |
  |--> POST /internal/flush-cache to Worker
        |
        |--> KV list(prefix) + delete()
        |--> Vectorize deleteByIds()
```
The Worker endpoint is secured with a shared internal secret (X-Govyn-Internal header). The API calls it after completing the Prisma deletion. The Worker handles KV list-and-delete with cursor pagination (KV returns max 1,000 keys per page) and Vectorize bulk deletion by ID.
The cascade is fire-and-forget from the API’s perspective. If the Worker flush fails (network issue, Worker downtime), the Prisma entries are still deleted. KV entries have TTL and will expire naturally. The next cache read for a deleted Prisma entry will be a miss, even if the KV entry still exists, because the Express proxy checks Prisma first.
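The KV side of the cascade can be sketched with a mock store. The mock's `list()` returns at most `pageSize` keys per call plus a completion flag, mirroring the page limit; the real Worker uses Cloudflare's KV bindings, not this mock:

```javascript
// Mock of a KV-like store with paginated listing.
function makeMockKv(keys, pageSize) {
  const store = new Set(keys);
  return {
    list({ prefix }) {
      const matching = [...store].filter(k => k.startsWith(prefix));
      const page = matching.slice(0, pageSize);
      return { keys: page, list_complete: page.length >= matching.length };
    },
    delete(key) { store.delete(key); },
    size() { return store.size; },
  };
}

// Delete every key under a prefix, one page at a time.
function flushPrefix(kv, prefix) {
  let deleted = 0;
  while (true) {
    // Re-list from the start each pass: deleted keys shrink the result set
    const page = kv.list({ prefix });
    for (const key of page.keys) { kv.delete(key); deleted++; }
    if (page.list_complete || page.keys.length === 0) break;
  }
  return deleted;
}

const kv = makeMockKv(["cache:org_1:a", "cache:org_1:b", "cache:org_1:c", "cache:org_2:z"], 2);
console.log(flushPrefix(kv, "cache:org_1:")); // 3
console.log(kv.size()); // 1 -- org_2's entry survives
```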
Deploy-time flush
When the cache key formula changes (as it did in v1.2 with the model + full messages update), all existing cache entries are invalid. Old entries will never match the new hash for the same request — they are dead weight consuming storage.
A deploy-time flush script truncates all cache entries:
```bash
npx tsx scripts/flush-cache-on-deploy.ts
```
This deletes all cache_entries rows in Prisma across all organizations. KV entries expire via TTL. Vectorize entries become unreachable under the new key formula. Clean slate, as designed.
Dashboard integration
The dashboard Settings page gets a “Flush Org Cache” button with a confirmation dialog. One click, all cached responses for the organization are deleted. No per-key or per-tool UI — the API supports granular invalidation for programmatic use, but the dashboard provides the panic button for admins who need to flush everything immediately.
Defense 5: observe mode for safe cache rollout
The scariest moment in deploying a new cache policy is the first hour. Did the similarity threshold catch edge cases? Is the args hash pre-filter working? Are there response validation false positives evicting legitimate entries?
Observe mode lets you deploy a cache policy in production without enforcing it. The cache policy is evaluated — similarity is computed, thresholds are checked, validation runs — but the result is logged, not enforced. Every request goes to upstream as if caching were disabled. The telemetry records what would have happened.
The implementation lives in the control plane’s authorize response:
```javascript
// In the authorize response body
{
  decision: "allow",
  observeMode: true,
  observeUntil: "2026-03-29T00:00:00.000Z", // auto-expires
  // ... other fields
}
```
When `observeMode` is true:

- Block decisions log as `OBSERVED` instead of `BLOCKED`. The request passes through.
- Queue decisions (approval workflows) log as `OBSERVED`. No approval item is created. The request passes through.
- Cache hits are logged but the response still comes from upstream. You can compare cached vs fresh responses in your telemetry.
Observe mode has an optional expiry (observeUntil). Set it to a date 24-48 hours in the future. If you forget to turn it off, it turns itself off. Expired observe mode is treated as inactive, even if the observeMode flag is still true on the organization record.
```typescript
const isObserving = org.observeMode === true &&
  (org.observeUntil == null || new Date(org.observeUntil) > new Date());
```
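The same expiry check, extracted into a testable sketch (the `now` parameter is added here to make the behavior easy to pin down):

```javascript
// Sketch: observe mode auto-expiry check.
// Expired observe mode counts as inactive even if the flag is still true.
function isObserving(org, now = new Date()) {
  return org.observeMode === true &&
    (org.observeUntil == null || new Date(org.observeUntil) > now);
}

const now = new Date("2026-03-30T00:00:00.000Z");
console.log(isObserving({ observeMode: true, observeUntil: null }, now));                       // true
console.log(isObserving({ observeMode: true, observeUntil: "2026-03-29T00:00:00.000Z" }, now)); // false -- expired
console.log(isObserving({ observeMode: false, observeUntil: null }, now));                      // false
```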
The Worker reads these fields from the authorize response and skips policy enforcement when observe mode is active. Since the control plane returns a 200 “allow” decision regardless of the actual policy evaluation result, the Worker treats observed requests as normal allowed requests. No special Worker-side logic is needed beyond reading the flag for telemetry tagging.
Deployment workflow:

- Enable observe mode on the organization
- Deploy the new cache policy
- Monitor telemetry for `OBSERVED` entries for 24-48 hours
- Check for unexpected cache validation failures (false positives)
- Compare cache hit rates and response quality
- Disable observe mode to activate enforcement
This workflow applies to any policy change, not just caching. Observe mode is a general-purpose safety net for rolling out governance changes in production.
Request flow through the cache layers
Here is how a request flows through the v1.2 cache system, from agent to response:
```
Agent sends request to proxy
  |
  v
[1] Authorize (control plane)
  |--> Check observe mode (active? bypass block/queue)
  |--> Return cache config (policy, TTL, similarity threshold)
  |
  v
[2] Compute cache key
  |--> extractHashInput: { model, messages, tools }
  |--> SHA-256 hash of stableStringify(hashInput)
  |--> Key format: cache:{orgId}:{sha256hex}
  |
  v
[3] Exact match lookup (KV or Prisma)
  |--> Key found?
  |     YES --> [4] Validate response (Zod / type guards)
  |              |--> Valid? Serve cached response (HIT)
  |              |--> Invalid? Evict entry, log [CACHE-SECURITY], continue to [5]
  |     NO --> [5]
  |
  v
[5] Semantic similarity search (Vectorize)
  |--> Compute args hash of tool arguments
  |--> Compute embedding vector of request content
  |--> Query Vectorize: filter by toolName + argsHash, rank by cosine similarity
  |--> Match above threshold?
  |     YES --> [6] Validate response (same as step 4)
  |              |--> Valid? Serve cached response (SEMANTIC_HIT)
  |              |--> Invalid? Evict, log, continue to [7]
  |     NO --> [7]
  |
  v
[7] Forward to upstream LLM provider
  |--> Get fresh response
  |--> Cache the response (write to KV/Prisma + index vector in Vectorize)
  |--> Serve fresh response (MISS)
```
Every cache hit — exact or semantic — passes through response validation before reaching the client. Every semantic match passes through the args hash pre-filter before similarity comparison. Every cache key includes the model name and full conversation history. The system is defense-in-depth: each layer catches what the previous layer missed.
Before and after
| Property | v1.0 | v1.2 |
|---|---|---|
| Cache key input | Last user message + tools | Model + full messages + tools |
| Cross-model isolation | None | Model name in hash prevents contamination |
| Conversation context | Last message only | Full history prevents context-stripping |
| Semantic pre-filter | Tool name only | Tool name + args hash (SHA-256) |
| Response validation | None | Zod schemas (Express) / type guards (Worker) |
| Corrupted entry handling | Served to client | Self-healing eviction, transparent cache miss |
| Cache invalidation | None | By key, tool, agent, or full org flush |
| Invalidation cascade | N/A | Prisma + KV + Vectorize in one operation |
| Safe rollout mechanism | None | Observe mode with auto-expiry |
| Security logging | None | [CACHE-SECURITY] events for corrupted entries |
Key takeaways
- **AI cache keys must include every field that affects the response.** Model name, full conversation history, tool definitions. Omitting any dimension creates a collision vector. The cost of a slightly larger hash input is negligible compared to the cost of serving wrong responses.
- **Semantic caching needs a structural pre-filter.** Embedding similarity alone is not sufficient for correctness. The args hash pre-filter ensures that semantically similar requests with different structured arguments never collide. Two gates (exact args match + semantic similarity) are stronger than one.
- **Validate before serving, always.** If the cached entry is corrupted, serve fresh from upstream. The user sees a cache miss, not an error. Self-healing eviction means the corrupted entry is cleaned up automatically. One bad entry causes one cache miss, not a persistent failure.
- **Invalidation must cascade across all storage layers.** A semantic cache with multiple storage backends (database, key-value, vector store) must delete from all of them atomically. Deleting from one but not the others creates ghost entries that serve stale or poisoned responses.
- **Observe mode before enforce mode.** Deploy any cache policy change in observe mode first. Monitor telemetry. Check for false positives. Then enforce. This is not optional caution; it is the difference between a smooth rollout and a production incident.
FAQ
Does the args hash pre-filter reduce semantic cache hit rates?
Yes, intentionally. The pre-filter requires that the tool arguments hash exactly before semantic similarity is even evaluated. This means two requests with the same tool but different argument values will never match, even if their embeddings are nearly identical. The hit rate reduction is the security guarantee. You are trading some cache efficiency for correctness — specifically, you are ensuring that a cached response for transfer_funds(to: "alice") is never served for transfer_funds(to: "eve").
Can an attacker still poison the cache if they control an agent?
If the attacker controls an agent with valid credentials, they can write entries to the cache through normal usage. The defenses in v1.2 prevent those entries from being served for requests with different models, different conversation contexts, or different tool arguments. The attacker’s cached entries only match requests that are structurally identical to what the attacker sent. At that point, the “poisoned” response is indistinguishable from a legitimate cached response for the same request, because it is one. The granular invalidation API allows operators to flush entries by agent if an agent is compromised.
How does observe mode interact with the proxy vs SDK architecture?
Observe mode is enforced at the proxy layer, not the SDK layer. The proxy’s control plane evaluates the policy and decides whether to enforce or observe. The agent never knows observe mode exists. It sends a request, gets a response. Whether that response came from cache, from upstream, or was evaluated-but-not-cached in observe mode is transparent to the agent. This is another advantage of the proxy architecture: governance changes (including cache policy changes) do not require agent modifications, restarts, or library upgrades.
What happens if the Vectorize metadata index for argsHash is not created?
If the Vectorize index does not include argsHash as a filterable field, the filter parameter is silently ignored. Semantic search returns matches based only on toolName and cosine similarity, which is the v1.0 behavior. This is a degraded-security state, not a failure. The exact-match cache path still uses the hardened key (model + full messages), so the primary defense is intact. The semantic path reverts to toolName-only filtering until the metadata index is created. This is documented in our deployment checklist.
Can I use these defenses without the rest of Govyn?
The cache key hardening pattern (include model + full messages) and the response validation pattern (Zod schema before serving) are applicable to any AI response cache regardless of architecture. The args hash pre-filter requires a vector database that supports metadata filtering (Cloudflare Vectorize, Pinecone, Weaviate). Observe mode requires a policy evaluation layer with telemetry. If you are building your own cache, apply patterns 1 and 3 immediately. They are the highest-value, lowest-effort defenses.
Govyn is an open-source API proxy for AI agent governance. Tamper-resistant caching ships in v1.2. MIT licensed. Self-host or cloud-hosted.