Threat Model

What agentproof helps with

  • adding a structured LLM-capability challenge before access to an API
  • requiring clients to recover intent from obfuscated text
  • raising the cost of brittle parser-style solvers with stronger prompt families
  • enforcing machine-readable responses
  • keeping private verification data on the server while exposing only the public challenge
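The flow above can be sketched as a challenge/verify round trip. This is an illustrative sketch, not agentproof's real API: the names `issue_challenge` and `verify`, and the shape of the payloads, are assumptions.

```python
import secrets


def issue_challenge() -> dict:
    """Server side: build a public instruction plus a private expected answer."""
    n = secrets.randbelow(90) + 10
    public = {
        "id": secrets.token_hex(8),
        # Public payload: an instruction the client must recover and execute.
        "prompt": f"Reply with the square of {n} as bare digits, nothing else.",
    }
    # Private payload: the exact expected result, kept server-side only.
    private = {"id": public["id"], "expected": str(n * n)}
    return {"public": public, "private": private}


def verify(private: dict, response: str) -> bool:
    """Server side: exact comparison against the stored private answer."""
    return response.strip() == private["expected"]
```

The point of the split is that only `public` ever travels to the client; `private` never leaves the server.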

What it does not prove

agentproof does not prove:

  • which model produced the response
  • whether the caller is using a specific provider
  • whether the caller used hardware-backed execution
  • whether a caller that passes is benign: a malicious but well-automated client can still solve the challenge

The correct way to think about it

agentproof answers:

Can this client recover and execute an obfuscated instruction and return the exact expected result?

It does not answer:

Is this definitely a trusted AI agent?

Deployment advice

Use agentproof as one signal in a broader system. In production, combine it with:

  • rate limiting
  • application authentication
  • server-side challenge storage or signed challenge state
  • expiration checks
  • replay protection
  • logging and abuse monitoring
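Two of the layers above, rate limiting and expiration/replay protection, can be sketched as follows. All names here (`RateLimiter`, `consume_challenge`, the store layout) are assumptions for illustration, not agentproof's API.

```python
import time


class RateLimiter:
    """Fixed-window rate limiter keyed by client id."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self._hits: dict[str, list[float]] = {}

    def allow(self, client_id: str) -> bool:
        now = time.monotonic()
        # Keep only hits inside the current window, then record this one.
        hits = [t for t in self._hits.get(client_id, []) if now - t < self.window_s]
        hits.append(now)
        self._hits[client_id] = hits
        return len(hits) <= self.limit


def consume_challenge(store: dict, challenge_id: str, now: float,
                      ttl_s: float = 300.0) -> bool:
    """Server-side expiration check plus single-use replay protection."""
    entry = store.get(challenge_id)
    if entry is None or entry.get("used"):   # unknown id or already answered
        return False
    if now - entry["issued_at"] > ttl_s:     # expired
        return False
    entry["used"] = True                     # mark consumed: one attempt only
    return True
```

A request would pass `RateLimiter.allow` and `consume_challenge` before the challenge response is even verified, so a failed verification still burns the challenge.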

Use the benchmark harness to measure one narrow question: how often simple non-LLM baseline solvers still pass your current public prompt family. Treat that as regression data, not as a full security evaluation.
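That narrow measurement can be sketched as a pass-rate loop over a cheap baseline solver. The callables `issue_challenge`, `verify`, and `baseline_solver` are hypothetical stand-ins for whatever your harness wires up.

```python
def baseline_pass_rate(issue_challenge, verify, baseline_solver,
                       trials: int = 1000) -> float:
    """Fraction of challenges a non-LLM baseline solver still passes."""
    passes = 0
    for _ in range(trials):
        c = issue_challenge()
        answer = baseline_solver(c["public"]["prompt"])
        if verify(c["private"], answer):
            passes += 1
    return passes / trials
```

Tracking this number across prompt-family changes gives you the regression signal; it says nothing about attackers who simply run an LLM.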

Why the verification is strict

The library prefers exact constraints over fuzzy scoring so that:

  • the server can explain failures clearly
  • tests stay deterministic
  • challenge behavior is stable across environments
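The exact-constraint style can be sketched as a verifier that returns a reason alongside the verdict, so failures are explainable. The function name and reason strings are illustrative, not agentproof's API.

```python
def verify_exact(expected: str, response: str) -> tuple[bool, str]:
    """Exact string match; return (ok, reason) so failures can be explained."""
    if response == expected:
        return True, "ok"
    if response.strip() == expected:
        return False, "surrounding whitespace not allowed"
    return False, f"expected {len(expected)} chars, got {len(response)}"
```

Because the check is a plain equality, it behaves identically in tests and in production, with no tunable similarity threshold to drift between environments.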

Why the public and private payload split matters

For the LLM families:

  • the public challenge should travel to the client
  • the private expected answer should not
  • verification should happen against the original in-memory challenge or its internal JSON form
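A minimal sketch of that split, assuming a challenge object with one public field and one private field; the class and field names are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    challenge_id: str
    public_prompt: str      # travels to the client
    private_expected: str   # stays on the server

    def public_json(self) -> str:
        """Serialize only the public fields for transport to the client."""
        return json.dumps({"id": self.challenge_id, "prompt": self.public_prompt})

    def verify(self, response: str) -> bool:
        """Verify against the original in-memory challenge."""
        return response == self.private_expected
```

The design point is that there is no code path that serializes `private_expected`, so the expected answer cannot leak through the transport layer by accident.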