# agentproof

LLM-capability CAPTCHA for obfuscated language challenges. Issue a public challenge, keep the private verification copy server-side, and check whether a client can recover and execute an obfuscated instruction.
## What it actually does

Traditional CAPTCHA asks whether the client is human. agentproof asks a narrower question: can this client recover and execute an obfuscated instruction in an LLM-like way?

That makes it useful for:

- **LLM-first endpoints**: add a capability gate before exposing agent-focused routes.
- **Reverse CAPTCHA experiments**: favor clients that can decode noisy instructions and answer in exact JSON.
- **Composable verification**: combine challenge-response checks with auth, replay protection, and rate limits.
## Flow

1. Your service generates a challenge and keeps the internal verification copy.
2. The client receives only the public challenge JSON with the obfuscated prompt.
3. Your service checks the returned JSON answer against the private expected result.
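The same three steps can be sketched with the standard library alone. Everything below is an illustrative stand-in for the pattern (issue public, keep private, verify exact), not agentproof's API:

```python
# Stdlib-only sketch of the issue/serve/verify flow. Names and the expiry
# window are illustrative, not agentproof's actual interface.
import secrets
import time

CHALLENGES = {}  # challenge_id -> private record, never sent to the client

def issue_challenge():
    challenge_id = secrets.token_hex(8)
    CHALLENGES[challenge_id] = {
        "expected": "EMBER-HARBOR-SIGNAL",       # stand-in for private_data
        "expires_at": time.time() + 120,
    }
    # Only the public half leaves the server.
    return {"challenge_id": challenge_id, "prompt": "<obfuscated prompt>"}

def verify(challenge_id, payload):
    record = CHALLENGES.pop(challenge_id, None)  # pop: one attempt, no replay
    if record is None or time.time() > record["expires_at"]:
        return {"ok": False, "reason": "unknown_or_expired"}
    if payload.get("answer") != record["expected"]:
        return {"ok": False, "reason": "wrong_answer"}
    return {"ok": True, "reason": "ok"}

public = issue_challenge()
result = verify(public["challenge_id"], {"answer": "EMBER-HARBOR-SIGNAL"})
print(result["ok"])  # True
```

Popping the record on first use gives replay protection for free; a second submission with the same `challenge_id` fails with `unknown_or_expired`.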
## Smallest working example

```python
from agentproof import AgentResponse, ChallengeSpec, generate_challenge, verify_response

challenge = generate_challenge(
    ChallengeSpec(
        challenge_type="obfuscated_text_lock",
        difficulty=2,
        options={"template": "amber_sort"},
    )
)

# A real client solves the public prompt; here we answer straight from the
# server-side private data to demonstrate the round trip.
response = AgentResponse(
    challenge_id=challenge.challenge_id,
    challenge_type=challenge.challenge_type,
    payload={"answer": str(challenge.private_data["expected_answer"])},
)

result = verify_response(challenge, response)
assert result.ok
```
When you need a stronger language-recovery task, generate `multi_pass_lock` instead. It keeps the same verification model but adds multiple rule and transformation stages.
## Real public challenge and response

```json
{
  "challenge_id": "bb28567e201b35aa",
  "challenge_type": "obfuscated_text_lock",
  "prompt": "gl1tch//llm-cap-v1::d2\nfrag@f8 // D3c0d3 the driFted Br13f ANd 4N5w3r tHrOUgH Payload.answer 0NLY\nfrag@d8 %% d3CK: slOt5 v10l37 cIndEr\nfrag@f6 %% d3ck: sloT2 4Mb3R h4Rb0r\nfrag@c9 || task: 0rD3R thE kept 5h4Rd WOrdS By 5l07 numBer fr0m loW to h1gh\nfrag@b3 %% dEcK: slOt3 C0b4L7 sabLe\nfrag@d3 %% AnswEr ruLe: R37urn ThE 5H4rd W0rd5 in UpPercaSe aScii J01N3D WIth hYpheNs\nfrag@e2 || d3Ck: SLot4 4mb3R 51gn4L\nfrag@e5 ^^ tasK: keEp onLy ShArds cArrying the 4MB3r TAg\nfrag@e4 :: d3CK: slot1 4mB3r 3Mb3R\nreply via payload.answer only // structured-json",
  "issued_at": "2026-03-07T02:58:20.639623+00:00",
  "expires_at": "2026-03-07T03:00:20.639623+00:00",
  "version": "1",
  "data": {
    "difficulty": 2,
    "profile": "llm_capability_v2",
    "response_contract": {
      "payload.answer": "UPPERCASE ASCII words joined with hyphens",
      "payload.decoded_preview": "optional free-form notes"
    }
  }
}
```
Success result:

```json
{
  "ok": true,
  "reason": "ok",
  "details": {
    "answer": "EMBER-HARBOR-SIGNAL",
    "template_id": "amber_sort",
    "difficulty": 2
  }
}
```
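For reference, the sample prompt above can be decoded mechanically once the digit-for-letter substitutions are undone: keep the amber-tagged deck shards, order them by slot, uppercase, and hyphen-join. This is an illustrative solver for the `amber_sort` sample only, not a general bypass:

```python
import re

# Deck lines lifted from the sample public prompt above.
PROMPT = """frag@d8 %% d3CK: slOt5 v10l37 cIndEr
frag@f6 %% d3ck: sloT2 4Mb3R h4Rb0r
frag@b3 %% dEcK: slOt3 C0b4L7 sabLe
frag@e2 || d3Ck: SLot4 4mb3R 51gn4L
frag@e4 :: d3CK: slot1 4mB3r 3Mb3R"""

# Common digit-for-letter swaps used by the prompt: 0=o 1=i 3=e 4=a 5=s 7=t
LEET = str.maketrans("013457", "oieast")

def decode(token: str) -> str:
    return token.lower().translate(LEET)

shards = []
for line in PROMPT.splitlines():
    m = re.search(r"slot(\d)\s+(\S+)\s+(\S+)", line, re.IGNORECASE)
    if m:
        slot, tag, word = int(m.group(1)), decode(m.group(2)), decode(m.group(3))
        shards.append((slot, tag, word))

# Keep amber-tagged shards, order by slot, uppercase and hyphen-join.
answer = "-".join(word.upper() for slot, tag, word in sorted(shards) if tag == "amber")
print(answer)  # EMBER-HARBOR-SIGNAL
```

The point of the family is that this mapping is not labeled anywhere in the prompt; a client has to recover the rules from the noise, which is exactly the capability being tested.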
## Why it fits LLM-capable clients

- **Obfuscated**: the key instruction is noisy, shuffled, and distorted instead of being directly machine-labeled.
- **Exact**: the final answer is still deterministic (uppercase ASCII, hyphen-joined, exact expected value).
- **Verifiable**: the server keeps the private verification copy and returns clear failure reasons instead of fuzzy scores.
## Built-in families

- `obfuscated_text_lock`: primary challenge family for external LLM clients, with stronger obfuscated prompt patterns.
- `multi_pass_lock`: harder LLM family that layers filtering, transforms, and ordering into one prompt.
- `proof_of_work`: deterministic compute baseline with a bundled reference solver.
- `semantic_math_lock`: readable exact-constraint baseline that stays easy to inspect locally.
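To give a sense of what the deterministic compute baseline involves, here is a generic hash-preimage proof of work. The scheme, seed, and difficulty are illustrative; agentproof's actual `proof_of_work` format may differ:

```python
# Generic proof-of-work sketch (illustrative, not agentproof's exact scheme):
# brute-force a nonce so SHA-256(seed:nonce) starts with `difficulty` zero
# hex digits. No language understanding required, which is why it serves as
# a compute-only baseline.
import hashlib

def solve_pow(seed: str, difficulty: int = 3) -> int:
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{seed}:{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

nonce = solve_pow("bb28567e201b35aa")
digest = hashlib.sha256(f"bb28567e201b35aa:{nonce}".encode()).hexdigest()
assert digest.startswith("000")
```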
## Benchmarking

Use the built-in harness to compare weak non-LLM baselines against generated LLM-family challenges. It reports per-solver attempts, solves, and success rate, so you can see how often brittle parsers still succeed against the current prompt family.
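That report shape is easy to reproduce by hand. The solvers and cases below are toy stand-ins, not the bundled baselines:

```python
# Toy harness loop showing the per-solver attempts/solves/rate report.
# The cases use a simplified deck format, not real generated prompts.
cases = [
    ("deck: slot1 amber ember / slot2 amber harbor", "EMBER-HARBOR"),
    ("deck: slot1 amber signal", "SIGNAL"),
]

def literal_echo(prompt: str) -> str:
    # Brittle baseline: returns the prompt untouched.
    return prompt

def amber_parser(prompt: str) -> str:
    # Parses the toy deck format: keep amber segments, take the last word.
    words = [seg.split()[-1] for seg in prompt.split("/") if "amber" in seg]
    return "-".join(w.upper() for w in words)

for name, solve in [("literal_echo", literal_echo), ("amber_parser", amber_parser)]:
    solves = sum(solve(prompt) == want for prompt, want in cases)
    print(f"{name}: attempts={len(cases)} solves={solves} rate={solves/len(cases):.2f}")
```

The useful signal is the gap: a parser hard-coded to one template can score well on that template and collapse when the prompt family rotates.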
## What it is not

> **Warning:** agentproof is not provider attestation or identity proof. It is an LLM-capability CAPTCHA library.
## Continue

- Start with Getting Started
- Compare built-in Challenge Types
- Open runnable Examples