Know when your AI
is making things up

Single-forward-pass epistemic uncertainty for LLMs. One API call gives you the answer and a calibrated risk score. No ensemble, no extra calls, no guesswork.

70%
Hallucination catch rate
<1s
Scoring latency
0.60
Epsilon spread
76.5%
Logic detection
Score any text in one call
Send a question, get back the answer plus an epistemic risk score. The model reads its own hidden states to detect uncertainty — no self-prompting, no multiple calls.
api_request.py
# Score a question with CAEF
import requests

response = requests.post(
    "https://api.agnoslogic.com/v1/ask",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"question": "How does antibiotic resistance develop?"}
)
result = response.json()
# {
#   "response": "Antibiotic resistance develops through...",
#   "epsilon": 0.598,        ← overall risk [0-1]
#   "category": "MED",       ← LOW / MED / HIGH
#   "logic": 0.412,          ← logic contradiction risk
#   "uncertainty": {
#     "factual": 0.47,       ← factual uncertainty
#     "logical": 0.49,       ← reasoning uncertainty
#     "ood": 0.22,           ← out-of-distribution
#     "compositional": 0.18  ← multi-hop uncertainty
#   }
# }
Hidden state geometry, not vibes
CAEF reads the model's internal representations to detect uncertainty the text doesn't reveal.
01

Send your query

POST to /v1/ask with your question. The model generates an answer — one forward pass, same as normal inference.

02

Hidden state analysis

Three auxiliary heads (FAH, CWMI, ESR) read the model's internal hidden states. A zero-parameter truth direction probe detects logical contradictions from the geometry alone.

03

Risk score returned

You get the answer plus epsilon (overall risk), LOGIC (contradiction score), and a four-dimensional uncertainty breakdown. Route with confidence.
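The routing step can be sketched in a few lines. The LOW / MED / HIGH values come from the API's category field shown above; the deliver/verify/escalate targets are illustrative placeholders, not part of the API:

```python
# Sketch: route a /v1/ask result by its risk category.
# "category" is the API field shown above; the routing
# targets (deliver/verify/escalate) are hypothetical.
def route(result: dict) -> str:
    """Pick a handling path for an answer based on its risk category."""
    category = result["category"]
    if category == "LOW":
        return "deliver"    # confident: show the answer directly
    if category == "MED":
        return "verify"     # uncertain: cross-check before responding
    return "escalate"       # high risk: send to human review

# A response shaped like the /v1/ask example above
result = {"epsilon": 0.598, "category": "MED", "logic": 0.412}
print(route(result))  # → verify
```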

Pay for what you use
Start free. Scale when you're ready.
Explorer
Try CAEF on your own data
$0/mo
50 queries per day
  • Score + Ask + Compare endpoints
  • Full uncertainty breakdown
  • Community support
  • Rate limited to 50/day
Enterprise
Custom deployment + SLA
Custom
Unlimited queries
  • Everything in Builder
  • Dedicated inference endpoint
  • Custom model fine-tuning
  • On-premise deployment option
  • SLA + dedicated support
Three endpoints, zero complexity
RESTful JSON API. Auth via Bearer token. All responses include latency metadata.
POST /v1/score Score any text for epistemic risk
{ "text": "Water boils at 100°C at sea level." }
→
{
  "epsilon": 0.14,
  "category": "LOW",
  "logic": 0.001,
  "uncertainty": {
    "factual": 0.38, "logical": 0.38,
    "ood": 0.08, "compositional": 0.03
  }
}
POST /v1/ask Generate answer + score
{ "question": "How does antibiotic resistance develop?", "max_tokens": 200 }
→
{
  "response": "Antibiotic resistance develops through...",
  "epsilon": 0.598,
  "category": "MED",
  "logic": 0.412,
  "uncertainty": {
    "factual": 0.47, "logical": 0.49,
    "ood": 0.22, "compositional": 0.18
  }
}
POST /v1/compare Compare two statements
{ "text_a": "The Earth orbits the Sun.", "text_b": "The Sun orbits the Earth." }
→
{
  "text_a": { "epsilon": 0.12, "category": "LOW" },
  "text_b": { "epsilon": 0.68, "category": "HIGH" },
  "more_risky": "B",
  "gap": 0.56
}
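As a client-side sketch, the comparison result can be summarized in one helper. The field names follow the /v1/compare example above; interpret() is a hypothetical convenience, not part of the API:

```python
# Sketch: summarize a /v1/compare response client-side.
# Field names follow the example above; interpret() is a
# hypothetical helper, not part of the official client.
def interpret(comparison: dict) -> str:
    """Report which statement carries more epistemic risk."""
    riskier = comparison["more_risky"]
    gap = comparison["gap"]
    return f"Statement {riskier} is riskier (gap {gap:.2f})"

sample = {
    "text_a": {"epsilon": 0.12, "category": "LOW"},
    "text_b": {"epsilon": 0.68, "category": "HIGH"},
    "more_risky": "B",
    "gap": 0.56,
}
print(interpret(sample))  # → Statement B is riskier (gap 0.56)
```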
Built for developers who ship AI
🔍

Research assistants

Route uncertain answers to web search. Deliver confident answers directly. Your agent knows when to verify before responding.

🛡

Hallucination firewalls

Score every LLM response before showing it to users. Flag high-risk outputs for human review. Catch 70% of hallucinations automatically.
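A minimal firewall gate might look like the sketch below. The 0.5 epsilon threshold is an illustrative assumption, not a documented default:

```python
# Sketch of a hallucination firewall gate. The 0.5 epsilon
# threshold is an illustrative assumption, not an API default.
EPSILON_THRESHOLD = 0.5

def gate(response_text: str, score: dict) -> dict:
    """Attach a review flag to a response based on its epsilon score."""
    flagged = score["epsilon"] >= EPSILON_THRESHOLD
    return {
        "text": response_text,
        "epsilon": score["epsilon"],
        "needs_review": flagged,  # route to human review when True
    }

low = gate("Water boils at 100°C at sea level.", {"epsilon": 0.14})
print(low["needs_review"])  # → False
```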

⚖

Legal and compliance

Audit AI-generated documents for epistemic risk. Flag uncertain claims in contracts, reports, and regulatory filings before they go out.

💻

Code review agents

Score the model's confidence in its own bug analysis. High LOGIC score means the reasoning may be contradictory — flag for human review.
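The same gating pattern applies to the LOGIC score. The 0.4 cutoff below is an illustrative assumption, as is the flag format:

```python
# Sketch: flag a bug analysis when the LOGIC score suggests
# self-contradictory reasoning. The 0.4 cutoff and the flag
# prefix are illustrative assumptions.
LOGIC_CUTOFF = 0.4

def review_flag(analysis: str, logic_score: float) -> str:
    """Prefix an analysis with a review flag when LOGIC is high."""
    if logic_score >= LOGIC_CUTOFF:
        return f"NEEDS HUMAN REVIEW: {analysis}"
    return analysis

print(review_flag("Possible null dereference in parse()", 0.412))
# → NEEDS HUMAN REVIEW: Possible null dereference in parse()
```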