Fabrication
Invented facts and citations
The model invents facts, citations, or details that have no support in its sources or available evidence.
Documenting LLM failure modes so engineers can understand, detect, evaluate, and mitigate them with shared language.
13 parent categories. Start at the structural layer, then drill down to the exact failure mode page.
Invented facts and citations
The model invents facts, citations, or details that have no support in its sources or available evidence.
Distorting sources or self
The response misrepresents the input, source material, or the model's own earlier statements in ways that distort their meaning.
Stale or outdated information
The model presents outdated, time-sensitive, or version-specific information as if it were current.
Fetching the wrong evidence
The system fails to fetch, rank, filter, or apply the right external evidence.
Long-context breakdowns
The model loses track of information in long inputs, missing, diluting, or overwriting details that matter.
Bad or stale stored state
State carried across turns or sessions is missing, corrupted, out of date, or applied where it doesn't belong.
Ignoring instructions or formats
The system fails to follow instructions, respect constraints, stay in role, or produce the required output format.
Flawed logic and planning
The model errs while interpreting goals, weighing constraints, planning steps, or checking its own work.
Misused or mishandled tools
The system skips a needed tool, misuses one, invokes it unsafely, or mishandles its results.
Too much or too little initiative
The agent miscalibrates initiative, stopping short of completing the task or acting well beyond its scope.
Manipulation, leaks, unsafe behavior
Adversarial inputs manipulate the system into leaking protected information or behaving unsafely.
Sycophancy over truthfulness
The model prioritizes pleasing, persuading, or mirroring the user over truthfulness and safety.
Wrong tone, depth, or fit
The final answer misses the mark on task fit, audience, locale, or actionability, even when the underlying content is sound.