Clarification Overuse

Asks the user for clarification when the task is already sufficiently specified, stalling on details the system could reasonably infer or safely proceed without. Unlike Clarification Underuse, the error here is asking when the system should act, not acting when it should ask. Unlike Insufficient Agency, the system does respond (by asking) rather than silently failing to make an obvious next move.

Examples

FAIL

Stalling on the obvious

The task is fully specified and mechanically checkable, but the model burns a round trip asking about details with obvious answers.

User: Rename the function getUser to fetchUser everywhere in the repo.
Assistant: Before I start, should I also update the call sites? And do you want the test files included, or just source files?

WARN

Acting but hedging with questions

The work gets done, but the response trails off into unnecessary questions that invite another turn the user doesn't need.

User: Rename the function getUser to fetchUser everywhere in the repo.
Assistant: Done. Renamed in all 14 files. Would you like me to verify the tests still pass? Should I also check for any documentation mentions? Let me know how you'd like to proceed.

PASS

Proceeding on reasonable defaults

The model commits to the obvious interpretation, completes the task, and surfaces its choices for review instead of asking first.

User: Rename the function getUser to fetchUser everywhere in the repo.
Assistant: Renamed across 14 files, including call sites, tests, and the two doc references. Tests pass.

Why it happens

01

Preference tuning penalizes a confidently wrong action more than an unnecessary question. Under uncertainty, asking is the low-risk move for the model even when the expected cost to the user is higher.

02

Safety and carefulness training generalize broadly. "Check before acting" is reinforced for genuinely risky operations, and the learned caution spills over into benign, fully specified tasks.

03

Models do not reliably track which missing details are recoverable from context, convention, or a cheap reversible attempt. Without that judgment, every unknown looks like grounds for a question.

04

Asking reads as conscientious. Human evaluators often rate clarifying questions as attentive and collaborative, so preference data rewards the behavior without accounting for the turn it costs the user.

05

Chat-format training makes a question a fluent, always-available response genre. Producing one is easier than committing to an interpretation and defending it.

06

Few evaluations charge the model for needless round trips. Benchmarks measure answer quality once the answer arrives, so stalling on obvious details carries no training-time penalty.

Detection Approaches

Categories of checks that can identify the issue. These are strategies, not specific implementations.

📊

Clarification rate monitoring

Track question-turns on tasks that are fully specified, plus round trips burned before any work starts and the density of trailing questions after completed work. A rising ask-rate on mechanically checkable tasks is this mode's production signature.
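One way to compute these signals from logs is sketched below. The `Turn` record, its `did_work` and `fully_specified` labels, and the question-mark heuristic are all assumptions made for illustration; a real pipeline would likely substitute its own schema and a proper question classifier.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """Hypothetical log record for one assistant turn."""
    text: str
    did_work: bool         # True if the turn performed the requested task
    fully_specified: bool  # label assumed to come from the task set


def count_questions(text: str) -> int:
    # Crude heuristic: count question marks in the response.
    return text.count("?")


def clarification_metrics(turns: list[Turn]) -> dict[str, float]:
    """Ask-rate and trailing-question density over fully specified tasks."""
    scoped = [t for t in turns if t.fully_specified]
    if not scoped:
        return {"ask_rate": 0.0, "trailing_question_density": 0.0}
    # Turns that asked something before doing any work.
    asked_first = [t for t in scoped if not t.did_work and count_questions(t.text) > 0]
    # Turns that did the work; any questions in them are trailing questions.
    worked = [t for t in scoped if t.did_work]
    trailing = sum(count_questions(t.text) for t in worked)
    return {
        "ask_rate": len(asked_first) / len(scoped),
        "trailing_question_density": trailing / len(worked) if worked else 0.0,
    }
```

Tracked over time, a rising `ask_rate` on fully specified tasks is the signal to alert on; the absolute threshold is a deployment decision this sketch does not try to set.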

⚖️

LLM-as-judge evaluation

For each clarifying question asked, have a judge assess whether the answer was recoverable from context, convention, or a cheap reversible attempt. Questions with obvious answers are the failure even when they sound conscientious.
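A minimal sketch of such a judge follows. The prompt wording, the `RECOVERABLE`/`NEEDED` labels, and the `ask_judge` callable are hypothetical; the judge model is injected as a plain function so the sketch assumes no particular API.

```python
from typing import Callable

# Hypothetical judge prompt; the wording and the two labels are illustrative only.
JUDGE_TEMPLATE = (
    "A user gave this task:\n{task}\n\n"
    "The assistant asked this clarifying question:\n{question}\n\n"
    "Could the answer be recovered from context, from convention, or from a "
    "cheap reversible attempt? Reply with exactly one word: RECOVERABLE or NEEDED."
)


def is_unnecessary_question(
    task: str, question: str, ask_judge: Callable[[str], str]
) -> bool:
    """Return True when the judge says the question had a recoverable answer.

    `ask_judge` is any function that sends a prompt to a judge model and
    returns its text reply.
    """
    reply = ask_judge(JUDGE_TEMPLATE.format(task=task, question=question))
    return reply.strip().upper().startswith("RECOVERABLE")
```

Forcing a one-word verdict keeps parsing trivial; a fuller version might also ask the judge for a short rationale to support spot checks.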

🧪

Golden-set evals

Maintain fully specified tasks with verifiable outcomes and score whether the model completes them without a round trip. Score the warn case too: work that gets done but trails off into unnecessary questions inviting another turn.
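The three-way grading could look roughly like this. The function name and the `task_completed` flag are assumptions; the flag is presumed to come from an independent outcome check such as a script that confirms the change landed and the tests pass.

```python
def grade_response(task_completed: bool, response: str) -> str:
    """Grade one golden-set run as PASS, WARN, or FAIL.

    `task_completed` is assumed to be supplied by a separate outcome verifier.
    """
    has_question = "?" in response  # crude stand-in for a question detector
    if not task_completed:
        # No verified outcome: a stall on a question, or some other failure.
        return "FAIL"
    # Work was done; trailing questions downgrade the result to WARN.
    return "WARN" if has_question else "PASS"
```

Keeping WARN as its own grade makes the trailing-question habit visible in the results instead of letting it hide inside a pass rate.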

Mitigation Approaches

High-level reliability strategies that reduce how often this failure occurs.

📝

Instruction constraints

Instruct the model to proceed on reasonable defaults and report the choices it made for review, rather than asking first. Explicitly ban the trailing-question habit, where finished work dissolves into "would you like me to…?" Asking is the model's low-risk move; the prompt has to assign the cost of the round trip that preference tuning never charged it for.
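As an illustration only, such an instruction might be kept as a reusable prompt fragment. The wording below and the helper name `with_proceed_policy` are assumptions, and any real deployment would need to tune the text against its own evals.

```python
# Hypothetical prompt addendum; the exact wording is an assumption.
PROCEED_ON_DEFAULTS = (
    "When a task is sufficiently specified, proceed on reasonable defaults "
    "and list the choices you made so the user can review them. "
    "Ask a clarifying question only when the answer cannot be recovered from "
    "context, convention, or a cheap reversible attempt. "
    "Do not end a completed task with optional follow-up questions."
)


def with_proceed_policy(base_prompt: str) -> str:
    """Append the policy to an existing system prompt."""
    return f"{base_prompt.rstrip()}\n\n{PROCEED_ON_DEFAULTS}"
```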

✅

Self-check pass

Before asking, require the model to check whether the answer is recoverable from context, from convention, or from a cheap reversible attempt. A question that survives that filter is worth the user's turn; "should I update the call sites?" does not, because renaming a function everywhere answers it by definition.
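The filter can be sketched as a simple gate. The `CandidateQuestion` record and its three flags are hypothetical; in practice they would be filled in by the model itself or by a wrapper that inspects the request and workspace.

```python
from dataclasses import dataclass


@dataclass
class CandidateQuestion:
    """Hypothetical record filled in before a question is sent to the user."""
    text: str
    answer_in_context: bool         # stated or implied by the request or workspace
    answer_by_convention: bool      # settled by common practice for this kind of task
    cheap_reversible_attempt: bool  # trying the likely answer is cheap and undoable


def survives_filter(q: CandidateQuestion) -> bool:
    """A question is worth the user's turn only if no recovery route exists."""
    return not (
        q.answer_in_context or q.answer_by_convention or q.cheap_reversible_attempt
    )


# The call-sites question is answered by the request itself ("everywhere").
call_sites = CandidateQuestion(
    text="Should I update the call sites?",
    answer_in_context=True,
    answer_by_convention=True,
    cheap_reversible_attempt=True,
)
print(survives_filter(call_sites))  # False: do not ask
```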

♻️

Reversible-by-default execution

For cheap, undoable work, make attempting the default: commit to the obvious interpretation, do the rename, and surface what was chosen. A wrong guess costs one revert; the question costs a round trip even when the guess would have been right. Reversibility is what makes proceeding the rational policy.
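The trade-off can be read as an expected-cost comparison, sketched below. The function name, the cost units, and the idea of estimating a probability of guessing wrong are assumptions for illustration; the source gives no numbers.

```python
def should_proceed(
    p_wrong: float, revert_cost: float, round_trip_cost: float, reversible: bool
) -> bool:
    """Proceed when the expected cost of a wrong guess is below the cost of asking.

    Asking always costs one round trip; guessing costs a revert only when wrong.
    Costs are in arbitrary but consistent units (for example, minutes of user time).
    """
    if not reversible:
        return False  # this sketch covers only cheap, undoable work
    return p_wrong * revert_cost < round_trip_cost
```

With equal revert and round-trip costs, proceeding wins for any guess that is right more often than never, which is why the policy leans so heavily on the work being undoable.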