SRC05-E01 — NIST Confabulation Framework

Extract

NIST defines confabulation as a natural result of how generative models work: "they generate outputs that approximate the statistical distribution of their training data." The framework warns that "confabulated logic or citations that purport to justify or explain the system's answer" may "further mislead humans into inappropriately trusting the system's output." NIST thus frames confabulation as probabilistic, a fundamental property of the technology rather than an occasional defect. However, the framework does not distinguish random confabulation from confabulation that confirms user expectations, and it does not connect confabulation to sycophancy.

Relevance to Hypotheses

| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Partially supports: NIST does frame hallucination as fundamental, not occasional | Moderate |
| H2 | Partially contradicts: NIST goes beyond "occasional errors" framing | Moderate |
| H3 | Supports: NIST addresses confabulation broadly but does not connect to sycophancy or the detection-difficulty spectrum | Strong |

Context

The NIST framework offers the most sophisticated official characterization of hallucination found. Its framing of hallucination as fundamental (a property of probabilistic output) rather than occasional (bugs to be fixed) is closer to the academic understanding. However, it still treats all confabulation as undifferentiated.

Notes

NIST's warning about "confabulated logic or citations" is the closest any official framework comes to addressing the sycophancy-hallucination connection. But it stops short: the framework does not explain that some confabulations are generated specifically because they match user expectations, which is the sycophancy mechanism.