SRC05-E01: NIST Confabulation Framework
Extract
NIST defines confabulation as a natural result of how generative models work: "they generate outputs that approximate the statistical distribution of their training data." The framework warns that "confabulated logic or citations that purport to justify or explain the system's answer" may "further mislead humans into inappropriately trusting the system's output." In other words, NIST treats confabulation as a fundamental, probabilistic property of the technology rather than an occasional defect. However, the framework does not distinguish between random confabulation and confabulation that confirms user expectations, and it does not connect confabulation to sycophancy.
Relevance to Hypotheses
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Partially supports: NIST frames hallucination as fundamental, not occasional | Moderate |
| H2 | Partially contradicts: NIST goes beyond the "occasional errors" framing | Moderate |
| H3 | Supports: NIST addresses confabulation broadly but does not connect it to sycophancy or to the detection-difficulty spectrum | Strong |
Context
Of the official characterizations of hallucination found, NIST's is the most sophisticated. Its framing of hallucination as fundamental (a consequence of probabilistic output) rather than occasional (bugs to be fixed) is closer to the academic understanding than other official accounts. However, it still treats all confabulation as undifferentiated.
Notes
NIST's warning about "confabulated logic or citations" is the closest any official framework found comes to addressing the sycophancy-hallucination connection. But it stops short: the framework does not explain that some confabulations are generated specifically because they match user expectations, which is the sycophancy mechanism.