SRC08-E01 — NIST Framework Risk Categories
Extract
NIST AI 600-1 defines "confabulation" as "erroneous or false content in response to prompts" including "outputs that diverge from the prompts or contradict previously generated statements." The framework warns that "confabulated logic or citations that purport to justify or explain the system's answer" may "further mislead humans into inappropriately trusting the system's output." Information integrity is defined as distinguishing "fact from fiction, opinion, and inference" with transparency about "its level of vetting." The framework identifies "confabulation" and "information integrity" as key risk areas but "stops short of prescriptive rules" and does not specifically address sycophancy. No legislation specifically targets the sycophancy problem.
Relevance to Hypotheses
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Contradicts — framework does not address sycophancy | Strong |
| H2 | Supports — even the most authoritative US framework does not name sycophancy | Strong |
| H3 | Supports — confabulation is addressed but sycophancy as a distinct risk category is absent | Strong |
Context
NIST AI RMF is the primary US government framework for AI risk management, and NIST AI 600-1 is its Generative AI Profile. The absence of sycophancy as a named risk category is significant: organizations that rely on NIST as their risk framework get no explicit guidance on sycophancy from it.
Notes
The term "confabulation" covers random fabrication but not the directional bias of sycophancy (fabrication toward what the user expects). NIST's note about confabulated "logic or citations that purport to justify" the system's answer comes closest to sycophancy but does not make the connection explicit.