R0048/2026-04-01/Q002/SRC06/E01
NHS automation bias coverage — closest training to sycophancy-adjacent content
Extract
The NHS framework addresses:
- "Awareness of the risk of being under- or over-confident in AI-derived information"
- "Awareness that cognitive biases including automation bias and rejection bias can affect decision-making with AI"
- The principle that AI should "augment professional judgement, not replace it"
This is the closest any training framework comes to addressing sycophancy-related risks:
- Automation bias (passive overtrust) is related to, but distinct from, sycophancy (active AI agreement)
- Over-confidence in AI-derived information is relevant, but it does not address the mechanism by which AI produces agreeable outputs
- Rejection bias (the opposite: distrusting AI) shows a more nuanced understanding than most frameworks
What is missing: any mention of AI systems actively aligning their outputs with user expectations, of RLHF-driven agreement behavior, or of sycophancy itself.
Relevance to Hypotheses
| Hypothesis | Relationship | Strength |
|---|---|---|
| H1 | Contradicts | Automation bias is adjacent to, but not the same as, sycophancy |
| H2 | Supports | Best example of adjacent-concept coverage; confirms the pattern of partial, incomplete coverage |
| H3 | Contradicts | NHS does address related concepts, preventing full H3 support |
Context
The NHS framework's mention of automation bias and rejection bias demonstrates the most sophisticated understanding of human-AI interaction biases found in any training framework. However, the critical gap remains: it treats the human as the source of bias (trusting too much or too little) rather than addressing the AI as an active agent that shapes user beliefs through agreement behavior.
Notes
The distinction between automation bias and sycophancy is conceptually important: automation bias is about the human tendency to trust automated outputs; sycophancy is about the AI tendency to produce outputs designed to be trusted. They are complementary risks, but addressing only the human side leaves the AI side unexamined.