R0021/2026-03-25/Q004 — Assessment

BLUF

All three regulated industries (aviation, healthcare, financial services) have established validation frameworks for AI systems, but all acknowledge that these frameworks were designed for traditional systems and require adaptation. The FAA released its AI Safety Assurance Roadmap v1 in July 2024, acknowledging that "rigorous safety assurance methods must be developed." The FDA applies its 10 Good Machine Learning Practice guiding principles and requires predetermined change control plans for AI-enabled medical devices. Banking regulators apply SR 11-7 model risk management guidance but recognize its limitations for self-adapting AI models. NIST provides a voluntary risk management framework built on four core functions.

Probability

Rating: Almost certain (95-99%) that frameworks exist; Very likely (80-95%) that they have significant gaps for modern AI.

Confidence in assessment: High

Confidence rationale: Evidence comes from primary government sources (FAA, FDA, Federal Reserve, NIST), all published official documents.

Reasoning Chain

  1. FAA released AI Safety Assurance Roadmap v1 (July 2024) acknowledging need for new methods specific to AI [SRC01-E01, High reliability, High relevance]
  2. FDA requires GMLP 10 principles + predetermined change control plans for AI medical devices; 97% cleared via 510(k) pathway [SRC02-E01, High reliability, High relevance]
  3. SR 11-7 (2011) requires 4-perspective model validation but was designed for stable models; "may lose effectiveness" for self-adapting AI [SRC03-E01, High reliability, High relevance]
  4. NIST AI RMF provides voluntary 4-function framework (GOVERN, MAP, MEASURE, MANAGE) with TEVV practices [SRC04-E01, High reliability, High relevance]
  5. JUDGMENT: The pattern across all three industries is the same — existing frameworks being stretched to cover AI, with acknowledged gaps. This is the opposite of the prompt engineering situation, where no formal framework exists at all.

Evidence Base Summary

| Source | Description | Reliability | Relevance | Key Finding |
|---|---|---|---|---|
| SRC01 | FAA AI Roadmap | High | High | "Rigorous methods must be developed" |
| SRC02 | FDA SaMD Guidance | High | High | GMLP + change control plans |
| SRC03 | SR 11-7 | High | High | Traditional validation struggling with adaptive AI |
| SRC04 | NIST AI RMF | High | High | Voluntary framework with TEVV |

Collection Synthesis

| Dimension | Assessment |
|---|---|
| Evidence quality | Robust: all primary government sources |
| Source agreement | High: all agree frameworks exist but need adaptation |
| Source independence | Independent: four separate federal bodies |
| Outliers | None |

Detail

The contrast with prompt engineering is stark. These industries have validation frameworks, testing requirements, regulatory oversight, and enforcement mechanisms for AI — even though they acknowledge the frameworks are imperfect. Prompt engineering has none of these.

Gaps

| Missing Evidence | Impact on Assessment |
|---|---|
| EU AI Act implementation details | Minor: would add a European perspective |
| Specific test case requirements | Moderate: would strengthen the "how" of validation |
| Industry compliance rates | Moderate: would show enforcement effectiveness |

Researcher Bias Check

Declared biases: The researcher may emphasize the existence of these frameworks to contrast with prompt engineering's lack of frameworks.

Influence assessment: The frameworks genuinely exist and are documented in primary sources, so the contrast with prompt engineering is factual rather than a product of the declared bias.

Cross-References

| Entity | ID | File |
|---|---|---|
| Hypotheses | H1, H2, H3 | hypotheses/ |
| Sources | SRC01-SRC04 | sources/ |
| ACH Matrix | | ach-matrix.md |
| Self-Audit | | self-audit.md |