C002 — OpenSSF Scorecard 5.4 Average: Wrong Population — Assessment

The claim misattributes the 5.4 average to 'the top one million critical open source projects' when it actually comes from a Chainguard analysis of approximately 1,511 Wolfi upstream repositories. The 5.4 figure may be a reasonable rough estimate of typical Scorecard scores across open source — Chainguard notes scores in the 4-6 range are 'typical' based on past research — but the specific population attribution is incorrect. The researcher should cite the Chainguard analysis directly or verify against the OpenSSF BigQuery dataset.

Evidence Synthesis

Evidence quality: Medium — The 5.4 figure traces to a single corporate blog analysis (Chainguard) of 1,511 Wolfi-associated repos, not to OpenSSF official aggregate data. No peer-reviewed study was found confirming this as the average for the full 1M project scan.

Source agreement: Medium — The Chainguard analysis reports 5.4 and notes it is 'typical' based on past research. The OpenSSF GitHub repo confirms scanning 1M projects weekly. However, no source provides the official mean score for the full 1M population.

Independence: The Chainguard analysis is the only source reporting the specific 5.4 figure. OpenSSF's own publications describe the Scorecard methodology but do not publish aggregate statistics.

Outliers

  • https://www.chainguard.dev/unchained/wolfis-upstream-security-inspection-scanning-with-openssf-scorecard: Reports 5.4 as the average for 1,511 Wolfi repos, not for 1M critical projects. The claim incorrectly attributes this figure to 'the top one million critical open source projects'; the actual measurement was taken on a smaller, different population (Wolfi upstream packages).

Probability Assessment

  • H1: Unlikely (20-45%). The specific claim, that the average across the 'top one million critical open source projects' is 5.4, is not supported. The 5.4 figure comes from a different, smaller population (1,511 Wolfi repos). The mean for the full 1M population is unknown from our evidence.
  • H2: Roughly even chance (45-55%). The Chainguard analysis notes the 5.4 is 'typical' for open source, and the bell-shaped distribution is consistent. However, Scorecard v5.0.0 expanded from 18 to 47 checks, which could change aggregate scores. We cannot determine whether the current mean for 1M projects differs materially.
  • H3: Very unlikely (5-20%). The Chainguard data explicitly shows a bell-shaped (normal) distribution, directly refuting the bimodal hypothesis. However, this was for Wolfi repos and may not generalize.

Verdict: The claim is unlikely (20-45%) as stated because it misattributes the 5.4 average to 'the top one million critical open source projects' when it actually comes from a Chainguard analysis of ~1,511 Wolfi upstream repositories. The 5.4 figure may be a reasonable estimate of typical Scorecard scores across open source, but the specific population attribution is incorrect.

Evidence Gaps

Expected but not found:

  • Official OpenSSF publication of aggregate Scorecard statistics for the 1M critical projects scan.
  • Academic analysis of the full BigQuery Scorecard dataset showing mean and distribution.

Unanswered questions:

  • What is the actual mean Scorecard score across the full 1M critical projects population?
  • How have Scorecard scores changed with the v5.0.0 methodology update?

Impact on confidence: The gap between the cited population (1M projects) and the actual source population (1,511 repos) significantly reduces confidence in the claim as stated. The researcher should either verify against BigQuery data or restate the claim with correct attribution.
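
A verification along these lines could be sketched as follows. This is a minimal illustration, not the OpenSSF tooling or the Chainguard method: it assumes the Scorecard results have already been exported to JSON Lines with a numeric `score` field per repository (an assumed layout), and the function name `summarize_scores` and the sample rows are hypothetical. It computes the mean, the median, and a coarse histogram, which is enough to check both the 5.4 figure and whether the distribution looks bell-shaped or bimodal for whichever population is loaded.

```python
import json
import statistics
from collections import Counter


def summarize_scores(jsonl_text):
    """Summarize aggregate Scorecard scores from JSON Lines text.

    Assumes each non-empty line is a JSON object with a numeric "score"
    field (assumed export layout). Rows with a missing or negative score
    are skipped, on the assumption that they mark repos that could not
    be scored.
    """
    scores = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        score = row.get("score")
        if isinstance(score, (int, float)) and score >= 0:
            scores.append(float(score))
    if not scores:
        raise ValueError("no usable scores found")
    # Bucket into integer bins 0..10 to eyeball the shape of the distribution.
    histogram = Counter(min(int(s), 10) for s in scores)
    return {
        "n": len(scores),
        "mean": round(statistics.mean(scores), 2),
        "median": statistics.median(scores),
        "histogram": dict(sorted(histogram.items())),
    }


# Made-up sample rows for illustration only; not real Scorecard results.
SAMPLE = """
{"repo": {"name": "example.com/a"}, "score": 4.0}
{"repo": {"name": "example.com/b"}, "score": 5.5}
{"repo": {"name": "example.com/c"}, "score": 6.5}
{"repo": {"name": "example.com/d"}, "score": -1}
"""

if __name__ == "__main__":
    print(summarize_scores(SAMPLE))
```

Running the same summary on the 1,511-repo subset and on the full 1M export would show directly whether the two populations share a mean, which is the comparison the claim currently lacks.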
