C002 - OpenSSF Scorecard Average of 5.4 Out of 10 - Assessment

Contents

Evidence Synthesis
Probability Assessment
Evidence Gaps

The claim conflates two separate facts: OpenSSF scans 1 million critical projects weekly (confirmed), and the average Scorecard score is 5.4 (confirmed for Chainguard's analysis of 1,511 Wolfi packages). No source combines these into the claim as stated. The 5.4 figure is plausible for the broader ecosystem (Chainguard notes it is 'typical') but is not confirmed for the 1M critical project population.

Evidence Synthesis

Evidence quality: Medium. The primary source for the 5.4 figure is a Chainguard corporate blog post analyzing 1,511 Wolfi packages, a medium-reliability source whose analysis is relevant but not peer-reviewed. The OpenSSF Scorecard GitHub README confirms the weekly scan of 1M projects but does not publish aggregate averages. No peer-reviewed academic source was found reporting a 5.4 average across the full 1M dataset.

Source agreement: Medium. Only one source (Chainguard) reports the specific 5.4 figure, so agreement between sources cannot be properly assessed. The Chainguard authors state that 'past research suggests that these scores are typical' and that scores 'between four and six' are common, which provides indirect corroboration but not independent confirmation of the exact 5.4 mean.

Independence: Limited. The 5.4 figure comes from a single source (Chainguard). The OpenSSF repository confirms that the scanning infrastructure exists but does not report aggregate statistics. Independence cannot be evaluated for the specific figure because there is only one primary source for it.

Probability Assessment

C002-H1: Unlikely (20-45%)
The 5.4 average was not found in any OpenSSF publication or academic study of the full 1M project dataset. It comes from Chainguard's analysis of a much smaller (1,511 packages) and different population (Wolfi upstream packages). The claim that this is 'the average across the top one million critical projects' is not supported by the evidence.
C002-H2: Roughly even chance (45-55%)
The 5.4 figure exists and is plausible, but it describes a different population than claimed. Without aggregate data from OpenSSF's BigQuery dataset, the actual average across 1M projects is unknown. It could be close to 5.4 or substantially different.
C002-H3: Likely (55-80%)
This hypothesis best fits the evidence. The 5.4 figure is accurate for Chainguard's Wolfi sample (1,511 packages). OpenSSF does scan 1M projects weekly and publishes the results to BigQuery, but it does not report an aggregate average. The researcher has likely conflated the 5.4 figure from Chainguard's analysis with the OpenSSF 1M project dataset.

Verdict: The claim is Unlikely as stated (20-45%). The 5.4 figure is real but comes from Chainguard's analysis of 1,511 Wolfi packages, not from OpenSSF's scan of one million critical projects. The number of projects tracked by OpenSSF (1M) and the score (5.4) are each accurate on their own, but the claim combines them incorrectly.

Evidence Gaps

Expected but not found:
- No OpenSSF publication reporting an aggregate average Scorecard score across the 1M critical project dataset was found.
- No academic paper computing the mean Scorecard score from the BigQuery dataset was found.
- The arxiv.org paper on Scorecard design (candidate evidence) was not successfully fetched due to PDF format.

Unanswered questions:
- What is the actual mean Scorecard score across OpenSSF's 1M critical project dataset?
- Has the Scorecard average changed over time as more projects adopt security best practices?

Impact on confidence: The absence of aggregate data from the primary 1M project dataset is the critical gap. Without this, the assessment relies on inference from a much smaller Wolfi sample. If the OpenSSF BigQuery data were queried, it could confirm or refute the 5.4 figure for the full population, potentially moving this assessment to high confidence in either direction.
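
As an illustration of how this gap could be closed, the following is a minimal sketch, assuming the per-project results had first been exported from the BigQuery dataset as JSON Lines, one record per project. The function name `mean_aggregate_score` and the field names `repo` and `score` are hypothetical and are not taken from the OpenSSF schema. The sample records are invented solely to exercise the code and are not real Scorecard results.

```python
import json
from statistics import mean
from typing import Iterable, Optional


def mean_aggregate_score(
    lines: Iterable[str], score_field: str = "score"
) -> Optional[float]:
    """Return the mean aggregate score over JSON Lines records.

    Records whose score is missing, non-numeric, or outside the
    0-10 range are skipped. Returns None if no valid score remains.
    The field name is an assumption, not the real export schema.
    """
    scores = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        value = record.get(score_field)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and 0 <= value <= 10:
            scores.append(float(value))
    return mean(scores) if scores else None


# Invented records, used only to show the call shape.
sample = [
    '{"repo": "example.com/a", "score": 4.0}',
    '{"repo": "example.com/b", "score": 6.0}',
    '{"repo": "example.com/c", "score": null}',
]
print(mean_aggregate_score(sample))
```

Running the same computation over the full export, and separately over the 1,511 Wolfi upstream projects, would show directly whether the two populations share a similar mean.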
