`processed_volume_trend_meta`: confidence subscores

The second of three families inside `processed_volume_trend_meta`. It holds the 0–100 confidence score shown to customers, plus its four-subscore breakdown.

Keys

| Key | Type | Meaning |
| --- | --- | --- |
| `confidence_score` | `int` 0–100 | The aggregated score shown in the product |
| `confidence_breakdown.coverage` | `int` 0–30 | How many of the 4 sources have data for this keyword, weighted |
| `confidence_breakdown.agreement` | `int` 0–15 | How consistent the sources' magnitudes / directions are |
| `confidence_breakdown.freshness` | `int` 0–25 | How recently each source was refreshed |
| `confidence_breakdown.forecast` | `int` 0–30 | Observed-fraction + normalised SMAPE of the forecast model |

The four subscore maxima add up to 100, but the headline `confidence_score` is not a plain sum: the subscores are combined through a weighted geometric mean.
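As a quick illustration of the key layout, the sketch below checks a meta dict against the ranges in the table. It assumes the dotted keys map to a nested `confidence_breakdown` dict. `validate_confidence_meta`, `SUBSCORE_MAX`, and the example values are hypothetical names and data introduced here, not part of `processing.py`.

```python
# Maximum points per subscore, copied from the Keys table.
SUBSCORE_MAX = {"coverage": 30, "agreement": 15, "freshness": 25, "forecast": 30}


def validate_confidence_meta(meta: dict) -> list:
    """Return a list of range violations; an empty list means the shape looks fine.

    ASSUMPTION: `confidence_breakdown` is a nested dict inside the meta JSON.
    """
    problems = []
    score = meta.get("confidence_score")
    if not isinstance(score, int) or not 0 <= score <= 100:
        problems.append(f"confidence_score out of range: {score!r}")
    breakdown = meta.get("confidence_breakdown", {})
    for name, max_points in SUBSCORE_MAX.items():
        value = breakdown.get(name)
        if not isinstance(value, int) or not 0 <= value <= max_points:
            problems.append(f"{name} out of range: {value!r}")
    return problems


# Hypothetical example values, not taken from real data.
example_meta = {
    "confidence_score": 72,
    "confidence_breakdown": {
        "coverage": 24,
        "agreement": 10,
        "freshness": 18,
        "forecast": 20,
    },
}
print(validate_confidence_meta(example_meta))  # []
```

A check like this only confirms the shape and ranges. It says nothing about whether the stored numbers were written by the current formula.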

How it's computed

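The computation lives in processing.py's confidence-score helpers (around L891–L965). Each subscore is first expressed as a fraction of its maximum; the score is then a weighted geometric mean of those four fractions. Because a geometric mean is a product, a single zero fraction would drive the total to 0, so each fraction is floored before it enters the mean.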

Anchors that govern this:

processing.py:KB-ANCHOR:coverage-source-weighting — the 30-pt coverage subscore weights (GSC + 2 trend > GSC + 1 trend > GSC-only). Missing a trend source is penalised harder than the count alone would suggest.
processing.py:KB-ANCHOR:freshness-staleness-ladder — the per-source per-month staleness ladders (GT: [(1, 9), (3, 8), (6, 6), (12, 4), (18, 2)]; GKP: [(2, 7), (4, 6), (6, 4), (12, 2)]; GSC: flat 9).
processing.py:KB-ANCHOR:confidence-subscore-floor — _CW_GEOM_FLOOR = 0.05. The floor that prevents a single zero subscore from killing the total.
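A minimal sketch of a floored weighted geometric mean follows. The only value taken from the anchors is `_CW_GEOM_FLOOR = 0.05`. The function name `combine_subscores`, the use of each subscore's maximum as its weight, and the final rounding are assumptions made for illustration; the actual helpers in `processing.py` may weight and round differently.

```python
import math

# Floor named in the confidence-subscore-floor anchor.
_CW_GEOM_FLOOR = 0.05

# Maximum points per subscore, copied from the Keys table.
SUBSCORE_MAX = {"coverage": 30, "agreement": 15, "freshness": 25, "forecast": 30}


def combine_subscores(breakdown: dict) -> int:
    """Combine subscores into a 0-100 score via a floored weighted geometric mean.

    ASSUMPTION: each weight is the subscore's share of the 100 total points.
    """
    total_max = sum(SUBSCORE_MAX.values())
    log_sum = 0.0
    for name, max_points in SUBSCORE_MAX.items():
        # Express the subscore as a fraction of its maximum.
        fraction = breakdown[name] / max_points
        # Floor the fraction so a zero cannot wipe out the product.
        fraction = max(fraction, _CW_GEOM_FLOOR)
        weight = max_points / total_max
        # Summing weighted logs is equivalent to multiplying weighted powers.
        log_sum += weight * math.log(fraction)
    return round(100 * math.exp(log_sum))


full = {"coverage": 30, "agreement": 15, "freshness": 25, "forecast": 30}
print(combine_subscores(full))  # 100

# One zero subscore lowers the total sharply but does not collapse it to 0.
no_forecast = dict(full, forecast=0)
print(combine_subscores(no_forecast))
```

Under these assumed weights, a zero forecast subscore still costs well over half of the headline score, which shows why the floor matters: without it the same input would produce 0.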

Important caveat: stored values may be stale

Per memory project_confidence_meta_is_stale: the stored confidence_score may have been written by an older formula. Don't trust the stored number for any analysis; re-compute via predict_sv_trends if you need current-formula values.

This caveat exists because the confidence-score formula has evolved, and stored values lag any formula change until the next full rewrite of the meta. The archived _archive/confidence_score.md documents the most recent changes: Pearson removed, divergence penalty removed, weights rebalanced. For recently updated keywords this isn't an issue (they are rewritten hourly), but long-tail keywords whose meta hasn't been touched in months may still carry old-formula values.

Edge cases

JS-only keywords — coverage subscore caps at _CW_COV_1_TREND_NO_GSC = 4. Confidence is low by design.
Forecast subscore for sub-gate keywords — _CW_FCST_NO_PREDICT_WITH_DATA = 0 when GT/GKP exist but is_predict_valid=False. Memory project_no_forecast_with_gt_gkp_is_low_conf covers this rule.
GSC-fresh-always — _CW_FRESH_GSC_PRESENT = 9 regardless of last_update. Memory project_gsc_is_always_fresh covers this: GSC is treated as always-fresh when present because the pipeline guarantees its recency.
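The freshness rules can be sketched as a ladder lookup. The ladder values and `_CW_FRESH_GSC_PRESENT = 9` are copied from this page; everything else is an assumption for illustration. In particular, the sketch reads each pair as (maximum age in months, points) with an inclusive bound, gives 0 points past the last rung, and sums the per-source points. `ladder_points` and `freshness_subscore` are hypothetical names.

```python
from typing import Optional

# Ladders copied from the freshness-staleness-ladder anchor.
# ASSUMPTION: each pair is (max_age_in_months, points), checked in order.
GT_LADDER = [(1, 9), (3, 8), (6, 6), (12, 4), (18, 2)]
GKP_LADDER = [(2, 7), (4, 6), (6, 4), (12, 2)]

# GSC earns a flat score whenever it is present, regardless of last_update.
_CW_FRESH_GSC_PRESENT = 9


def ladder_points(age_months: float, ladder: list) -> int:
    """Return the points for the first rung whose age bound covers the source."""
    for max_age, points in ladder:
        if age_months <= max_age:  # ASSUMPTION: bound is inclusive
            return points
    return 0  # ASSUMPTION: older than the last rung earns nothing


def freshness_subscore(
    gt_age: Optional[float], gkp_age: Optional[float], gsc_present: bool
) -> int:
    """Sum per-source freshness points; None means the source has no data."""
    total = 0
    if gt_age is not None:
        total += ladder_points(gt_age, GT_LADDER)
    if gkp_age is not None:
        total += ladder_points(gkp_age, GKP_LADDER)
    if gsc_present:
        total += _CW_FRESH_GSC_PRESENT
    return total


print(freshness_subscore(0.5, 1.0, True))  # 25
```

The top rungs (9 for GT, 7 for GKP) plus the flat 9 for GSC add up to 25, which matches the 0–25 range listed for the freshness subscore and is why the sketch sums the sources.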

Customer-facing surface

The headline confidence_score is uploaded to ES and shown in the product as a quality indicator on each keyword. The subscore breakdown is not displayed directly; it lives in this meta JSON for debugging and internal analysis.
