Forecast-divergence refresh signal
Third of the five refresh signals. Fires when the blended trend (the model's view of "what's really happening") drifts too far from raw GSC impressions (the ground-truth recent signal). A divergence means GT/GKP-driven smoothing is moving the published curve away from observed GSC — fresh GT/GKP can resolve the conflict.
What it is
Signal name: "divergence". Fires when, across the 3 most recent months present in both the blended series and raw GSC, the gap between the two averages exceeds 30% of the raw GSC average. The check only runs for keywords whose processed volume is at least 100.
How it's computed
At processing.py:KB-ANCHOR:refresh-signal-forecast-divergence:

    common_months = sorted(set(combined_volume_data_sorted.keys())
                           & set(impressions_dict_raw.keys()))
    if len(common_months) >= 3 and processed_keyword_volume >= 100:
        last_3 = common_months[-3:]
        forecast_avg = mean(combined_volume_data_sorted[m] for m in last_3)
        raw_avg = mean(impressions_dict_raw[m] for m in last_3)
        divergence = abs(forecast_avg - raw_avg) / raw_avg
        if divergence > 0.3:
            strength = min(divergence, 5.0)

| Side | Component |
|---|---|
| GT | min(2.0, strength × 3.0) |
| GKP | min(2.5, strength × 4.0) |
GKP gets the larger contribution because GKP is the most-likely source of the smoothing pulling the blend away from GSC — refreshing GKP often resolves the divergence.
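The excerpt and the component table can be pulled together into one self-contained sketch. The function name `divergence_signal`, its return shape, the "YYYY-MM" month keys, and the guard against a zero `raw_avg` are assumptions made for illustration and are not taken from processing.py; the thresholds, caps, and multipliers are the ones listed above.

```python
from statistics import mean


def divergence_signal(combined_volume_data_sorted, impressions_dict_raw,
                      processed_keyword_volume):
    """Return (gt_component, gkp_component), or None if the signal does not fire.

    Hypothetical wrapper around the documented logic, not the real function.
    """
    # Only months present in both series are comparable.
    common_months = sorted(set(combined_volume_data_sorted)
                           & set(impressions_dict_raw))
    if len(common_months) < 3 or processed_keyword_volume < 100:
        return None

    last_3 = common_months[-3:]
    forecast_avg = mean(combined_volume_data_sorted[m] for m in last_3)
    raw_avg = mean(impressions_dict_raw[m] for m in last_3)
    if raw_avg == 0:
        # Assumption: skip the signal rather than divide by zero.
        return None

    divergence = abs(forecast_avg - raw_avg) / raw_avg
    if divergence <= 0.3:
        return None

    strength = min(divergence, 5.0)
    gt_component = min(2.0, strength * 3.0)
    gkp_component = min(2.5, strength * 4.0)
    return gt_component, gkp_component


# Illustrative data: the blend sits well below raw GSC for three months.
blended = {"2024-01": 900, "2024-02": 950, "2024-03": 1000, "2024-04": 1050}
raw_gsc = {"2024-01": 1400, "2024-02": 1500, "2024-03": 1600}
result = divergence_signal(blended, raw_gsc, processed_keyword_volume=1000)
print(result)
```

With these made-up numbers the averages are 950 and 1500, so the divergence is about 0.37 and both components land below their caps.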
Why this choice
Empirically, 30% is the noise floor. Below that, blended-vs-raw GSC differences are dominated by GT smoothing of normal month-to-month noise rather than a meaningful disagreement. Above 30%, the gap usually traces to one specific cause: stale GKP holding the blend down on a real surge, a bot spike that GT correctly suppressed, or a seasonal turn the model lagged.
The volume ≥ 100 floor is stricter than the surge signal's ≥ 50 floor: relative-divergence ratios are even more volatile on small-volume keywords than relative-change ratios, so a higher floor avoids garbage-triggered refreshes.
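A rough illustration of why small volumes are volatile: the same absolute gap produces very different relative divergences depending on the size of the raw average. The helper name and the numbers are hypothetical, and the raw GSC average is used here as a stand-in for keyword volume.

```python
def relative_divergence(forecast_avg: float, raw_avg: float) -> float:
    """Relative gap between the blended and raw averages, as in the signal."""
    return abs(forecast_avg - raw_avg) / raw_avg


gap = 20  # identical absolute gap in both cases
small_keyword = relative_divergence(40 + gap, 40)      # 20 / 40
large_keyword = relative_divergence(4000 + gap, 4000)  # 20 / 4000

# Only the small keyword crosses the 30% threshold.
print(small_keyword > 0.3, large_keyword > 0.3)  # prints: True False
```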
The asymmetric 3.0× / 4.0× multipliers on GT/GKP components (vs the surge signal's 1.0×) reflect that a confirmed divergence is a higher-quality signal than a raw GSC swing — the swing might be GSC noise, but a divergence after blending means something in the pipeline is contradicting something else.
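To make the multipliers concrete, the sketch below tabulates the GT and GKP components defined earlier for a few divergence values. The helper name `components` is hypothetical; the caps and multipliers are the documented ones. Under those numbers the GKP component reaches its 2.5 cap at a divergence of 0.625 and the GT component reaches its 2.0 cap at roughly 0.667, so both are saturated well before the 5.0 limit on strength comes into play.

```python
def components(divergence: float) -> tuple[float, float]:
    """Return (GT, GKP) components for a divergence that already passed 0.3."""
    strength = min(divergence, 5.0)
    return min(2.0, strength * 3.0), min(2.5, strength * 4.0)


for d in (0.31, 0.5, 0.625, 0.7, 5.0):
    gt, gkp = components(d)
    print(f"divergence={d:.3f}  GT={gt:.2f}  GKP={gkp:.2f}")
```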
Edge cases
- Forecasted-only months: when `combined_volume_data_sorted` contains months past the GSC tail (forecasted), they are excluded from the comparison because the intersection-of-keys filter drops them.
- Newly added keyword: fewer than 3 overlapping months suppresses the signal even if the relative divergence is huge. A real divergence on a young keyword has to wait for its 3rd overlapping month.
- Brand keywords with high GT smoothing: high-volume brands often show a steady-state blended-vs-GSC gap of 5–10% because GT compresses brand peaks. The 30% floor leaves that alone.
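The three edge cases can be checked with a few lines of plain Python. All data below is invented for illustration, and the "YYYY-MM" month keys are an assumption about the key format.

```python
# Forecasted-only months: the blend extends past the GSC tail.
blended = {"2024-01": 500, "2024-02": 520, "2024-03": 540,
           "2024-04": 560, "2024-05": 580}  # last two months are forecasts
raw_gsc = {"2024-01": 480, "2024-02": 510, "2024-03": 530}
common_months = sorted(set(blended) & set(raw_gsc))
print(common_months)  # forecast months are absent from the intersection

# Newly added keyword: only two overlapping months, so the gate stays shut
# even though the relative gap is very large.
young_blended = {"2024-04": 900, "2024-05": 950}
young_raw = {"2024-04": 100, "2024-05": 120}
young_common = sorted(set(young_blended) & set(young_raw))
young_can_fire = len(young_common) >= 3
print(young_can_fire)

# Brand keyword: an 8% steady-state gap stays under the 30% threshold.
brand_divergence = abs(920 - 1000) / 1000
brand_fires = brand_divergence > 0.3
print(brand_fires)
```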
See also
- GSC surge refresh signal — the other "GSC says something is up" signal
- GT 2× soft cap (Sub-batch 3d) — the blending decision that most often drives divergence
- Refresh priority formula
- Archive: `_archive/refresh_detection.md`