
processed_volume_trend

SQL type: SimpleAggregateFunction(anyLast, Nullable(String)), holding a JSON-serialised {"YYYYMM": volume, …} dict
Table: keywords.keywords_data_local
Last validated: 2026-05-21

What it is

The monthly time series behind the chart you see in the product, with a 12-month forecast appended to the right edge. It is a JSON dict keyed by YYYYMM string, and each value is an integer volume. It covers every month from trend_start_yyyymo (typically 2015-09) through today_yyyymo + 12 (the forecast horizon).

How it's computed

The full Stage 2 derivation lives in processing.py around the central while/for loop. Roughly:

  1. Blend JS + GSC into combined_volume_data by taking max(JS, GSC) per month.
  2. Bot-spike repair — replace uncorroborated GSC spikes with GT-implied values (processing.py:KB-ANCHOR:gsc-spike-factor).
  3. GKP blend — apply processing.py:KB-ANCHOR:gkp-blend-routing (<0.1 proportion) and gkp-blend-weights (0.8/0.2).
  4. GT blend — apply processing.py:KB-ANCHOR:gt-blend-weights (0.8/0.2), with gt-2x-soft-cap reverting blow-ups, gt-collapse-override for collapsing peaks, and gt-gkp-overlap-cut trimming bad GT.
  5. Incomplete-month blend — quadratic-completeness extrapolation for the current month (processing.py:KB-ANCHOR:incomplete-month-blend-quadratic).
  6. Forecast — 12-month forecast via the trend-type-routed model (processing.py:KB-ANCHOR:adi-cv2-trend-type-routing → AutoCES / Holt-Winters / CrostonSBA / TSB), with forecast-outlier-filter, seasonal-corr-alt-gate, forecast-model-ranking-cascade, and forecast-decay-floor shaping the result.
  7. Serialise the merged + forecast dict as JSON and write.
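Steps 1 and 7 are the easiest to illustrate in isolation. The sketch below assumes JS and GSC each arrive as a dict of YYYYMM string to volume. The function names blend_max and serialise_trend are hypothetical and not taken from processing.py, and using the available value when a month is present in only one source is an assumption.

```python
import json


def blend_max(js: dict, gsc: dict) -> dict:
    """Per-month max(JS, GSC); a month missing from one source falls back to the other (assumption)."""
    months = sorted(set(js) | set(gsc))
    return {m: max(js.get(m, 0), gsc.get(m, 0)) for m in months}


def serialise_trend(observed: dict, forecast: dict) -> str:
    """Merge observed and forecast months into one YYYYMM-keyed JSON string, sorted by month."""
    merged = {**observed, **forecast}
    return json.dumps(dict(sorted(merged.items())))


# Illustrative values only.
js = {"202604": 28000, "202605": 21000}
gsc = {"202604": 25000, "202605": 27500}
combined_volume_data = blend_max(js, gsc)
payload = serialise_trend(combined_volume_data, {"202606": 27800})
```

The intermediate steps (bot-spike repair, GKP and GT blends, incomplete-month blend, forecast) are deliberately left out, since their exact rules live behind the KB anchors listed above.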

Gates

  • Only built when outer_gate_passed = True, i.e. processed_ke_approved == 1 AND (volume > 200 OR JS-avg > 100) (processing.py:KB-ANCHOR:volume-200-display-gate).
  • Forecast is appended only when is_predict_valid is true, which requires either GT or GKP data plus a non-empty combined_volume_data_sorted.
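Read as predicates, the two gates might look like the sketch below. The grouping of AND and OR in the outer gate is an assumption, inferred from the sub-gate edge case (volume ≤ 200 AND JS-avg ≤ 100 writes null). The parameter names has_gt and has_gkp are hypothetical stand-ins for however processing.py detects available GT and GKP data.

```python
def outer_gate_passed(processed_ke_approved: int, volume: float, js_avg: float) -> bool:
    """Approved keyword AND (volume above 200 OR JS average above 100)."""
    return processed_ke_approved == 1 and (volume > 200 or js_avg > 100)


def is_predict_valid(has_gt: bool, has_gkp: bool, combined_volume_data_sorted: list) -> bool:
    """The forecast needs GT or GKP data plus a non-empty observed series."""
    return (has_gt or has_gkp) and len(combined_volume_data_sorted) > 0
```

Both thresholds are strict, so a keyword sitting exactly at volume 200 and JS-avg 100 fails the outer gate.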

Schema

{
  "201509": 12000,
  "201510": 13400,
  …
  "202604": 28000,
  "202605": 27500,    // last observed month
  "202606": 27800,    // forecast starts
  "202607": 28100,
  …
  "202705": 26900     // 12-month horizon
}

There is no flag inside the JSON distinguishing observed from forecast. The boundary is the implicit "today" mark known to the renderer: keys after the current month are forecast.
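A consumer that needs the two portions separately therefore has to apply the "today" boundary itself. The sketch below shows one way to do that, assuming the stored string is plain JSON (the // comments in the schema are annotations) and that the current month counts as observed, as in the schema example. The function name split_trend is hypothetical.

```python
from __future__ import annotations

import json
from datetime import date


def split_trend(raw: str | None, today: date) -> tuple[dict, dict]:
    """Split a processed_volume_trend payload into (observed, forecast)."""
    if raw is None:  # the column is Nullable; no chart is rendered for null
        return {}, {}
    series = json.loads(raw)
    today_yyyymo = f"{today.year:04d}{today.month:02d}"
    # YYYYMM keys are fixed-width, so string comparison matches chronological order.
    observed = {k: v for k, v in series.items() if k <= today_yyyymo}
    forecast = {k: v for k, v in series.items() if k > today_yyyymo}
    return observed, forecast


raw = '{"202604": 28000, "202605": 27500, "202606": 27800, "202705": 26900}'
observed, forecast = split_trend(raw, date(2026, 5, 21))
```

Because the split depends on the date at read time, the same stored value yields a different observed/forecast boundary if it is read in a later month than it was written.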

Edge cases

  • processed_volume_trend_meta.last_gt_yyyymo / last_gkp_yyyymo mark the boundary above which observed values transition to GT-blended-only. The prior-trend shadow logic (frozen above the boundary) prevents bot-corrected months from flapping between iterations.
  • Sub-gate keywords (volume ≤ 200 AND JS-avg ≤ 100) write null here — no chart rendered.
  • Isolated GT peak (processing.py:KB-ANCHOR:isolated-gt-peak-forecast-skip) suppresses the forecast portion only; observed months are still written.

Downstream

  • The product chart renders this directly.
  • processed_growth is computed over this series.
  • processed_keyword_volume is the 12-month rolling average of the observed portion.
  • Global aggregation sums per-country versions into processed_global_volume_trend.
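Two of these downstream computations can be sketched directly from the description. The code below is an illustration under stated assumptions, not the production logic: it takes "12-month rolling average" to mean the mean of the most recent 12 observed months, and it treats a month missing from one country's series as contributing 0 to the global sum. Both function names are hypothetical.

```python
from __future__ import annotations

from collections import Counter


def last_12_month_average(observed: dict) -> float | None:
    """Mean of the most recent 12 observed months (fewer if the series is shorter)."""
    recent = [observed[m] for m in sorted(observed)[-12:]]
    return sum(recent) / len(recent) if recent else None


def sum_per_country(trends: list) -> dict:
    """Month-wise sum across per-country series, sorted by YYYYMM key."""
    total = Counter()
    for trend in trends:
        total.update(trend)  # Counter.update adds values per key
    return dict(sorted(total.items()))


# Illustrative values only.
us = {"202604": 10, "202605": 5}
de = {"202605": 7}
global_trend = sum_per_country([us, de])
```

Only the observed portion should be passed to last_12_month_average, since the stored series also contains forecast months.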

See also