
processed_volume_trend_meta

SQL type: SimpleAggregateFunction(anyLast, Nullable(String)), holding JSON
Table: keywords.keywords_data_local
Customer-facing: Partial. The confidence subscore is shown; everything else is internal-only.
Last validated: 2026-05-21

What it is

A composite metadata blob attached to every approved keyword. It holds whatever per-keyword annotations the pipeline needs to remember between iterations without committing to a new column on the central hub table.

This field is deliberately a "junk drawer": tracking a new dimension typically means adding a key here rather than altering the keywords.keywords_data_local schema. The trade-off is to keep the table schema stable (the table has 200 × 256 partitions, so any DDL change is operationally expensive) at the cost of less strictly typed JSON storage.

For that reason this field has its own sub-section in the KB, with one page per family of keys.

The three current families

Family | Page | What it carries
Refresh-detection signals | refresh-signals | needs_gt, needs_gkp, tier assignments, per-signal evidence (surge / staleness / divergence / missing / shape-mismatch). Drives extract_gt.py and extract_gkp.py.
Confidence subscores | confidence-subscores | 0–100 score + coverage / agreement / freshness / forecast subscores. Memory project_confidence_meta_is_stale flags that stored values may use the old formula.
Trend-shape artefacts | trend-shape | trend_type, adi, cv_sq, bot_spike_months, gsc_position_match, prior_volume_trend (frozen shadow), last_gt_yyyymo, last_gkp_yyyymo, meta_written_at.
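
Because the stored JSON does not nest keys by family, a consumer that only cares about one family has to pick its keys out itself. The following is a minimal Python sketch of that grouping, not code from the pipeline. FAMILY_KEYS and split_by_family are hypothetical names; the trend-shape keys come from the table above, while the refresh and confidence key names (gt_tier, gkp_tier, confidence_score, confidence_breakdown) are taken from the example under "Overall schema" and may not be the complete set.

```python
# Hypothetical mapping from family page name to the keys it owns.
# Assumed from this page only; the family pages hold the full lists.
FAMILY_KEYS = {
    "refresh-signals": {"needs_gt", "needs_gkp", "gt_tier", "gkp_tier"},
    "confidence-subscores": {"confidence_score", "confidence_breakdown"},
    "trend-shape": {
        "trend_type", "adi", "cv_sq", "bot_spike_months", "gsc_position_match",
        "prior_volume_trend", "last_gt_yyyymo", "last_gkp_yyyymo", "meta_written_at",
    },
}


def split_by_family(meta):
    """Group a meta dict by family; unknown keys land under 'unassigned'."""
    grouped = {family: {} for family in FAMILY_KEYS}
    grouped["unassigned"] = {}
    for key, value in meta.items():
        for family, keys in FAMILY_KEYS.items():
            if key in keys:
                grouped[family][key] = value
                break
        else:
            # No family claims this key, e.g. an undocumented experiment key.
            grouped["unassigned"][key] = value
    return grouped
```

Anything that ends up under "unassigned" is a candidate for documenting on one of the family pages.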

How experimentation slots in

If you want to track a new per-keyword signal without adding a column:

  1. Decide which family it fits (or whether it warrants a new family).
  2. Add the key inside processing.py's meta-write path — detect_refresh_needs() for refresh signals, the confidence-score helpers for confidence, inline in the main loop for trend-shape.
  3. Document it on the matching page below. Add a new page only if you're starting a new family.
  4. Optional: extend the demo app or other consumers to read the new key.
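
Step 2 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual meta-write path in processing.py: add_meta_key and the key name exp_seasonality_flag are hypothetical. The sketch rewrites the whole dict rather than only the new key, on the assumption that with an anyLast column the most recent non-null value wins as a whole and is not merged key by key.

```python
import json


def add_meta_key(raw_meta, key, value):
    """Return the meta JSON string with one extra key set.

    raw_meta is the stored column value: a JSON string, or None because
    the column is Nullable(String).
    """
    meta = json.loads(raw_meta) if raw_meta else {}
    if not isinstance(meta, dict):
        raise ValueError("expected the meta blob to be a JSON object")
    meta[key] = value
    # sort_keys keeps the serialized form stable between writes.
    return json.dumps(meta, sort_keys=True)


# Hypothetical experiment key added to an existing blob.
updated = add_meta_key('{"needs_gt": false}', "exp_seasonality_flag", True)
```

No DDL is involved: the column type stays Nullable(String) and only the serialized content changes.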

The field is already large enough that a few extra keys per experiment add negligible overhead.

Overall schema

The actual JSON is a single dict that combines keys from all three families at the top level, with no per-family nesting. The example below shows one or two keys per family; see the family pages for the full lists:

{
  "needs_gt": false,
  "gt_tier": 0,
  "needs_gkp": true,
  "gkp_tier": 2,
  "confidence_score": 78,
  "confidence_breakdown": { "coverage": 22, "agreement": 11, "freshness": 19, "forecast": 26 },
  "trend_type": "smooth",
  "adi": 1.05,
  "cv_sq": 0.12,
  "bot_spike_months": {},
  "last_gt_yyyymo": 202604,
  "last_gkp_yyyymo": 202603,
  "meta_written_at": "2026-05-21T13:42:00Z"
}
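
A consumer reading this blob has to handle the Nullable(String) case before parsing. The sketch below shows one way to pull out the customer-facing confidence values; read_confidence is a hypothetical helper, not an existing function. In the example above the four subscores happen to add up to confidence_score (22 + 11 + 19 + 26 = 78), but this page does not confirm that as the formula, and stored values may predate the current one, so treat the sum as an observation rather than an invariant.

```python
import json


def read_confidence(raw_meta):
    """Return (score, breakdown) from a stored meta string, or (None, {}) if absent."""
    if raw_meta is None:
        return None, {}
    meta = json.loads(raw_meta)
    return meta.get("confidence_score"), meta.get("confidence_breakdown", {})


raw = (
    '{"confidence_score": 78, "confidence_breakdown": '
    '{"coverage": 22, "agreement": 11, "freshness": 19, "forecast": 26}}'
)
score, breakdown = read_confidence(raw)
subscore_total = sum(breakdown.values())
```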

See also

  • Central hub table — column lives here
  • processing.py — producer
  • extract_gt · extract_gkp — read the refresh flags
  • Archive: _archive/confidence_score.md, _archive/refresh_detection.md — original detailed writeups; Phase 3 will decompose into decision nodes