processed_volume_trend_meta
SQL type: SimpleAggregateFunction(anyLast, Nullable(String)) — JSON
Table: keywords.keywords_data_local
Customer-facing: Partial — confidence subscore is shown; everything else is internal-only.
Last validated: 2026-05-21
What it is
A composite metadata blob attached to every approved keyword. Holds whatever per-keyword annotation the pipeline needs to remember between iterations without committing to a new column on the central hub.
This field is deliberately a "junk drawer": tracking a new dimension typically means adding a key here rather than altering the keywords.keywords_data_local schema. The trade is a stable table schema (the table has 200 × 256 partitions, so any DDL change is operationally expensive) in exchange for less-typed JSON storage.
For that reason this field has its own sub-section in the KB, with one page per family of keys.
The three current families
| Family | Page | What it carries |
|---|---|---|
| Refresh-detection signals | refresh-signals | needs_gt, needs_gkp, tier assignments, per-signal evidence (surge / staleness / divergence / missing / shape-mismatch). Drives extract_gt.py and extract_gkp.py. |
| Confidence subscores | confidence-subscores | 0–100 score + coverage / agreement / freshness / forecast subscores. Memory project_confidence_meta_is_stale flags that stored values may use the old formula. |
| Trend-shape artefacts | trend-shape | trend_type, adi, cv_sq, bot_spike_months, gsc_position_match, prior_volume_trend (frozen shadow), last_gt_yyyymo, last_gkp_yyyymo, meta_written_at. |
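As a rough illustration of how the families relate to the flat blob, the sketch below groups a decoded meta dict by family. The key lists are transcribed from the table above; treating gt_tier and gkp_tier as the "tier assignments" is an assumption based on the schema example on this page, and the per-signal evidence keys are not listed here, so they would land under "unknown". The names FAMILY_KEYS and split_by_family are hypothetical and not part of processing.py.

```python
# Key-to-family mapping transcribed from the table on this page.
# Assumption: "tier assignments" means gt_tier and gkp_tier.
FAMILY_KEYS = {
    "refresh-signals": {"needs_gt", "needs_gkp", "gt_tier", "gkp_tier"},
    "confidence-subscores": {"confidence_score", "confidence_breakdown"},
    "trend-shape": {
        "trend_type", "adi", "cv_sq", "bot_spike_months", "gsc_position_match",
        "prior_volume_trend", "last_gt_yyyymo", "last_gkp_yyyymo", "meta_written_at",
    },
}


def split_by_family(meta: dict) -> dict:
    """Group a flat meta dict into {family: {key: value}}.

    Keys that match no known family are collected under "unknown".
    """
    grouped = {family: {} for family in FAMILY_KEYS}
    grouped["unknown"] = {}
    for key, value in meta.items():
        for family, keys in FAMILY_KEYS.items():
            if key in keys:
                grouped[family][key] = value
                break
        else:
            # No family claimed this key (for example, an experimental key).
            grouped["unknown"][key] = value
    return grouped
```

A non-empty "unknown" bucket is a cheap signal that a key was added without being documented on a family page.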
How experimentation slots in
If you want to track a new per-keyword signal without adding a column:
- Decide which family it fits (or whether it warrants a new family).
- Add the key inside processing.py's meta-write path: detect_refresh_needs() for refresh signals, the confidence-score helpers for confidence, inline in the main loop for trend-shape.
- Document it on the matching page below. Add a new page only if you're starting a new family.
- Optional: extend the demo app or other consumers to read the new key.
The blob is already large enough that a few extra keys per experiment add negligible storage cost.
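The steps above might look roughly like the following when a new key is merged into an existing blob. This is a minimal sketch, not the actual meta-write path in processing.py: the helper merge_meta_keys and the key experiment_flag are hypothetical, and refreshing meta_written_at on every write is an assumption.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def merge_meta_keys(existing_raw: Optional[str], new_keys: dict,
                    now: Optional[datetime] = None) -> str:
    """Merge new_keys into the flat meta dict and return the updated JSON string.

    existing_raw may be None because the column is Nullable(String).
    now should be timezone-aware; it defaults to the current UTC time.
    """
    meta = json.loads(existing_raw) if existing_raw else {}
    meta.update(new_keys)  # new keys overwrite existing keys of the same name
    stamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Assumption: the write path refreshes meta_written_at on every write.
    meta["meta_written_at"] = stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
    return json.dumps(meta, separators=(",", ":"), sort_keys=True)


updated = merge_meta_keys('{"needs_gt": false}', {"experiment_flag": True})
```

Because the column type uses anyLast, the most recently written blob is the one that survives aggregation, so a writer in this style would need to carry forward every existing key rather than emit only the new ones.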
Overall schema
The actual JSON is a single flat dict combining keys from all three families. The example shows a subset of keys per family; see the family pages for full lists:
{
  "needs_gt": false,
  "gt_tier": 0,
  "needs_gkp": true,
  "gkp_tier": 2,
  "confidence_score": 78,
  "confidence_breakdown": { "coverage": 22, "agreement": 11, "freshness": 19, "forecast": 26 },
  "trend_type": "smooth",
  "adi": 1.05,
  "cv_sq": 0.12,
  "bot_spike_months": {},
  "last_gt_yyyymo": 202604,
  "last_gkp_yyyymo": 202603,
  "meta_written_at": "2026-05-21T13:42:00Z"
}
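A consumer reading this column has to cope with NULL values (the SQL type is Nullable(String)) and, for customer-facing surfaces, should expose only the confidence keys. The sketch below shows one way to do both. The function names are hypothetical, and treating confidence_score and confidence_breakdown as the complete customer-facing set is an assumption drawn from the "Customer-facing: Partial" note at the top of this page.

```python
import json
from typing import Optional

# Assumption: these two keys are the only ones safe to show to customers.
CUSTOMER_FACING_KEYS = ("confidence_score", "confidence_breakdown")


def decode_meta(raw: Optional[str]) -> dict:
    """Decode the column value into a dict; NULL or an empty string becomes {}."""
    if not raw:
        return {}
    meta = json.loads(raw)
    if not isinstance(meta, dict):
        raise ValueError("expected a JSON object at the top level")
    return meta


def customer_view(raw: Optional[str]) -> dict:
    """Return only the customer-facing keys that are present in the blob."""
    meta = decode_meta(raw)
    return {key: meta[key] for key in CUSTOMER_FACING_KEYS if key in meta}
```

Filtering with an allow-list rather than a deny-list means that new experimental keys stay internal by default.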
See also
- Central hub table — column lives here
- processing.py — producer
- extract_gt · extract_gkp — read the refresh flags
- Archive: _archive/confidence_score.md, _archive/refresh_detection.md (original detailed writeups; Phase 3 will decompose these into decision nodes)