processed_ke_approved
SQL type: SimpleAggregateFunction(anyLast, Nullable(UInt8)) — 0 or 1
Table: keywords.keywords_data_local
Customer-facing: No (internal quality flag)
Last validated: 2026-05-21
What it is
The pipeline's quality flag: 1 if the keyword passes every quality check, 0 if any check fails. This is the master "do we trust / display anything for this keyword?" signal; every customer-facing derived field is gated on it.
How it's computed
A sequence of AND-combined quality checks in processing.py around L2002–L2025. If any of these fails, processed_ke_approved = 0:
| Check | Rule | Anchor / location |
|---|---|---|
| Length | keyword text ≤ 100 characters | inline check |
| Word count | ≤ 32 words | is_gibberish_query family |
| Not gibberish | is_gibberish_query(keyword) returns False | processing.py:L210 |
| Not a Google search operator | is_google_search_operator(keyword) returns False | processing.py:L180 |
| Legit GSC impressions | is_legit_impresions(impressions_array, month_array) returns True | processing.py:KB-ANCHOR:is-legit-impressions-3mo |
| Not JS-garbage | js_first_seen AND gsc_first_seen aren't both missing | inline |
| Not invalid-long-with-no-data | filters keywords that are >50 chars AND have no GSC/JS/GT/GKP signal | inline |
| Not zero-GT-with-high-GSC anomaly | filters the specific anomaly pattern where GT is flat-zero but GSC is high (bot trap) | inline |
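The AND-combination in the table can be sketched as follows. This is a minimal illustration, not the code in processing.py: the CheckResults container, its field names, and approval_flag are hypothetical, and the whitespace-based word count in word_count_ok is an assumption about how words are counted. Only the two thresholds (100 characters, 32 words) come from the table.

```python
from dataclasses import dataclass


def length_ok(keyword: str) -> bool:
    """Length rule from the table: keyword text is at most 100 characters."""
    return len(keyword) <= 100


def word_count_ok(keyword: str) -> bool:
    """Word-count rule from the table: at most 32 words.

    Assumption: words are separated by whitespace.
    """
    return len(keyword.split()) <= 32


@dataclass
class CheckResults:
    """Outcome of each quality check for one keyword (True = passed).

    Hypothetical container; one field per row of the table.
    """
    length_ok: bool
    word_count_ok: bool
    not_gibberish: bool
    not_search_operator: bool
    legit_impressions: bool
    not_js_garbage: bool
    not_invalid_long_no_data: bool
    not_zero_gt_high_gsc: bool


def approval_flag(checks: CheckResults) -> int:
    """Return 1 only if every check passed, otherwise 0."""
    return int(all(vars(checks).values()))
```

Because the checks are AND-combined, a single failing field is enough to produce 0, regardless of how the other checks come out.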
Gate semantics
- = 1: keyword is approved. processed_keyword_volume, processed_volume_trend, processed_growth, etc. are all computed normally and may be displayed (subject to their own gates, like volume > 200).
- = 0: keyword failed approval. Most derived columns are written as null or 0, and the keyword is dropped from PE / customer-facing surfaces.

processed_ke_approved == 1 is a necessary precondition for the volume-200-display-gate: outer_gate_passed = (processed_ke_approved == 1) AND (volume > 200 OR js > 100).
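The outer-gate expression translates directly into a small predicate. The sketch below assumes volume and js are already plain numbers; how the real pipeline treats missing values for those two inputs is not stated here, so that case is left out. A NULL approval flag is treated as not approved, which follows from the == 1 comparison.

```python
from typing import Optional


def outer_gate_passed(processed_ke_approved: Optional[int],
                      volume: float,
                      js: float) -> bool:
    """(processed_ke_approved == 1) AND (volume > 200 OR js > 100)."""
    approved = processed_ke_approved == 1  # None (NULL) compares as not approved
    return approved and (volume > 200 or js > 100)


print(outer_gate_passed(1, 250, 0))      # True: approved, volume above 200
print(outer_gate_passed(1, 200, 100))    # False: both thresholds are strict
print(outer_gate_passed(0, 1000, 1000))  # False: approval is a precondition
```

Note that both comparisons are strict, so a keyword sitting exactly at volume 200 and js 100 does not pass.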
Edge cases
- Re-approval after data arrives: a keyword that was previously unapproved (no data) can flip to approved when GSC data lands. The pipeline writes the new processed_* columns on the next iteration; anyLast semantics on the table mean the old nulls are overwritten.
- Re-disapproval is rare: once approved, keywords typically stay approved unless their data goes stale beyond the freshness thresholds.
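The overwrite behaviour described for re-approval can be modelled with a toy function. This is only a sketch of the idea, assuming the column follows ClickHouse's default anyLast handling of Nullable values (NULLs are skipped, so the last non-NULL write wins). The any_last helper is hypothetical, and the order in which the real table merges rows is decided by ClickHouse, not by this code.

```python
from typing import Iterable, Optional


def any_last(values: Iterable[Optional[int]]) -> Optional[int]:
    """Return the last non-NULL value in write order, or None if all are NULL."""
    result = None
    for value in values:
        if value is not None:
            result = value
    return result


# Keyword had no data (NULL) on earlier iterations, then was approved.
print(any_last([None, None, 1]))  # 1
# A later 0 is a real value, so it replaces an earlier 1.
print(any_last([1, 0]))           # 0
```

Under this assumption a later NULL write would not erase an earlier value, which is why a disapproval has to be written as an explicit 0 to take effect.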
Downstream
- Gates every customer-facing field.
- Read by pe_update.py for PE-tracking selection.
- Indirectly drives the row-inclusion query: the upload-step query includes WHERE … OR processed_ke_approved = 1.
See also
- Central hub table
- processing.py (producer)
- is-legit-impressions-3mo anchor (one of the gating checks)
- pe-update (biggest internal consumer)