Jumpshot (JS)
Code: kvprocessor.cpp — PullJSData() L464–L484, PullJSFirstSeenData() L486–L555, pulled at init L1009.
Last validated: 2026-05-21
What it carries
Per-keyword absolute monthly search volumes, organic / paid click rates, user counts, and a first_seen date, all derived from Jumpshot's historical clickstream panel. The data is frozen (Jumpshot shut down in 2020 and we have a one-time scrape) but valuable because it is the only source of clickstream-grade volume signal we still have. Newer sources (GSC, GT, GKP) don't capture clickstream behaviour the same way.
Where it lives and how it gets here
Two source tables on isog: keywords_volume.js_keywords_volume (main metrics) and jumpshot.keyword_first_seen (first-seen metadata). Both were populated once from a historical Jumpshot export; there is no extract_js.py — see memory project_js_is_stale_source.
kvprocessor.cpp pulls both tables on every run regardless (so newly added shards / ranges pick up the historical data), but the upstream content never changes.
Why JS is still in the pipeline despite being stale
- High-quality clickstream baseline that no longer exists anywhere. Real user-panel click data of this scope died with Jumpshot. For older keywords, JS is the most authoritative historical volume signal available.
- Fills coverage where GT / GKP / GSC have nothing. The long tail of pre-2020 keywords has thin or no GSC history; JS gives them a credible anchor.
- The confidence-coverage scorer treats JS as presence-only, with no recency penalty (memory: project_js_is_stale_source). Other sources lose points if stale; JS doesn't, since "stale" is the only state it has.
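The presence-only treatment in the last bullet can be sketched as follows. This is a minimal illustration, not the actual scorer: the function name `source_score`, the weights, the 365-day cutoff, and the 0.5 stale factor are all hypothetical values chosen for the example.

```python
from datetime import date
from typing import Optional

# Hypothetical weights and thresholds, chosen only for illustration.
SOURCE_WEIGHT = {"js": 1.0, "gsc": 1.0, "gt": 1.0, "gkp": 1.0}
STALE_AFTER_DAYS = 365
STALE_FACTOR = 0.5


def source_score(source: str, present: bool,
                 last_updated: Optional[date], today: date) -> float:
    """Score one source's contribution to confidence coverage (sketch)."""
    if not present:
        return 0.0
    weight = SOURCE_WEIGHT[source]
    if source == "js":
        # JS is frozen, so it is scored on presence alone: no recency penalty.
        return weight
    # Every other source loses points once its data is stale or undated.
    if last_updated is None or (today - last_updated).days > STALE_AFTER_DAYS:
        return weight * STALE_FACTOR
    return weight
```

Under these assumed numbers, a JS row last touched in 2019 still scores its full weight, while a GSC row of the same age would be halved.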
What JS surfaces in the pipeline
Read by processing.py directly:
| JS column | Used for | processing.py reference |
|---|---|---|
| js_organic_p | Customer-facing processed_organic_p (with overrides for domain / branded keywords) | L1997, L2002 |
| js_users | Row-inclusion gate (a keyword is "kept" if js_users > 1, alongside GSC and approval criteria) | L1820 |
| js_predicted_volumes, js_months | Baseline historical volume: last_12months_js_avg_sv blended into volume merge; long-history floor when GT precedes JS | L1975, L2079, L2252–L2260 |
| js_first_seen | Keyword age / first-seen heuristic; combined with gsc_first_seen to set processed_first_seen | L1916–L1921 |
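To make the table concrete, here is a hedged sketch of three of the JS-derived computations. Only the `js_users > 1` threshold comes from the table; the function names, the assumed shape of js_months (sortable `YYYY-MM` strings aligned with js_predicted_volumes), and the "earliest date wins" rule for first-seen are assumptions for illustration and may not match processing.py.

```python
from datetime import date
from typing import Optional, Sequence


def passes_js_gate(js_users: Optional[int]) -> bool:
    """JS side of the row-inclusion gate: js_users > 1.

    The GSC and approval criteria from the table are not modelled here.
    """
    return js_users is not None and js_users > 1


def last_12_months_avg(js_months: Sequence[str],
                       js_predicted_volumes: Sequence[float]) -> Optional[float]:
    """Mean volume over the 12 most recent months present in the JS data.

    Assumes js_months holds sortable 'YYYY-MM' strings aligned
    index-by-index with js_predicted_volumes.
    """
    if len(js_months) != len(js_predicted_volumes):
        raise ValueError("js_months and js_predicted_volumes must align")
    pairs = sorted(zip(js_months, js_predicted_volumes))
    recent = [volume for _, volume in pairs[-12:]]
    return sum(recent) / len(recent) if recent else None


def combine_first_seen(js_first_seen: Optional[date],
                       gsc_first_seen: Optional[date]) -> Optional[date]:
    """Assumed rule: the earliest non-missing date wins."""
    candidates = [d for d in (js_first_seen, gsc_first_seen) if d is not None]
    return min(candidates) if candidates else None
```

Because the JS data is frozen, "most recent 12 months" here means the last 12 months that exist in the export, not the 12 months before today.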
Stored in keywords_data_local but not read by processing.py:
- js_clicks_per_search
- js_clicks_per_search_organic
- js_clicks_per_search_paid
- js_searches
- js_searches_with_clicks_paid_only
- js_searches_with_clicks_organic_only
- js_searches_with_clicks_organic_paid
These columns flow through to the upload step (ClickHouse keywords_metrics_local and Elasticsearch) and are surfaced to customers directly — they're not transformed by processing.py. The exact customer-facing surface is documented in Consumers (Phase 5).
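A minimal sketch of the pass-through behaviour, assuming rows are plain dicts keyed by column name. The helper `js_passthrough` is hypothetical and the real upload step may select columns differently; the column names are the ones listed above.

```python
# Columns stored in keywords_data_local but not read by processing.py.
JS_PASSTHROUGH_COLUMNS = (
    "js_clicks_per_search",
    "js_clicks_per_search_organic",
    "js_clicks_per_search_paid",
    "js_searches",
    "js_searches_with_clicks_paid_only",
    "js_searches_with_clicks_organic_only",
    "js_searches_with_clicks_organic_paid",
)


def js_passthrough(row: dict) -> dict:
    """Copy the pass-through JS columns unchanged; absent columns are skipped."""
    return {col: row[col] for col in JS_PASSTHROUGH_COLUMNS if col in row}
```

Columns that processing.py does read, such as js_users, are deliberately left out of this selection.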
See also
- Central hub table: js_* columns
- kvprocessor: PullJSData() / PullJSFirstSeenData()
- Decisions: JS-presence influence on GKP all-zero acceptance, JS-vs-GSC volume blend, organic-% overrides
- Consumers: which JS columns the upload step pushes to customer-facing stores (Phase 5)
- Column-level reference: TBD (../fields/)