Jumpshot (JS)

Code: kvprocessor.cpp, PullJSData() L464–L484, PullJSFirstSeenData() L486–L555, pulled at init L1009. Last validated: 2026-05-21

What it carries

Per-keyword absolute monthly search volumes, organic / paid click rates, user counts, and a first_seen date, all derived from Jumpshot's historical clickstream panel. The data is frozen: Jumpshot shut down in 2020 and we hold a one-time scrape. It remains valuable because it is the only clickstream-grade volume signal we still have; the newer sources (GSC, GT, GKP) don't capture clickstream behaviour the same way.

Where it lives and how it gets here

Two source tables on isog: keywords_volume.js_keywords_volume (main metrics) and jumpshot.keyword_first_seen (first-seen metadata). Both were populated once from a historical Jumpshot export; there is no extract_js.py — see memory project_js_is_stale_source.

kvprocessor.cpp still pulls the JS data on every run, so that newly added shards / ranges pick up the historical data, even though the upstream content never changes.

Why JS is still in the pipeline despite being stale

  • High-quality clickstream baseline that is no longer available from any other source. Real user-panel click data of this scope died with Jumpshot. For older keywords, JS is the most authoritative historical volume signal available.
  • Fills coverage where GT / GKP / GSC have nothing. The long tail of pre-2020 keywords has thin or no GSC history; JS gives them a credible anchor.
  • Confidence-coverage scorer treats JS as presence-only — no recency penalty (memory: project_js_is_stale_source). Other sources lose points if stale; JS doesn't, since "stale" is the only state it has.
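The presence-only treatment in the last bullet can be sketched as follows. This is a minimal illustration, not the scorer's actual code: the function name source_score, the linear decay, and the 365-day window are all assumptions made for the example. Only the rule itself (JS scores on presence, other sources lose points when stale) comes from the text above.

```python
from datetime import date
from typing import Optional

# Hypothetical decay window; the real scorer's parameters are not documented here.
RECENCY_WINDOW_DAYS = 365


def source_score(source: str, last_updated: Optional[date], today: date,
                 weight: float = 1.0) -> float:
    """Contribution of one source to a keyword's coverage confidence (sketch)."""
    if last_updated is None:
        return 0.0  # the source has no data for this keyword
    if source == "js":
        # Presence-only: JS is frozen, so staleness is never penalised.
        return weight
    # Assumed linear recency decay for every other source, clamped to [0, 1].
    age_days = (today - last_updated).days
    freshness = min(1.0, max(0.0, 1.0 - age_days / RECENCY_WINDOW_DAYS))
    return weight * freshness
```

With this shape, a JS row from 2019 still contributes its full weight, while a GSC row of the same age contributes nothing.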

What JS surfaces in the pipeline

Read by processing.py directly:

JS column | Used for | processing.py reference
js_organic_p | Customer-facing processed_organic_p (with overrides for domain / branded keywords) | L1997, L2002
js_users | Row-inclusion gate (a keyword is "kept" if js_users > 1, alongside GSC and approval criteria) | L1820
js_predicted_volumes, js_months | Baseline historical volume: last_12months_js_avg_sv blended into volume merge; long-history floor when GT precedes JS | L1975, L2079, L2252–L2260
js_first_seen | Keyword age / first-seen heuristic; combined with gsc_first_seen to set processed_first_seen | L1916–L1921
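As a rough illustration of two of the rows above, the sketch below shows a row-inclusion gate and a first-seen merge. It is not taken from processing.py. The function names, the has_gsc_data and is_approved flags, the OR-combination of the gate criteria, and the "earliest date wins" rule are all assumptions; only the js_users > 1 threshold and the column names come from the table.

```python
from datetime import date
from typing import Optional


def is_kept(js_users: Optional[int], has_gsc_data: bool, is_approved: bool) -> bool:
    """Row-inclusion gate (sketch). js_users > 1 is documented; treating the
    JS, GSC and approval criteria as alternatives is an assumption."""
    passes_js = js_users is not None and js_users > 1
    return passes_js or has_gsc_data or is_approved


def processed_first_seen(js_first_seen: Optional[date],
                         gsc_first_seen: Optional[date]) -> Optional[date]:
    """Merge the two first-seen dates (sketch). Taking the earliest available
    date is an assumption about how the two columns are combined."""
    candidates = [d for d in (js_first_seen, gsc_first_seen) if d is not None]
    return min(candidates) if candidates else None
```

Under these assumptions, a keyword with js_users = 2 and no GSC history is kept, and its processed_first_seen falls back to js_first_seen.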

Stored in keywords_data_local but not read by processing.py:

  • js_clicks_per_search, js_clicks_per_search_organic, js_clicks_per_search_paid
  • js_searches, js_searches_with_clicks_paid_only, js_searches_with_clicks_organic_only, js_searches_with_clicks_organic_paid

These columns flow through to the upload step (ClickHouse keywords_metrics_local and Elasticsearch) and are surfaced to customers directly — they're not transformed by processing.py. The exact customer-facing surface is documented in Consumers (Phase 5).

See also

  • Central hub table: js_* columns
  • kvprocessor: PullJSData() / PullJSFirstSeenData()
  • Decisions — JS-presence influence on GKP all-zero acceptance, JS-vs-GSC volume blend, organic-% overrides
  • Consumers — which JS columns the upload step pushes to customer-facing stores (Phase 5)
  • Column-level reference — TBD: ../fields/