Processes

Programs and scripts that move data through the pipeline. One page per process.

Status: Phase 2 placeholder

Detailed pages will be added in Phase 2.

Planned entries:

  • kvprocessor.cpp — Stage 1 ingestion (pulls raw data from remote CH hosts into keywords.keywords_data_local).
  • processing.py — Stage 2 derivation (computes all processed_* columns). This is the heart of the pipeline.
  • pe_update.py — Stage 3 PE tracking selection.
  • upload_backend.sh + Slurm jobs — Stage 4 distribution to keywords_metrics_local and Elasticsearch.
  • extract_gt.py / extract_gkp.py — refresh extraction (independent cadence, signal-driven).
  • keep_running.sh — top-level orchestrator that chains stages 1-4 once per hour inside a screen session.
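
Until the detailed pages exist, the stage ordering listed above can be illustrated with a minimal Python sketch. It is not the contents of keep_running.sh. The command lines in `STAGES`, the stop-on-first-failure behaviour, and the names `run_command`, `run_cycle`, and `main` are all assumptions made for illustration; the real invocations, arguments, and failure handling are not documented on this page.

```python
import subprocess
import time
from typing import Callable, Sequence

# Hypothetical command lines: the real invocations and arguments are not
# documented on this page. Only the stage order comes from the list above.
STAGES: list[tuple[str, list[str]]] = [
    ("stage 1 ingestion", ["./kvprocessor"]),
    ("stage 2 derivation", ["python3", "processing.py"]),
    ("stage 3 PE tracking selection", ["python3", "pe_update.py"]),
    ("stage 4 distribution", ["bash", "upload_backend.sh"]),
]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_cycle(
    stages: Sequence[tuple[str, Sequence[str]]],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> list[str]:
    """Run the stages in order and return the names of those that succeeded.

    Assumption: a failing stage aborts the rest of the cycle, because each
    later stage consumes the output of the one before it.
    """
    completed: list[str] = []
    for name, cmd in stages:
        if runner(cmd) != 0:
            break
        completed.append(name)
    return completed


def main(interval_seconds: float = 3600.0) -> None:
    """Start one cycle per interval, forever (hourly by default)."""
    while True:
        started = time.monotonic()
        run_cycle(STAGES)
        elapsed = time.monotonic() - started
        # Sleep only for the remainder of the interval, never a negative time.
        time.sleep(max(0.0, interval_seconds - elapsed))
```

Calling `main()` would start the loop. `run_cycle` accepts a replacement `runner`, so the ordering logic can be exercised without launching any of the real programs.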