Intermittent / sparse demand methods
Forecasters built for series with many zero-months and a few large bursts, which is the regime our _is_spiky_series heuristic (processing.py:1180, processing.py:1310) currently suppresses. Instead of suppressing such series, these methods forecast an expected per-period rate for them.
Demand classification (Syntetos-Boylan-Croston quadrant) is covered as its own entry at the bottom; read it first if you are picking between Croston, SBA, TSB, ADIDA, and IMAPA. Briefly: classify by ADI (average demand interval, in periods between non-zero observations) and CV² (squared coefficient of variation of non-zero demand sizes). The four quadrants are smooth (low ADI, low CV²: use SES/ETS), intermittent (high ADI, low CV²: use Croston/SBA), erratic (low ADI, high CV²: frequent demand with highly variable sizes), and lumpy (high ADI, high CV²: the worst case; try TSB or aggregation methods).
title: Croston's method
tags: [intermittent, smoothing, baseline]
applies_to: [tier_2, tier_3]
data_needs: "≥1 non-zero observation; degrades gracefully with many zeros; no seasonal handling"
status: candidate
Croston's method
Source: Croston, J.D. (1972), "Forecasting and Stock Control for Intermittent Demands," Operational Research Quarterly 23(3):289-303
Link: https://link.springer.com/article/10.1057/jors.1972.50
Retrieved: 2026-05-15
What it is: Separates an intermittent series into two parts and smooths each with simple exponential smoothing (SES). The demand-size SES updates only in periods with a non-zero observation, smoothing the magnitude. The inter-arrival SES smooths the gap (in periods) between non-zero observations. The forecast per period is (smoothed demand size) / (smoothed inter-arrival interval), i.e., the expected per-period rate. Output is a flat horizon (the same value for every future month), but it is a per-month expectation, slightly upward-biased as noted under SBA, rather than a noisy MoM extrapolation off a zero.
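The two-component update described above can be sketched in dependency-free Python. This is an illustration under stated assumptions, not the statsforecast implementation: the function name `croston_rate` is hypothetical, both components are initialised at the first non-zero observation, and the α = β = 0.1 defaults are simply the textbook values.

```python
def croston_rate(y, alpha=0.1, beta=0.1):
    """Per-period demand rate from a Croston-style update (illustrative sketch)."""
    size = None      # smoothed non-zero demand size
    interval = None  # smoothed gap (in periods) between non-zero observations
    gap = 1          # periods elapsed since the last non-zero observation
    for value in y:
        if value > 0:
            if size is None:
                # Assumption: initialise both components at the first demand.
                size, interval = float(value), float(gap)
            else:
                size += alpha * (value - size)
                interval += beta * (gap - interval)
            gap = 1
        else:
            gap += 1  # zero periods only lengthen the current gap
    if size is None:
        return 0.0  # no demand observed at all
    return size / interval


# A demand of 6 every third month gives a flat rate of 2 per month.
history = [0, 0, 6, 0, 0, 6, 0, 0, 6]
rate = croston_rate(history)
forecast = [rate] * 3  # flat horizon: same value for every future month
```

Note that zero periods never change `size` or `interval` directly; they only feed into the next gap, which is why the forecast does not move during a run of zeros.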
When to use:
- Series with ADI between 1.32 and ~3 and modest demand-size variability (intermittent quadrant).
- As a first-cut baseline before reaching for SBA / TSB / aggregation methods.
Fit for our model:
- ✅ Direct replacement for the suppression logic at processing.py:1180 (_is_spiky_series) and processing.py:1310 — instead of dropping growth tags for sparse keywords, we'd report a Croston rate as the level.
- ✅ Already in our statsforecast stack as CrostonClassic — zero new dependencies. Could be added to the ensemble at processing.py:1984 with a gate that selects it on series classified as intermittent.
- ⚠ Known positive bias on lumpy data — see SBA for the correction.
- ⚠ Flat-line forecast — no trend, no seasonality. Pair with seasonal naive for the shape if the keyword has a recurring annual peak.
- 🔧 from statsforecast.models import CrostonClassic. Use demand classification below to gate which series get Croston vs. Holt-Winters.
title: SBA (Syntetos-Boylan Approximation)
tags: [intermittent, smoothing, bias-corrected]
applies_to: [tier_2, tier_3]
data_needs: "same as Croston (≥1 non-zero observation)"
status: candidate
SBA (Syntetos-Boylan Approximation)
Source: Syntetos & Boylan (2001, 2005), "The accuracy of intermittent demand estimates," International Journal of Forecasting 21(2):303-314
Link: https://www.sciencedirect.com/science/article/abs/pii/S0169207004000792
Retrieved: 2026-05-15
What it is: A bias-corrected version of Croston. Croston's expected demand has a positive bias when the inter-arrival smoothing parameter β > 0; SBA multiplies Croston's output by (1 − β/2) to deflate it. The fix is small but consistently reduces forecast error on real intermittent series, and is the recommended default over plain Croston in most empirical studies.
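As a small worked example of the deflation factor, the sketch below applies (1 − β/2) to a Croston ratio. The function name `sba_rate` and the input values are hypothetical; the smoothed size and interval are assumed to come from a Croston-style fit.

```python
def sba_rate(smoothed_size, smoothed_interval, beta=0.1):
    """Deflate Croston's ratio by (1 - beta / 2), where beta smooths the intervals."""
    croston = smoothed_size / smoothed_interval
    return (1.0 - beta / 2.0) * croston


# Smoothed demand size 6 and smoothed interval 3, with beta = 0.1:
croston = 6.0 / 3.0        # 2.0 per period
sba = sba_rate(6.0, 3.0)   # 0.95 * 2.0, i.e. about 1.9 per period
```

With the textbook β = 0.1 the correction is a 5% reduction; a larger β means a larger deflation.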
When to use:
- Whenever you would use Croston: SBA is generally the better choice unless you specifically want to over-estimate (e.g., a safety-stock context, which is not us).
- Particularly important for lumpy series (high CV², high ADI) where the Croston bias is largest.
Fit for our model:
- ✅ Preferred default over Croston for the sparse-keyword path that currently triggers _is_spiky_series at processing.py:1180.
- ✅ Same statsforecast API — replace CrostonClassic with CrostonSBA in the ensemble at processing.py:1984.
- ⚠ Still a flat-line forecast — does not address seasonality or trend, only the level bias.
- ⚠ If you want auto-tuned smoothing constants instead of the textbook α=β=0.1, use CrostonOptimized; note that it is plain Croston, so the SBA deflation is not applied.
- 🔧 from statsforecast.models import CrostonSBA — see Nixtla docs.
title: ADIDA (Aggregate-Disaggregate Intermittent Demand Approach)
tags: [intermittent, temporal-aggregation, single-level]
applies_to: [tier_2, tier_3]
data_needs: "enough history that an aggregated bucket (e.g., quarterly) has ≥4 observations"
status: candidate
ADIDA (Aggregate-Disaggregate Intermittent Demand Approach)
Source: Nikolopoulos, Syntetos, Boylan, Petropoulos, Assimakopoulos (2011), "An Aggregate-Disaggregate Intermittent Demand Approach (ADIDA) to forecasting," Journal of the Operational Research Society 62(3):544-554
Link: https://doi.org/10.1057/jors.2010.32
Retrieved: 2026-05-15
What it is: Temporal aggregation as a noise filter. Bucket the series at a coarser frequency (e.g., quarterly = 3 monthly periods, or aggregation level set to the mean inter-arrival interval), which collapses many zeros into non-zero buckets and lets a standard forecaster (SES, ETS) work. Forecast at the aggregated level, then disaggregate the forecast back to the original monthly frequency using equal weights (1/N per period in the bucket).
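The aggregate, forecast, disaggregate steps above can be sketched as follows. This is a minimal illustration, not the statsforecast implementation: the names `ses_level` and `adida_forecast` are hypothetical, the bucket size is passed in rather than derived from the mean inter-arrival interval, SES is initialised at the first bucket, and the oldest periods are dropped so that only complete buckets are used.

```python
def ses_level(values, alpha=0.1):
    """Final level of simple exponential smoothing, initialised at the first value."""
    level = float(values[0])
    for v in values[1:]:
        level += alpha * (v - level)
    return level


def adida_forecast(y, bucket=3, alpha=0.1, horizon=3):
    """Aggregate into non-overlapping buckets, smooth, then split equally."""
    # Assumption: drop the oldest periods so the newest data fills whole buckets.
    start = len(y) % bucket
    trimmed = y[start:]
    if not trimmed:
        raise ValueError("need at least one full bucket of history")
    sums = [sum(trimmed[i:i + bucket]) for i in range(0, len(trimmed), bucket)]
    bucket_forecast = ses_level(sums, alpha)
    per_period = bucket_forecast / bucket  # equal weights: 1/N per period
    return [per_period] * horizon


# One demand of 3 somewhere in each quarter: every quarterly bucket sums to 3.
monthly = [0, 0, 3, 0, 3, 0, 3, 0, 0, 0, 0, 3]
forecast = adida_forecast(monthly, bucket=3)
```

The monthly series is two-thirds zeros, but the quarterly buckets are a constant 3, so the smoother sees a well-behaved series and the monthly forecast comes out at 1 per month.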
When to use:
- Sparse monthly series where the quarterly rate is well-behaved, e.g., a keyword that gets ~12 impressions per year scattered randomly across months.
- When you want to use a familiar smooth-data forecaster (SES, ETS) but the raw series is too zero-heavy.
Fit for our model:
- ✅ Useful for sparse Tier 3 keywords where the rolling-growth at processing.py:1169 is dragged around by zero-months — aggregate to quarterly, fit cleanly, disaggregate.
- ✅ Quarterly aggregation aligns naturally with our annual seasonality — could improve the detect_seasonality() ACF (processing.py:1130) on noisy monthly data.
- ⚠ Equal-weight disaggregation loses any intra-quarter pattern — if the keyword has a "spike in week 4 of each quarter," ADIDA flattens it.
- ⚠ Aggregation level is a hyperparameter — see IMAPA for the multi-level version that side-steps the choice.
- 🔧 from statsforecast.models import ADIDA — see Nixtla docs.
title: IMAPA (Intermittent Multiple Aggregation Prediction Algorithm)
tags: [intermittent, temporal-aggregation, multi-level, ensemble]
applies_to: [tier_2, tier_3]
data_needs: "enough history for several aggregation levels (e.g., 24 monthly = 8 quarterly = 2 yearly)"
status: candidate
IMAPA (Intermittent Multiple Aggregation Prediction Algorithm)
Source: Petropoulos & Kourentzes (2015), "Forecast combinations for intermittent demand," Journal of the Operational Research Society 66(6):914-924
Link: https://doi.org/10.1057/jors.2014.62
Retrieved: 2026-05-15
What it is: ADIDA at several aggregation levels simultaneously (e.g., monthly, bi-monthly, quarterly, semi-annual), each with SES, and the per-period forecasts are averaged. Multi-level aggregation captures both short-term (low-aggregation) and long-term (high-aggregation) dynamics, and the averaging is a poor man's ensemble that hedges against the wrong choice of aggregation level in ADIDA.
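A minimal sketch of the multi-level averaging, under stated assumptions: the helper names are hypothetical, the levels (1, 2, 3, 6) stand for the monthly, bi-monthly, quarterly, and semi-annual example above, each level uses fixed-α SES, and levels longer than the history are simply skipped. It is not a reproduction of the statsforecast implementation.

```python
def ses_level(values, alpha=0.1):
    """Final level of simple exponential smoothing, initialised at the first value."""
    level = float(values[0])
    for v in values[1:]:
        level += alpha * (v - level)
    return level


def rate_at_level(y, bucket, alpha=0.1):
    """ADIDA-style per-period rate at one aggregation level."""
    start = len(y) % bucket  # drop the oldest periods to keep whole buckets
    trimmed = y[start:]
    sums = [sum(trimmed[i:i + bucket]) for i in range(0, len(trimmed), bucket)]
    return ses_level(sums, alpha) / bucket


def imapa_forecast(y, levels=(1, 2, 3, 6), alpha=0.1, horizon=3):
    """Average the per-period rates obtained at several aggregation levels."""
    usable = [k for k in levels if k <= len(y)]  # assumption: skip oversized levels
    if not usable:
        raise ValueError("history is shorter than every aggregation level")
    rates = [rate_at_level(y, k, alpha) for k in usable]
    combined = sum(rates) / len(rates)
    return [combined] * horizon


forecast = imapa_forecast([0, 0, 3, 0, 3, 0, 3, 0, 0, 0, 0, 3])
```

Because the result is a plain average, it always lies between the lowest and highest single-level rate, which is the hedging effect described above.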
When to use:
- When you would use ADIDA but cannot confidently pick the right aggregation level.
- For lumpy series where different scales reveal different signals.
Fit for our model:
- ✅ Stronger drop-in than ADIDA for the same processing.py:1180 (sparse / spiky) gate.
- ✅ Still computationally cheap: statsforecast runs all aggregation levels at compiled-code speed.
- ⚠ Like ADIDA, equal-weight disaggregation loses sub-bucket structure.
- ⚠ With ≤12 monthly observations, the higher aggregation levels collapse to one or two points — falls back to ADIDA-at-monthly.
- 🔧 from statsforecast.models import IMAPA — see Nixtla docs.
title: TSB (Teunter-Syntetos-Babai)
tags: [intermittent, obsolescence, probability-update]
applies_to: [tier_2, tier_3]
data_needs: "≥1 non-zero observation; especially valuable when demand may stop"
status: candidate
TSB (Teunter-Syntetos-Babai)
Source: Teunter, Syntetos, Babai (2011), "Intermittent demand: Linking forecasting to inventory obsolescence," European Journal of Operational Research 214(3):606-615
Link: https://www.sciencedirect.com/science/article/abs/pii/S0377221711003985
Retrieved: 2026-05-15
What it is: A modification of Croston that updates a demand probability (probability that a given period is non-zero) every period, instead of updating the inter-arrival interval only on non-zero periods. The key consequence: if demand stops, TSB decays the forecast toward zero, whereas Croston/SBA keep predicting the historical rate indefinitely. Designed for obsolescence (a product is discontinued), but the same dynamic applies to keywords that genuinely die.
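The difference from Croston is easiest to see in code. The sketch below is illustrative only: the name `tsb_rate` and its parameter names are hypothetical, the demand probability is initialised from the first period (1 if it had demand, else 0), and the demand size is initialised at the first non-zero observation.

```python
def tsb_rate(y, alpha_d=0.1, alpha_p=0.1):
    """Per-period rate = smoothed demand probability * smoothed demand size."""
    if not y:
        return 0.0
    # Assumption: initialise from the first period.
    prob = 1.0 if y[0] > 0 else 0.0
    size = float(y[0]) if y[0] > 0 else None
    for value in y[1:]:
        occurred = 1.0 if value > 0 else 0.0
        prob += alpha_p * (occurred - prob)  # updated every period, zeros included
        if value > 0:
            # Size is updated only when demand actually occurs.
            size = float(value) if size is None else size + alpha_d * (value - size)
    return 0.0 if size is None else prob * size


# Four months of steady demand, then ten months of silence.
active = [4, 4, 4, 4]
dying = active + [0] * 10
rate_active = tsb_rate(active)  # 4.0
rate_dying = tsb_rate(dying)    # 4 * 0.9**10, roughly 1.39
```

A Croston-style update would still report the pre-silence rate for `dying`, because zeros never trigger an update there; here each zero period multiplies the probability by (1 − alpha_p).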
When to use:
- Keywords that might be in decline / dying: TSB will fade the forecast as the run of zeros grows.
- Lumpy series where the demand probability is meaningful (e.g., quarterly events with no demand outside the window).
- When you specifically want a forecast that responds to the absence of recent activity, not just to non-zero updates.
Fit for our model:
- ✅ Directly addresses a known failure mode at processing.py:1654-1735 and processing.py:1763 — keywords whose signal is genuinely going to zero. TSB encodes the decline rather than relying on the hardcoded 0.85^h × historical_min floor at processing.py:2100.
- ✅ Same statsforecast API as the other Croston variants; cheap to add to the ensemble at processing.py:1984.
- ⚠ Probability decay is symmetric to growth — TSB will also "wake up" too slowly if demand restarts after a long zero run.
- ⚠ Best paired with a change-point detector so an actual structural change isn't misread as obsolescence.
- 🔧 from statsforecast.models import TSB — see Nixtla docs.
title: Croston Optimized
tags: [intermittent, auto-tuned, smoothing]
applies_to: [tier_2, tier_3]
data_needs: "same as Croston; benefits from ≥24 observations for smoothing-parameter optimization"
status: candidate
Croston Optimized
Source: Nixtla statsforecast (extension of Croston 1972 / SBA 2005 with parameter optimization)
Link: https://nixtlaverse.nixtla.io/statsforecast/docs/models/crostonoptimized.html
Retrieved: 2026-05-15
What it is: Plain Croston, but with the smoothing parameters α (demand-size SES) and β (inter-arrival SES) chosen by numerical optimization to minimize in-sample MSE, instead of being fixed at 0.1. On longer series this can materially beat textbook Croston/SBA; on very short series the optimization overfits and you should prefer fixed-parameter SBA.
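To make "chosen to minimize in-sample MSE" concrete, here is a sketch that uses a small grid search as a stand-in for a numerical optimizer. Everything here is an assumption for illustration: the function names, the candidate grid (0.1, 0.2, 0.3), and the choice to score one-step-ahead forecasts only after the first demand. It does not reproduce how statsforecast fits CrostonOptimized.

```python
from itertools import product


def croston_in_sample_mse(y, alpha, beta):
    """MSE of one-step-ahead Croston forecasts, scored after the first demand."""
    size = interval = None
    gap = 1
    errors = []
    for value in y:
        if size is not None:
            # Forecast for this period uses only information from earlier periods.
            errors.append((value - size / interval) ** 2)
        if value > 0:
            if size is None:
                size, interval = float(value), float(gap)
            else:
                size += alpha * (value - size)
                interval += beta * (gap - interval)
            gap = 1
        else:
            gap += 1
    return sum(errors) / len(errors) if errors else float("inf")


def fit_croston_grid(y, grid=(0.1, 0.2, 0.3)):
    """Pick (alpha, beta) from a small hypothetical grid by in-sample MSE."""
    return min(product(grid, grid), key=lambda ab: croston_in_sample_mse(y, *ab))


# Demand size steps up from 2 to 10 halfway through the history.
history = [0, 2, 0, 2, 0, 2, 0, 10, 0, 10, 0, 10]
best_alpha, best_beta = fit_croston_grid(history)
```

On a short history the MSE surface is estimated from only a handful of errors, which is the overfitting risk mentioned above.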
When to use:
- Tier 3 sparse keywords with ≥24 months, which is enough data for the optimizer to find sensible α/β.
- When you have audit evidence that the fixed-α=0.1 Croston/SBA is too sluggish or too jumpy on a particular cohort.
Fit for our model:
- ✅ Tier-3 drop-in alongside SBA for sparse keywords at processing.py:1180.
- ⚠ Slower per-keyword than fixed-parameter variants — but still cheap relative to ETS/AutoCES. Profile before adding to the ensemble at processing.py:1984.
- ⚠ Can overfit on Tier 1/2 — gate by history length.
- 🔧 from statsforecast.models import CrostonOptimized. Pair with the classification quadrant below to pick between this and SBA.
title: Demand classification (Syntetos-Boylan-Croston quadrant)
tags: [intermittent, classification, method-selection]
applies_to: [tier_1, tier_2, tier_3]
data_needs: "≥12 observations for stable ADI/CV² estimates; can be approximated with fewer"
status: candidate
Demand classification (Syntetos-Boylan-Croston quadrant)
Source: Syntetos, Boylan & Croston (2005), "On the categorization of demand patterns," Journal of the Operational Research Society 56(5):495-503
Link: https://www.jstor.org/stable/4101868
Retrieved: 2026-05-15
What it is: A simple two-axis classification for picking an intermittent-demand forecaster. Compute ADI (average demand interval = average number of periods between non-zero observations; ADI=1 means every period is non-zero) and CV² (squared coefficient of variation of the non-zero demand sizes). Cut at ADI=1.32 and CV²=0.49 to get four quadrants:
| Quadrant | ADI | CV² | Recommended method |
|---|---|---|---|
| Smooth | <1.32 | <0.49 | SES / ETS / Holt-Winters |
| Erratic | <1.32 | ≥0.49 | SES with robust loss; or aggregate then ETS |
| Intermittent | ≥1.32 | <0.49 | Croston / SBA |
| Lumpy | ≥1.32 | ≥0.49 | SBA / TSB / IMAPA |
When to use:
- As the router in front of the forecasting ensemble: pick a model per keyword based on its ADI/CV² class.
- As a diagnostic: keywords in the lumpy quadrant should never feed an unfiltered Holt-Winters fit.
Fit for our model:
- ✅ Drop-in replacement for the binary _is_spiky_series heuristic at processing.py:1180 and processing.py:1310. Instead of one threshold that suppresses, the classification chooses between four buckets each with its own forecaster.
- ✅ ADI and CV² are O(n) to compute — cheap to add per-keyword inside calculate_hybrid_growth() (processing.py:1247).
- ✅ The smooth/erratic distinction adds nothing new for our existing AutoCES + Holt-Winters ensemble at processing.py:1984 — they would handle both. The classification is most useful for the intermittent/lumpy quadrants, which today get the suppression treatment.
- ⚠ Cut-offs (1.32, 0.49) are empirically derived for inventory/retail data; may want to re-tune on our keyword cohorts before going to production.
- 🔧 Pure-NumPy implementation in <10 lines. Use the result to select among CrostonSBA, TSB, IMAPA, and the existing AutoCES/HW ensemble.
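A dependency-free sketch of the classifier follows; the same logic translates directly to NumPy. The function name `classify_demand` is hypothetical, ADI is approximated as (number of periods) / (number of non-zero periods), and CV² uses the population variance of the non-zero sizes. Both are assumptions that may need adjusting to match whichever convention the pipeline adopts.

```python
def classify_demand(y, adi_cut=1.32, cv2_cut=0.49):
    """Return (quadrant, ADI, CV²) for a demand series."""
    nonzero = [v for v in y if v > 0]
    if not nonzero:
        raise ValueError("series has no non-zero observations")
    adi = len(y) / len(nonzero)  # approximate average demand interval
    mean = sum(nonzero) / len(nonzero)
    variance = sum((v - mean) ** 2 for v in nonzero) / len(nonzero)
    cv2 = variance / mean ** 2   # squared coefficient of variation of sizes
    if adi < adi_cut:
        label = "smooth" if cv2 < cv2_cut else "erratic"
    else:
        label = "intermittent" if cv2 < cv2_cut else "lumpy"
    return label, adi, cv2


examples = {
    "steady": [5, 6, 5, 6, 5, 6],
    "every_third_month": [0, 0, 5, 0, 0, 5, 0, 0, 5],
    "rare_and_uneven": [0, 0, 1, 0, 0, 20, 0, 0, 2],
    "frequent_and_uneven": [1, 20, 2, 1, 30, 1],
}
labels = {name: classify_demand(series)[0] for name, series in examples.items()}
```

The returned label can then key a lookup from quadrant to forecaster, with the cut-offs left as parameters so they can be re-tuned on keyword cohorts.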