Primacy of feature engineering over architectural complexity for intermittent demand forecasting



Nathan B. S., Aravinth P., Reddy B. V. S., Sastry C. C., SALUNKHE S. S., Cep R.

Scientific Reports, vol.16, no.1, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 16 Issue: 1
  • Publication Date: 2026
  • DOI: 10.1038/s41598-026-35197-y
  • Journal Name: Scientific Reports
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, Chemical Abstracts Core, MEDLINE, Directory of Open Access Journals
  • Keywords: Feature engineering, Intermittent demand forecasting, Machine learning for supply chains, Sparse time series, Statistical-ML hybrid models
  • Gazi University Affiliated: Yes

Abstract

Intermittent demand forecasting remains a fundamental challenge in large-scale supply chains due to extreme demand sparsity, irregular occurrence patterns, and highly variable demand magnitudes. While recent studies have increasingly adopted complex multi-stage model architectures to address these challenges, the role of statistically grounded feature engineering has received comparatively less attention. This study proposes the Smoothed Hybrid Occurrence-Size (SHOS) framework, which generates adaptive, series-specific estimates of demand occurrence probability and conditional demand size using sparsity-aware exponential smoothing. These estimates are incorporated as features into supervised machine learning models trained on large-scale, zero-padded panel data. The proposed approach is evaluated on an automotive aftermarket dataset comprising approximately 1.4 million monthly observations across 56,000 spare-part time series, using an 11-fold rolling-window cross-validation protocol. Empirical results demonstrate that SHOS-enhanced models achieve substantial performance improvements over baseline feature sets, reducing mean absolute error (MAE) by approximately 50% and weighted mean absolute percentage error (WMAPE) by over 40% in highly intermittent demand segments. Notably, despite their increased architectural complexity, two-stage hurdle-based models do not outperform the proposed single-stage SHOS-enhanced framework. Formal statistical testing using the Wilcoxon signed-rank test confirms that the performance advantage of the single-stage SHOS model is consistent and statistically significant across all validation folds (p < 0.001). These findings reveal an unexpected but practically important insight: robust, statistically informed feature engineering can be more effective than increased model complexity for intermittent demand forecasting. 
The results highlight the value of simple, interpretable, and computationally efficient forecasting frameworks for large-scale operational deployment, while motivating future validation across additional application domains.
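
The abstract does not give the SHOS update equations, so the following is only a minimal sketch of the general idea it describes: smoothing demand occurrence and conditional demand size separately and exposing both as per-period features. It assumes a Croston/TSB-style update with fixed smoothing constants, which is a simplification and not the paper's adaptive, sparsity-aware scheme. The function names `occurrence_size_features` and `wmape`, the parameters `alpha` and `beta`, and the zero initialisation are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def occurrence_size_features(
    demand: Sequence[float], alpha: float = 0.1, beta: float = 0.1
) -> List[Tuple[float, float]]:
    """Return one (p_hat, z_hat) pair per period for a single series.

    p_hat is a smoothed probability that demand occurs; z_hat is a smoothed
    demand size given that demand occurs. Each pair is recorded BEFORE the
    period's demand is observed, so it can be used as a leakage-free feature.
    Assumption: fixed alpha/beta and zero initial state (not the paper's SHOS).
    """
    p_hat = 0.0          # smoothed occurrence probability
    z_hat = 0.0          # smoothed conditional demand size
    seen_demand = False  # z_hat is seeded by the first non-zero demand
    features: List[Tuple[float, float]] = []
    for d in demand:
        features.append((p_hat, z_hat))
        occurred = 1.0 if d > 0 else 0.0
        # Occurrence is updated every period, including zero-demand periods.
        p_hat += alpha * (occurred - p_hat)
        # Size is updated only when demand actually occurs.
        if d > 0:
            if seen_demand:
                z_hat += beta * (d - z_hat)
            else:
                z_hat = float(d)
                seen_demand = True
    return features


def wmape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Weighted MAPE: sum of absolute errors divided by sum of absolute actuals."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WMAPE is undefined when all actual values are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total


if __name__ == "__main__":
    series = [0, 4, 0, 0, 2]  # toy monthly demand for one spare part
    feats = occurrence_size_features(series, alpha=0.5, beta=0.5)
    naive_forecast = [p * z for p, z in feats]  # expected demand = p_hat * z_hat
    print(feats)
    print(wmape(series, naive_forecast))
```

In a setup like the one described above, the two values per period would be appended as columns to the zero-padded panel and passed to a supervised learner together with the baseline features; the product `p_hat * z_hat` is shown here only as a quick sanity-check forecast.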