Real-Time Target Classification and Kinematic Estimation from High-Frequency SPAD Sensor Data Using Transformation-Based Models: A Simulation-Based Proof-of-Concept


Cakir E., AYTURAN K., KUTBAY U.

APPLIED SCIENCES-BASEL, vol.16, no.10, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 16 Issue: 10
  • Publication Date: 2026
  • DOI: 10.3390/app16104975
  • Journal Name: APPLIED SCIENCES-BASEL
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Applied Science & Technology Source, Compendex, INSPEC, Directory of Open Access Journals
  • Gazi University Affiliated: Yes

Abstract

Real-time tracking of high-speed targets in autonomous systems requires detection and decision-making pipelines that can operate within sub-millisecond time budgets. Single-Photon Avalanche Diode (SPAD) sensors are well suited to this task, offering 10 kHz Time-of-Flight (ToF) measurements with picosecond timing precision. However, processing such high-frequency time-series data with conventional deep learning models introduces computational bottlenecks that are difficult to handle on resource-constrained embedded hardware. This paper presents an ultra-lightweight, dual-head architecture built on the MiniRocket transformation algorithm, in which a single shared feature extractor simultaneously feeds two independent decision pathways: one for multi-class target classification and one for 3-parameter kinematic regression covering velocity, pitch, and yaw. As a single-pixel sensor, the device provides only 1D range information; lateral 3D spatial localization is outside the scope of this work. To the best of the authors' knowledge, this is the first application of MiniRocket to continuous kinematic estimation from high-frequency sensor data. Since collecting labeled physical flight data at these speeds is largely infeasible, a physics-based ray-casting simulation was developed to generate a 55,440-sample dataset across four 3D CAD target models under varying speed (100-450 m/s), orientation, and noise conditions. The proposed architecture achieves 98.6% classification accuracy and a velocity Mean Absolute Error (MAE) of 0.26 m/s, with orientation estimation yielding a pitch MAE of 3.47 degrees and a yaw MAE of 2.46 degrees. These values are consistent across all five cross-validation folds, indicating that the orientation performance floor is governed by the sensor's physical angular resolution rather than by model capacity.
With approximately 27,000 trainable parameters, the system completes full dual-task inference in 0.56 ms on a 16-core CPU (1785 frames per second, FPS), satisfying the 1 ms real-time constraint of a 10 kHz sensor without GPU acceleration. Because the single-pixel SPAD architecture provides only 1D range-along-beam information, full 3D spatial localization cannot be extracted from a single sensor and is not addressed in this study.
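
The dual-head idea described in the abstract, one shared feature extractor feeding a classification head and a 3-parameter regression head, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It borrows only the publicly documented MiniRocket ingredients (length-9 kernels with weights in {-1, 2}, dilation, and proportion-of-positive-values pooling). The random biases, the absence of padding, the untrained linear heads, the synthetic range trace, and every function name below are assumptions made purely for illustration.

```python
import itertools
import random

KERNEL_LENGTH = 9  # MiniRocket kernels have length 9 with weights in {-1, 2}


def make_kernels(dilations, rng):
    """Build (weights, dilation, bias) triples.

    Assumption: biases are random placeholders; MiniRocket itself derives
    them from quantiles of the convolution output on training data.
    """
    kernels = []
    # Three weights of 2 and six of -1, so each kernel's weights sum to zero.
    for twos in itertools.combinations(range(KERNEL_LENGTH), 3):
        weights = [2.0 if i in twos else -1.0 for i in range(KERNEL_LENGTH)]
        for dilation in dilations:
            kernels.append((weights, dilation, rng.uniform(-1.0, 1.0)))
    return kernels


def shared_features(series, kernels):
    """One PPV (proportion of positive values) feature per kernel, no padding."""
    features = []
    for weights, dilation, bias in kernels:
        span = (KERNEL_LENGTH - 1) * dilation
        n_out = len(series) - span
        positives = 0
        for start in range(n_out):
            response = sum(
                w * series[start + k * dilation] for k, w in enumerate(weights)
            )
            positives += response > bias
        features.append(positives / n_out)
    return features


def linear_head(features, weight_rows, biases):
    """Hypothetical linear head: one output per weight row."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weight_rows, biases)
    ]


def random_head(n_outputs, n_features, rng):
    """Untrained placeholder weights; a real system would fit these to data."""
    rows = [[rng.gauss(0.0, 0.1) for _ in range(n_features)] for _ in range(n_outputs)]
    return rows, [0.0] * n_outputs


def dual_head_predict(series, kernels, cls_head, reg_head):
    """Extract features once, then feed both decision pathways."""
    features = shared_features(series, kernels)
    scores = linear_head(features, *cls_head)
    target_class = max(range(len(scores)), key=scores.__getitem__)
    velocity, pitch, yaw = linear_head(features, *reg_head)
    return target_class, (velocity, pitch, yaw)


rng = random.Random(0)
kernels = make_kernels(dilations=(1, 2, 4), rng=rng)
cls_head = random_head(4, len(kernels), rng)  # four target classes
reg_head = random_head(3, len(kernels), rng)  # velocity, pitch, yaw
# Synthetic 64-sample range trace (assumed shape, not simulator output).
range_trace = [100.0 - 0.04 * t + rng.gauss(0.0, 0.05) for t in range(64)]
features = shared_features(range_trace, kernels)
target_class, kinematics = dual_head_predict(range_trace, kernels, cls_head, reg_head)
```

Because the heads here are untrained, the predicted class and kinematic values carry no meaning. The sketch only shows that the comparatively expensive transformation runs once per input window and that both heads reuse its output.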