Sound Source Localization: A Survey


Tombul Çalışkan H., Karacan H.

International Journal on Advanced Technology, Engineering, and Information System (IJATEIS), vol. 4, no. 2, pp. 295-314, 2025 (Peer-Reviewed Journal)

Abstract

In modern defense systems, there is a growing demand for technologies that detect and track threats without revealing the observer’s position. Sound Source Localization (SSL) fulfills this requirement by passively estimating the position of sound-emitting targets using spatially distributed microphone arrays. Unlike active sensing systems, SSL operates solely on incoming acoustic signals, extracting location information from time delays, amplitude differences, or phase shifts between microphones. This survey provides a structured review of recent studies, covering both classical SSL methods (e.g., TDOA, GCC-PHAT) and artificial intelligence (AI)-based models (e.g., CNNs, RNNs). Classical techniques offer low computational complexity and reliable spatial resolution under ideal conditions but often degrade in noisy or reverberant environments. In contrast, AI-based approaches exhibit higher adaptability and robustness to environmental variability, though they require substantial labeled data and computational resources. Moreover, the performance of SSL systems is closely tied to microphone array geometry: while linear arrays are simple and widely used, circular, spherical, and irregular configurations provide better angular coverage and enable 3D localization. The review concludes that SSL performance is highly application-dependent and that no single method is universally superior. Hybrid approaches that combine signal processing with machine learning, as well as adaptive array designs, emerge as promising directions for improving SSL accuracy, robustness, and scalability in real-world scenarios. The comparative analysis also underscores that optimal SSL design hinges on a trade-off among algorithmic complexity, environmental conditions, and array geometry.
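
To make the time-delay idea in the abstract concrete, the following is a minimal sketch of TDOA-based direction estimation for a single two-microphone pair. It is not taken from the survey: the function names, the 343 m/s speed of sound, the 16 kHz sampling rate, and the 0.1 m microphone spacing are all illustrative assumptions. It uses plain time-domain cross-correlation and a far-field model; GCC-PHAT would additionally apply phase-transform weighting in the frequency domain, which is omitted here.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 degrees C


def estimate_delay_samples(ref, sig, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means `sig` arrives later than `ref`.
    Brute-force search over [-max_lag, max_lag]; hypothetical helper.
    """
    n = len(ref)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += ref[i] * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def tdoa_to_angle(delay_s, mic_spacing_m, c=SPEED_OF_SOUND):
    """Far-field arrival angle in degrees from broadside for one mic pair."""
    # Clamp so that noise-induced delays beyond d/c do not break asin.
    ratio = max(-1.0, min(1.0, c * delay_s / mic_spacing_m))
    return math.degrees(math.asin(ratio))


def simulate_pair(n, delay_samples, seed=0):
    """Synthetic test data: white noise and a copy shifted by delay_samples."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n)]
    delayed = [0.0] * n
    for i in range(n):
        j = i - delay_samples
        if 0 <= j < n:
            delayed[i] = source[j]
    return source, delayed


if __name__ == "__main__":
    fs = 16000.0   # assumed sampling rate in Hz
    spacing = 0.1  # assumed microphone spacing in metres
    ref, sig = simulate_pair(512, 3)
    lag = estimate_delay_samples(ref, sig, max_lag=8)
    print("estimated lag (samples):", lag)
    print("estimated angle (degrees):", tdoa_to_angle(lag / fs, spacing))
```

The sketch also hints at why array geometry matters: a single pair resolves only one angle, with front-back ambiguity, so 2D or 3D localization needs additional, non-collinear microphones.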