Abstract

Sound source localization (SSL) is an essential task in
many applications involving speech capture and enhancement. As
such, speaker localization with microphone arrays has received
significant research attention. Nevertheless, existing SSL algorithms
for small arrays still have two significant limitations: lack
of range resolution, and accuracy degradation with increasing
reverberation. The latter is natural and expected, given that strong
reflections can have amplitudes similar to that of the direct signal
but arrive from different directions. Therefore, correctly modeling
the room and compensating for the reflections should reduce
the degradation due to reverberation. In this paper, we show
a stronger result. If modeled correctly, early reflections can be
used to provide more information about the source location than
would have been available in an anechoic scenario. The modeling
not only compensates for the reverberation, but also significantly
increases resolution for range and elevation. Thus, we show that
under certain conditions and limitations, reverberation can be
used to improve SSL performance. Prior attempts to compensate
for reverberation tried to model the room impulse response (RIR).
However, RIRs change quickly with speaker position, and are
nearly impossible to track accurately. Instead, we build a 3-D
model of the room and use it to predict early reflections, which
are then incorporated into the SSL estimation. Simulation results
with real and synthetic data show that even a simplistic room
model is sufficient to produce significant improvements in range
and elevation estimation, tasks which would be very difficult when
relying only on direct path signal components.
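
To illustrate how a room model can predict early reflections, the following is a minimal sketch. It is not the method of this paper. It assumes a rectangular (box-shaped) room with perfectly reflecting walls and considers only first-order reflections, using the classical image-source construction. The function names, the room dimensions, and the speed-of-sound constant are all hypothetical choices made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def first_order_images(source, room):
    """Mirror the source across each of the six walls of a box room.

    The room is assumed to span [0, Lx] x [0, Ly] x [0, Lz],
    with room = (Lx, Ly, Lz). Returns six image-source positions.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            image = list(source)
            # Reflect the coordinate on this axis about the wall plane.
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def arrival_delays(source, mic, room):
    """Return the direct-path delay and the six first-order reflection delays, in seconds."""
    direct = math.dist(source, mic) / SPEED_OF_SOUND
    reflections = [
        math.dist(image, mic) / SPEED_OF_SOUND
        for image in first_order_images(source, room)
    ]
    return direct, reflections
```

Each image source behaves like an additional virtual source at a known position. Because the reflection delays depend on the source's range and height relative to the walls, floor, and ceiling, they carry information that the direct path alone does not, which is the intuition behind using reflections to improve range and elevation estimates.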
