May 6 - Poster session n. 1
Microphone Array Signal Processing
(chaired by Walter Kellermann - Univ. of Erlangen-Nuremberg, Germany)

1) Gerhard Doblinger
"Beamforming with Optimized Interpolated Microphone Arrays"
2) Fabian Kuech, Markus Kallinger, Richard Schultz-Amling, Giovanni Del Galdo, Jukka Ahonen and Ville Pulkki
"Directional Audio Coding using Planar Microphone Arrays"
3) Anthony Lombard, Walter Kellermann and Herbert Buchner
"A Real-time Demonstrator for the 2D Localization of Two Sound Sources using Blind Adaptive MIMO System Identification"
4) Markus Kallinger, Fabian Kuech, Richard Schultz-Amling, Giovanni Del Galdo, Jukka Ahonen and Ville Pulkki
"Enhanced Direction Estimation using Microphone Arrays for Directional Audio Coding"
5) Yusuke Hioka, Kazunori Kobayashi, Ken'ichi Furuya and Akitoshi Kataoka
"Enhancement of Sounds in a Specific Directional Area using Power Spectra estimated from Multiple Beamforming Outputs"
6) Tobias Wolff and Markus Buck
"Spatial Maximum A Posteriori Post-Filtering for Arbitrary Beamforming"
7) Heping Ding, Yijing Chu and Xiaojun Qiu
"Voice Separation using Ratchet FAP Algorithm"
8) Tao Yu and John Hansen
"Robust Auto-focusing Wideband Bayesian Beamforming"
9) Tetsuya Takiguchi, Ryoichi Takashima and Yasuo Ariki
"Active Microphone with Parabolic Reflection Board for Estimation of Sound Source Direction"
10) Alessio Brutti, Maurizio Omologo and Piergiorgio Svaizer
"Comparison between Different Sound Source Localization Techniques based on a Real Data Collection"
11) Seungil Kim, Gun-Ho Song, Hyejeong Jeon and Lag-Yong Kim
"Maximum Likelihood Detector of Reliable Direction-of-Arrival Estimate"
12) Maurice Fallon and Simon Godsill
"Multi Target Acoustic Source Tracking with an Unknown and Time Varying Number of Targets"
13) Albenzio Cirillo, Raffaele Parisi and Aurelio Uncini
"Prefiltering Techniques on Consistent Peak Selection for Talker Position Estimation in Reverberant Rooms"
14) Marian Kepesi, Lukas Ottowitz and Tania Habib
"Joint Position-Pitch Estimation for Multiple Speaker Scenarios"
15) Bowon Lee, Amir Said, Ton Kalker and Ronald Schafer
"Maximum Likelihood Time Delay Estimation with Phase Domain Analysis in the Generalized Cross Correlation Framework"


Poster session n. 1 - abstracts


Gerhard Doblinger "Beamforming with Optimized Interpolated Microphone Arrays"
We present an optimization procedure for wideband beamforming with interpolated arrays. We intend to design a beamformer with a compact size. In addition, we want to reduce the number of sensors while maintaining a good beamforming performance. Our beamformers are implemented using FFT filterbanks. Performance is tested under far-field conditions and under sound propagation with simulated room impulse responses. In addition, we study the influence of sensor noise on the beamforming behavior.
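For orientation only, a frequency-domain beamformer under the far-field assumption can be sketched as below. This is a plain delay-and-sum design for a linear array, not the optimized interpolated-array procedure of the paper; the function names, the array geometry and the speed of sound `c` are assumptions made for this example.

```python
import numpy as np

def steering_vectors(mic_positions, angle, freqs, c=343.0):
    """Far-field steering vectors for a linear array (hypothetical helper).

    mic_positions: (M,) sensor positions along the array axis in meters
    angle: direction in radians, measured from broadside
    freqs: (K,) frequencies in Hz (e.g. FFT bin centers)
    Returns a (K, M) complex array.
    """
    mic_positions = np.asarray(mic_positions, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    delays = mic_positions * np.sin(angle) / c  # per-sensor delay in seconds
    return np.exp(-2j * np.pi * freqs[:, None] * delays[None, :])

def delay_and_sum_weights(mic_positions, steer_angle, freqs, c=343.0):
    """Delay-and-sum weights: steering vector scaled by 1/M."""
    a = steering_vectors(mic_positions, steer_angle, freqs, c)
    return a / a.shape[1]

def beam_response(weights, mic_positions, angle, freqs, c=343.0):
    """Complex response w^H a of the beamformer to a plane wave from `angle`."""
    a = steering_vectors(mic_positions, angle, freqs, c)
    return np.sum(np.conj(weights) * a, axis=1)

# Assumed example geometry: 4 sensors spaced 5 cm apart, steered to broadside.
mics = np.arange(4) * 0.05
freqs = np.array([500.0, 1000.0, 2000.0])
w = delay_and_sum_weights(mics, 0.0, freqs)
on_target = np.abs(beam_response(w, mics, 0.0, freqs))
off_target = np.abs(beam_response(w, mics, np.deg2rad(60.0), freqs))
```

In this sketch the response in the steering direction has unit magnitude at every frequency, while the off-axis attenuation grows with frequency, which is the frequency dependence that wideband designs aim to control.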


Markus Kallinger, Fabian Kuech, Richard Schultz-Amling, Giovanni Del Galdo, Ville Pulkki and Jukka Ahonen "Directional Audio Coding using Planar Microphone Arrays"
Multichannel sound systems are becoming more and more established in modern audio applications. Consequently, the recording and the reproduction of spatial audio are gaining increasing attention. Directional Audio Coding (DirAC) represents an efficient approach to analyze spatial sound and to reproduce it using arbitrary loudspeaker configurations. In DirAC, the direction-of-arrival and the diffuseness of sound within frequency subbands are used to encode the spatial properties of the observed sound field. The estimation of these parameters is based on an energetic sound field analysis using three-dimensional microphone arrays. In practice, however, physical design constraints often make three-dimensional microphone configurations unacceptable. In this paper, we consider a new approach to microphone array processing that allows for an estimation of both direction-of-arrival of sound and diffuseness based on planar microphone configurations. The performance of the proposed method is evaluated via simulations and real measured data.


Anthony Lombard, Walter Kellermann and Herbert Buchner "A Real-time Demonstrator for the 2D Localization of Two Sound Sources using Blind Adaptive MIMO System Identification"
A real-time demonstrator for the 2D localization of two sound sources using two microphone pairs is presented and evaluated. The scheme relies on Blind Source Separation (BSS) to adaptively identify the acoustical MIMO system, hence allowing the estimation of relative time delays for each source and each dimension. Extending our previously presented work [1], a mechanism to solve a pairing problem occurring in the multidimensional localization of several sources is described. It exploits the inherent signal extraction abilities of BSS. Experimental evaluations with large microphone apertures show that the demonstrator can accurately localize two speech sources in a 2D space, with a precision better than one degree.


Markus Kallinger, Fabian Kuech, Richard Schultz-Amling, Giovanni Del Galdo, Ville Pulkki and Jukka Ahonen "Enhanced Direction Estimation using Microphone Arrays for Directional Audio Coding"
Modern home entertainment systems offer surround sound audio playback. This progress over known mono and stereo devices is also intended for high quality hands-free telephony to enhance intelligibility of speech in group conversation. Directional Audio Coding (DirAC) provides an efficient and well-established way to record and encode spatial sound and to render it at an arbitrary loudspeaker setup. On the recording side, DirAC is based on B-format microphone signals. These signals can be obtained by one omnidirectional and three figure-of-eight microphones pointing along the axes of a three-dimensional Cartesian coordinate system. However, a grid of omnidirectional microphones is more appropriate for consumer applications for economic reasons. Arrays can provide the required figure-of-eight directionality only for a certain frequency range. However, in this contribution we show that a straightforward direction estimator is biased. After formulating the bias analytically, we propose an unbiased estimator and derive the theoretical limits for unique direction estimation. The results are illustrated by means of simulations and measurements.
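To make the idea of a direction estimate from B-format signals concrete, here is a minimal sketch of an intensity-based azimuth estimate per frequency bin. It is neither the biased array estimator analyzed in the paper nor the proposed unbiased one. It assumes ideal B-format spectra in which a plane wave from azimuth theta gives X proportional to W cos(theta) and Y proportional to W sin(theta) with a positive factor; the function name is hypothetical.

```python
import numpy as np

def bformat_azimuth(W, X, Y):
    """Azimuth (radians) per frequency bin from ideal B-format spectra.

    W, X, Y: complex spectra of the omnidirectional and the two horizontal
    figure-of-eight signals. Assumes X ~ W*cos(theta) and Y ~ W*sin(theta)
    for a plane wave arriving from azimuth theta (an assumed convention).
    """
    W, X, Y = (np.asarray(v, dtype=complex) for v in (W, X, Y))
    ix = np.real(np.conj(W) * X)  # x component of the intensity-like vector
    iy = np.real(np.conj(W) * Y)  # y component
    return np.arctan2(iy, ix)

# Synthetic single plane wave from 0.7 rad, using the assumed convention.
rng = np.random.default_rng(0)
s = rng.standard_normal(8) + 1j * rng.standard_normal(8)
theta = 0.7
W = s / np.sqrt(2.0)
X = s * np.cos(theta)
Y = s * np.sin(theta)
estimate = bformat_azimuth(W, X, Y)
```

With real arrays of omnidirectional microphones, X and Y are only approximations of figure-of-eight signals, which is where the bias discussed in the abstract comes from.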


Yusuke Hioka, Kazunori Kobayashi, Ken'ichi Furuya and Akitoshi Kataoka "Enhancement of Sounds in a Specific Directional Area using Power Spectra estimated from Multiple Beamforming Outputs"
In this paper, a method for picking up sounds located in a particular range of angles is proposed. The structure of the method is based on beamforming with postfiltering. The main part of our proposal is a scheme to estimate the power spectra of both desired signals and noises, which are used to derive the Wiener postfilter. Experiments in a reverberant chamber confirm that the proposed method suppresses the noise level by more than 10 dB even in a practical situation.
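The last step described above, deriving a Wiener postfilter from estimated power spectra, can be sketched as follows. How the spectra are estimated from multiple beamformer outputs is the contribution of the paper and is not reproduced here; the function name, the `floor` parameter and the example values are assumptions.

```python
import numpy as np

def wiener_postfilter_gain(p_signal, p_noise, floor=0.0):
    """Per-bin Wiener gain G = Ps / (Ps + Pn), limited to [floor, 1].

    p_signal, p_noise: estimated power spectra of the desired signal and
    of the noise (non-negative arrays of equal shape).
    """
    p_signal = np.asarray(p_signal, dtype=float)
    p_noise = np.asarray(p_noise, dtype=float)
    # The small constant guards against division by zero in empty bins.
    gain = p_signal / np.maximum(p_signal + p_noise, 1e-12)
    return np.clip(gain, floor, 1.0)

# Hypothetical power estimates for three frequency bins.
gain = wiener_postfilter_gain([1.0, 0.0, 3.0], [1.0, 1.0, 1.0])
# The gain would then multiply the beamformer output spectrum bin by bin.
```

A nonzero `floor` is a common way to trade residual noise against musical-noise artifacts; the value used in practice is a tuning choice.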


Tobias Wolff and Markus Buck "Spatial Maximum A Posteriori Post-Filtering for Arbitrary Beamforming"
We present a new approach for residual transient noise suppression at the output of an arbitrary beamformer. A spatial optimum estimate for the instantaneous a posteriori SNR is derived on the basis of the output signals of a blocking matrix. The optimization problem is formulated in the logarithmic domain and statistical models for the obtained quantities are given. Based on these models the optimization problem is solved in the maximum a posteriori sense. It is shown that the performance of speech recognition systems in nonstationary noise scenarios is improved considerably compared to the performance achieved with a Wiener filter applied to the beamformer output.


Heping Ding, Yijing Chu and Xiaojun Qiu "Voice Separation using Ratchet FAP Algorithm"
This paper shows how the Ratchet fast affine projection (FAP) adaptation algorithm is used in place of a much more sophisticated algorithm in a recently published voice separation scheme. This reduces the system complexity to a level comparable to that of one using NLMS adaptive filters. Simulations with real room recordings show that the scheme based on Ratchet FAP converges fast, provides a good separation of the sources, and is quite immune to ambient noise. Audio demos will be given when this paper is presented.


Tao Yu and John Hansen "Robust Auto-focusing Wideband Bayesian Beamforming"
The problem of uncertain direction-of-arrival (DOA) for narrowband sources has been addressed using adaptive Bayesian beamforming [2,3]. In this study, we present a wideband Bayesian beamforming technique based on the coherent signal-subspace transform (CSST). CSST focuses the wideband data onto a single narrowband to allow for a narrowband Bayesian beamformer, which in turn provides the data-driven DOA information needed to update the key part of CSST, the focusing matrix. Numerical simulations with array data show that the proposed beamformer is robust to both DOA mismatch and coherent wideband interferences.


Tetsuya Takiguchi, Ryoichi Takashima and Yasuo Ariki "Active Microphone with Parabolic Reflection Board for Estimation of Sound Source Direction"
This paper introduces a concept of an active microphone that achieves a good combination of active-operation and signal processing, where a new sound-source-direction estimation method using only a single microphone with a parabolic reflection board is proposed. In our previous work [1], we proposed GMM (Gaussian Mixture Model) separation for estimation of the sound source direction, where the observed (reverberant) speech is separated into the acoustic transfer function and the clean speech GMM. However, the previous method required the measurement of speech for each room environment in advance. The new proposed method using parabolic reflection is able to estimate the sound source direction without any prior measurements. Its effectiveness is confirmed by sound-source-direction estimation experiments on white noise in a room environment.


Alessio Brutti, Maurizio Omologo and Piergiorgio Svaizer "Comparison between Different Sound Source Localization Techniques based on a Real Data Collection"
Comparing the different sound source localization techniques, proposed in the literature during the last decade, represents a relevant topic in order to establish advantages and disadvantages of a given approach in a real-time implementation. Traditionally, algorithms for sound source localization rely on an estimation of Time Difference of Arrival (TDOA) at microphone pairs through GCC-PHAT. When several microphone pairs are available the source position can be estimated as the point in space that best fits the set of TDOA measurements by applying Global Coherence Field (GCF), also known as SRP-PHAT, or Oriented Global Coherence Field (OGCF). A first interesting analysis compares the performance of GCF and OGCF to a suboptimal LS search method. In a second step, Adaptive Eigenvalue Decomposition is implemented as an alternative to GCC-PHAT in TDOA estimation. Comparative experiments are conducted on signals acquired by a linear array during WOZ experiments in an interactive-TV scenario. Changes in performance according to different SNR levels are reported.
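As a rough illustration of the TDOA step mentioned above, the following is a minimal GCC-PHAT sketch for one microphone pair. It is not the authors' implementation; the function name `gcc_phat`, its parameters and the zero-padding choice are assumptions made for this example, and the GCF/OGCF search over candidate positions is not shown.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1 in seconds using GCC-PHAT.

    A positive result means x2 lags x1. `max_tau` optionally limits the
    search to physically plausible delays (seconds).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n = len(x1) + len(x2)  # zero-pad so the correlation is linear, not circular
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)
    cross = cross / (np.abs(cross) + 1e-12)  # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = max(1, min(len(x1), len(x2)) - 1)
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs

# Synthetic check: the second channel is the first one delayed by 5 samples.
rng = np.random.default_rng(0)
fs = 16000
x1 = rng.standard_normal(1024)
x2 = np.concatenate((np.zeros(5), x1[:-5]))
tau = gcc_phat(x1, x2, fs)
```

The resolution of this sketch is one sample; interpolating the correlation peak is a common refinement when microphone spacings are small.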


Seungil Kim, Gun-Ho Song, Hyejeong Jeon and Lag-Yong Kim "Maximum Likelihood Detector of Reliable Direction-of-Arrival Estimate"
In this paper, we propose a maximum likelihood detector for a reliable sound source localization system. It is based on a measure of the reliability of the estimation results. The reliability can be reduced from the waterbed effect of the source localization algorithm. If the calculated reliability measure has a lower value than a predefined threshold, the estimated direction-of-arrival (DOA) is regarded as a wrong result and subsequently discarded. We determine the threshold for reliable estimate selection using the maximum likelihood rule. Experiments show that the proposed method can reject perturbed results of the estimated DOA.


Maurice Fallon and Simon Godsill "Multi Target Acoustic Source Tracking with an Unknown and Time Varying Number of Targets"
Particle Filter-based Acoustic Source Tracking algorithms track (online and in real-time) the position of a sound source - a person speaking in a room - based on the current data from a distributed microphone array as well as all previous data up to that point. This paper develops a previously introduced multi-target (MTT) methodology to allow for an unknown and time-varying number of speakers. Finally, examples show typical tracking performance in a number of different scenarios with simultaneously active speech sources.


Albenzio Cirillo, Raffaele Parisi and Aurelio Uncini "Prefiltering Techniques on Consistent Peak Selection for Talker Position Estimation in Reverberant Rooms"
Hands-free communication systems need to isolate a single talker contribution in a distant talking situation and in the presence of environmental noises or other acoustic events. Sound source localization is an essential part of this process. However, localization methods based on time delay estimation (TDE) perform worse as reverberation time increases. As the majority of technologies adopting hands-free systems are used in enclosed environments, a strategy to limit the effect of reverberation is needed. The Optimal Line Selection algorithm has been shown to perform well for localization in different reverberant conditions. In this paper, the behavior of this estimator is examined in the case of cepstral prefiltering.


Marian Kepesi, Lukas Ottowitz and Tania Habib "Joint Position-Pitch Estimation for Multiple Speaker Scenarios"
This paper presents an enhancement of a recently proposed method of joint pitch and direction of arrival extraction for speaker localization. We propose a new pre-processing module inspired by the auditory system in order to estimate the multiple pitch and position values for multi-speaker scenarios. A detailed analysis of the module leads to the conclusion that adding the module to the Position-Pitch estimation (PoPi) algorithm gives an intuitive representation of all active speakers in terms of their respective position and pitch estimates. The proposed method is tested on real-world recordings. The results show the power of the method in determining accurate pitch and position estimates for multi-speaker scenarios.


Bowon Lee, Amir Said, Ton Kalker and Ronald Schafer "Maximum Likelihood Time Delay Estimation with Phase Domain Analysis in the Generalized Cross Correlation Framework"
We propose a new method for efficiently estimating the maximum likelihood frequency weighting in the generalized cross correlation framework for time delay estimation. The estimation is based on the analysis of the cross spectrum between a pair of microphones. We model how phase distribution is affected by both noise and reverberation, and relax the common assumption that noise and reverberation are uncorrelated with the source. Thus, our method does not require knowledge of the noise spectrum or a detailed model of the reverberation. Experimental results show that the proposed method is superior to the PHAT method.
