Summary: | This thesis addresses several aspects of complex-valued instantaneous mixing matrix estimation. Although the main application is extracting individual speech signals from group conversations in real-world environments, the methods developed here can be applied to other types of complex-valued instantaneous mixtures. The thesis makes three main contributions: improved estimation performance for sparse component analysis in the under-determined case, where the number of sensors is less than the number of sources; a connection between sparse component analysis and independent component analysis in the determined case; and improved estimation performance for two independent component analysis methods in the determined case, where the number of sensors equals the number of sources.
For under-determined blind source separation, mixture models of directional statistics are often employed to estimate the mixing matrix. These mixture models commonly adopt the Gaussian function to model the distribution of the observed data about a specific direction. Because the Gaussian function may not be well suited to sparse signals, the performance of these mixture models is limited. Furthermore, fitting them is computationally costly because it requires computing spatial covariance matrices. These issues motivated the development of directional sparse filtering (DSF) for complex-valued mixing matrix estimation. Using the power mean of the magnitude-squared cosine distances, directional sparse filtering searches for a set of directions such that each data sample is expressed by only a few vectors of the set. Simulations on synthetic data and on real mixtures of recorded speech indicate that the proposed directional sparse filtering outperforms directional clustering baselines while having lower computational complexity.
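To make the idea of a power-mean cost over cosine distances concrete, the following is a minimal sketch and not the exact objective used in the thesis. It assumes the magnitude-squared cosine distance between a direction w and a sample x is 1 - |w^H x|^2 / (||w||^2 ||x||^2), and that a power mean with a negative exponent is taken across directions so that it behaves like a soft minimum. The function name `dsf_cost` and the parameters `p` and `eps` are hypothetical and chosen only for illustration.

```python
import numpy as np

def dsf_cost(W, X, p=-0.5, eps=1e-12):
    """Illustrative DSF-style cost (assumed form, not the thesis's definition).

    W: (n_sensors, n_directions) complex candidate directions, one per column.
    X: (n_sensors, n_samples) complex observations, one per column.
    """
    # Normalise directions and samples to unit length.
    Wn = W / (np.linalg.norm(W, axis=0, keepdims=True) + eps)
    Xn = X / (np.linalg.norm(X, axis=0, keepdims=True) + eps)
    # Magnitude-squared cosine distance, shape (n_directions, n_samples).
    d = 1.0 - np.abs(Wn.conj().T @ Xn) ** 2
    d = np.clip(d, eps, None)  # keep negative powers finite
    # Power mean across directions; with p < 0 the smallest distance dominates.
    pm = np.mean(d ** p, axis=0) ** (1.0 / p)
    # Average over samples.
    return float(np.mean(pm))
```

Under these assumptions the cost is small when every sample lies close to at least one candidate direction, which is the sparsity property described above. Because only the magnitude of the inner product enters, the cost is unchanged when a direction is multiplied by a complex phase.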
Next, a stability condition for invertible mixing matrix estimation using multivariate non-separable contrast functions is derived. The stability condition is verified via Monte Carlo simulation for several directional clustering methods and for the directional sparse filtering algorithm, given that the sources are super-Gaussian. The stability condition implies that the minima of the cost functions corresponding to these sparse component analysis methods include the mixing matrix, in a manner similar to independent component analysis. This result shows that sparse component analysis methods based on directional clustering can be interpreted as independent component analysis using quasi-maximum likelihood estimation.
For determined blind source separation, two re-parameterizations are proposed to convert the constrained independent component analysis by entropy bound minimization into unconstrained optimization problems. These unconstrained problems are then solved via quasi-Newton methods. Experimental results show that the proposed method yields higher estimation performance than the baseline algorithm, particularly when the source distributions are close to Gaussian. The re-parameterizations for independent component analysis by entropy bound minimization are further extended to estimate the mixing matrix and the temporal whitening filters simultaneously. This results in a new algorithm for independent component analysis by entropy rate bound minimization. Compared with the conventional independent component analysis by entropy rate bound minimization, the new method improves both mixing matrix estimation and speech separation.
|