ICASSP 2007 - April 15-20, 2007 - Honolulu, Hawai'i, U.S.A.

TUT-11: Audio Source Separation based on Independent Component Analysis

Monday Afternoon, April 16
14:00 - 17:00
Room 324

Presented by

Shoji Makino and Hiroshi Sawada, NTT, Japan

Abstract

This tutorial describes a state-of-the-art method for the blind source separation (BSS) of convolutive mixtures of audio signals. Independent component analysis (ICA) is used as the main statistical tool for separating the mixtures. We provide examples to show how ICA criteria change as the number of audio sources increases. We then discuss a frequency-domain approach in which simple instantaneous ICA is employed in each frequency bin. A directivity pattern analysis of the ICA solutions provides a physical interpretation of the ICA-based separation and clarifies the relationship between ICA-based BSS and adaptive beamforming. To obtain properly separated signals with the frequency-domain approach, the permutation and scaling ambiguities of the ICA solutions must be resolved consistently across frequency bins. We describe two complementary methods for aligning the permutations, i.e., for grouping together the separated frequency components that originate from the same source. The first method exploits the dependence between the signal envelopes of the same source across frequencies. The second method relies on the spatial diversity of the sources and is closely related to source localization techniques. Finally, we describe methods for sparse source separation, which can be applied even to the underdetermined case, where the sources outnumber the microphones. The tutorial will end with a live demonstration of BSS in a real room.

  1. Introduction
  2. Convolutive blind source separation (BSS) - Formulation
  3. Independent component analysis - Concepts
  4. Frequency-domain approach for convolutive mixtures
  5. Relationship between BSS and adaptive beamformer - Physical interpretation
  6. (Coffee break)
  7. Permutation and scaling problems
  8. Dependence of separated signals across frequencies
  9. Time-difference-of-arrival (TDOA) and direction-of-arrival (DOA) estimation
  10. Sparse source separation
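
As a rough illustration of the first permutation-alignment idea mentioned in the abstract (envelope dependence across frequencies), the following is a minimal Python sketch, not the presenters' implementation. It assumes that instantaneous ICA has already been run in every frequency bin and that the outputs are stored in a complex array of shape (bins, sources, frames). The function name align_permutations, the running-centroid strategy, and the brute-force search over permutations are all illustrative assumptions; the brute-force search is only practical for a small number of sources.

```python
import itertools
import numpy as np


def align_permutations(Y):
    """Align per-bin source orderings using amplitude-envelope correlation.

    Y: complex array of shape (n_bins, n_sources, n_frames) holding the
       separated outputs of each frequency bin (hypothetical layout).
    Returns the reordered array and the permutation chosen for each bin.
    """
    n_bins, n_src, _ = Y.shape

    # Amplitude envelopes, made zero-mean and unit-norm along time so that
    # a dot product between two envelopes is their correlation coefficient.
    env = np.abs(Y)
    env = env - env.mean(axis=2, keepdims=True)
    env = env / (np.linalg.norm(env, axis=2, keepdims=True) + 1e-12)

    perms = np.zeros((n_bins, n_src), dtype=int)
    perms[0] = np.arange(n_src)      # bin 0 defines the reference ordering
    centroid = env[0].copy()         # running sum of aligned envelopes
    candidates = list(itertools.permutations(range(n_src)))

    for f in range(1, n_bins):
        # corr[i, j]: similarity of reference source i and output j of bin f
        corr = centroid @ env[f].T
        best = max(
            candidates,
            key=lambda p: sum(corr[i, p[i]] for i in range(n_src)),
        )
        perms[f] = best
        centroid += env[f][list(best)]

    aligned = np.stack([Y[f][perms[f]] for f in range(n_bins)])
    return aligned, perms
```

Row i of aligned[f] is then the component of bin f assigned to source i. The scaling ambiguity is not addressed by this sketch, and bins with weak or highly overlapping envelopes may still be misaligned, which is one reason a complementary location-based method is useful.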

Requirements or Prerequisites

Basic tutorial + Advanced tutorial

Speaker Biographies

Shoji Makino received the B.E., M.E., and Ph.D. degrees from Tohoku University, Japan, in 1979, 1981, and 1993, respectively. He is an Executive Manager with NTT Communication Science Laboratories. He is also a Guest Professor at Hokkaido University. His research interests include the blind source separation of convolutive mixtures of speech, adaptive filtering technologies, and the realization of acoustic echo cancellation. He is the author or co-author of more than 200 articles in journals and conference proceedings, and he has been responsible for more than 150 patents. He received the ICA Unsupervised Learning Pioneer Award in 2006. He is a member of both the Awards Board and the Conference Board of the IEEE SP Society. He is an Associate Editor of IEEE Transactions on Speech and Audio Processing and an Associate Editor of the EURASIP Journal on Applied Signal Processing. He is a member of the Technical Committee on Audio and Electroacoustics of the IEEE SP Society as well as the Technical Committee on Blind Signal Processing of the IEEE CAS Society. In addition, he is the General Chair of WASPAA 2007 in Mohonk, the Organizing Chair of ICA2003 in Nara, and the General Chair of IWAENC2003 in Kyoto. He is an IEEE Fellow, a council member of the ASJ, and the Chair of the Technical Committee on Engineering Acoustics of the IEICE.

Hiroshi Sawada received the B.E., M.E., and Ph.D. degrees in information science from Kyoto University, Kyoto, Japan, in 1991, 1993, and 2001, respectively. In 1993, he joined NTT Communication Science Laboratories, where he is now a senior research scientist. Since 2000, he has been engaged in research on signal processing, microphone arrays, and blind source separation (BSS). More specifically, he is working on frequency-domain BSS for acoustic convolutive mixtures using independent component analysis (ICA). He is an Associate Editor of the IEEE Transactions on Audio, Speech and Language Processing, a member of the Technical Committee on Audio and Electroacoustics of the IEEE SP Society, and a member of the Technical Committee on Blind Signal Processing of the IEEE CAS Society. He is the Publications Chair of WASPAA 2007, the Communications Chair of IWAENC2003, and an organizing committee member of ICA2003. He is a senior member of the IEEE and a member of the IEICE and the ASJ. He is the author or co-author of three book chapters, 20 journal articles, and more than 70 conference papers. He received the 9th TELECOM System Technology Award for Students from the Telecommunications Advancement Foundation in 1994, and the Best Paper Award of the IEEE Circuits and Systems Society in 2000. He has four years of teaching experience as a part-time lecturer at Doshisha University, Kyoto, Japan.

