{"id":159389,"date":"2010-03-16T00:00:00","date_gmt":"2010-03-16T00:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/msr-research-item\/reverberated-speech-signal-separation-based-on-regularized-subband-feedforward-ica-and-instantaneous-direction-of-arrival\/"},"modified":"2020-06-04T15:02:48","modified_gmt":"2020-06-04T22:02:48","slug":"reverberated-speech-signal-separation-based-on-regularized-subband-feedforward-ica-and-instantaneous-direction-of-arrival","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/reverberated-speech-signal-separation-based-on-regularized-subband-feedforward-ica-and-instantaneous-direction-of-arrival\/","title":{"rendered":"Reverberated Speech Signal Separation Based on Regularized Subband Feedforward ICA and Instantaneous Direction of Arrival"},"content":{"rendered":"<div class=\"asset-content\">\n<p>In this paper, independent component analysis (ICA) in a sub-band domain has been extended into a feed-forward network. The feed-forward network maximizes mutual independence of separated current frames using information from both current and previous multi-channel frames of speech signals captured by a microphone array. To guide into a proper separation prevent-ing permutation and arbitrary scaling, we not only rely on the steered response for the first tap of the demixing filter but also penalize on the direction thus drastically increasing the mean squared error with the spatial filtered output. After conver-gence, by applying instantaneous direction of arrival (IDOA) based post-processing, we can additionally suppress the leakage of interference as well as the reverberated target signal. The signal to interference ratio (SIR) is improved more than 20 dBC for distances up to 2.7 m and angle differences down to 26\u00b0.<\/p>\n<\/div>\n<p><!-- .asset-content --><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this paper, independent component analysis (ICA) in a sub-band domain has been extended into a feed-forward network. The feed-forward network maximizes mutual independence of separated current frames using information from both current and previous multi-channel frames of speech signals captured by a microphone array. To guide into a proper separation prevent-ing permutation and arbitrary [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"IEEE","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"\u00a9 2008 IEEE. Personal use of this material is permitted. However, permission to reprint\/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.http:\/\/www.ieee.org\/","msr_conference_name":"International Conference on Acoustics, Speech and Signal Processing","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2010-3-16","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[243062,13554],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-159389","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-audio-acoustics","msr-research-area-human-computer-interaction","msr-locale-en_us"],"msr_publishername":"IEEE","msr_edition":"","msr_affiliation":"","msr_published_date":"2010-3-16","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"207212","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/KimTashevAcero_ICASSP_2010.pdf","id":"207212","title":"KimTashevAcero_ICASSP_2010.pdf","label_id":"243109","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":207212,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2016\/02\/KimTashevAcero_ICASSP_2010.pdf"}],"msr-author-ordering":[{"type":"text","value":"Lae-Hoon Kim","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Ivan Tashev","user_id":32127,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ivan Tashev"},{"type":"user_nicename","value":"Alex Acero","user_id":30932,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Alex Acero"}],"msr_impact_theme":[],"msr_research_lab":[199565],"msr_event":[],"msr_group":[144923],"msr_project":[488189],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":488189,"post_title":"Sound Capture and Speech Enhancement","post_name":"sound-capture-speech-enhancement","post_type":"msr-project","post_date":"2018-06-12 09:35:37","post_modified":"2022-04-08 12:58:58","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/sound-capture-speech-enhancement\/","post_excerpt":"The goal of device design is to overcome the device, room, and noise effects, ultimately producing a clean audio signal good enough for people and machines to understand.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/488189"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/159389","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/159389\/revisions"}],"predecessor-version":[{"id":540193,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/159389\/revisions\/540193"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=159389"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=159389"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=159389"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=159389"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=159389"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=159389"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=159389"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=159389"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=159389"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=159389"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=159389"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=159389"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=159389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}