{"id":582376,"date":"2019-04-29T14:39:33","date_gmt":"2019-04-29T21:39:33","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=582376"},"modified":"2019-05-23T16:46:06","modified_gmt":"2019-05-23T23:46:06","slug":"improving-binaural-ambisonics-decoding-by-spherical-harmonics-domain-tapering-and-coloration-compensation","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/improving-binaural-ambisonics-decoding-by-spherical-harmonics-domain-tapering-and-coloration-compensation\/","title":{"rendered":"Improving Binaural Ambisonics Decoding by Spherical Harmonics Domain Tapering and Coloration Compensation"},"content":{"rendered":"<p>A powerful and flexible approach to record or encode a spatial sound scene is through spherical harmonics (SHs), or Ambisonics. An SH-encoded scene can be rendered binaurally by applying SH-encoded head-related transfer functions (HRTFs). Limitations of the recording equipment or computational constraints dictate the spatial reproduction accuracy, thus rendering might suffer from spatial degradation as well as coloration. This paper studies the effect of tapering the SH representation of a binaurally rendered sound field in conjunction with its spectral equalization. 
The proposed approach is shown to reduce coloration and thus improve perceived audio quality.<\/p>\n<p><strong>Sample code<\/strong> is available <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/chris-hld\/spaudiopy\/blob\/ICASSP2019\/examples\/SH_tapering.py\">here<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.<\/p>\n<p>a)<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/04\/hold5-p5-hold-large.gif\" alt=\"\" width=\"410\" height=\"309\" class=\"alignnone size-full wp-image-582391\" \/><span style=\"padding-left:50px\">b)<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/04\/beam_pattern.png\" alt=\"\" width=\"322\" height=\"309\" class=\"alignnone size-full wp-image-589570\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/04\/beam_pattern.png 780w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/04\/beam_pattern-300x288.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/04\/beam_pattern-768x738.png 768w\" sizes=\"auto, (max-width: 322px) 100vw, 322px\" \/><br \/>\nFigure: a) coloration estimation of third-order HRIR reconstruction above 2.5 kHz for impulse responses moving around the listener in ten-degree steps in the horizontal plane (\u03b8 = 90\u00b0); b) cross section (\u03b8 = 90\u00b0) of a spatial Dirac pulse magnitude in dB, reconstruction with N = 5 at \u2126 = [90\u00b0, 90\u00b0], and with additional Hann tapering function of SH coefficients as in (8).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A powerful and flexible approach to record or encode a spatial sound scene is through spherical harmonics (SHs), or Ambisonics. 
An SH-encoded scene can be rendered binaurally by applying SH-encoded head-related transfer functions (HRTFs). Limitations of the recording equipment or computational constraints dictate the spatial reproduction accuracy, thus rendering might suffer from spatial degradation as [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"IEEE","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2019-5-1","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[243062],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area"
:[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-582376","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-audio-acoustics","msr-locale-en_us"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2019-5-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"IEEE","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/04\/preprint_ICASSP2019_Ambisonics_Tapering.pdf","id":"585847","title":"preprint_icassp2019_ambisonics_tapering","label_id":"243132","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/ieeexplore.ieee.org\/document\/8683751","label_id":"243109","label":0}],"msr_related_uploader":[{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/github.com\/chris-hld\/spaudiopy\/blob\/ICASSP2019\/examples\/SH_tapering.py","label_id":"243118","label":0}],"msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":585847,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/05\/preprint_ICASSP2019_Ambisonics_Tapering.pdf"},{"id":58584
4,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/05\/preprint_ICASSP2019_VolumeEstimation.pdf"},{"id":585841,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2019\/05\/preprint_ICASSP2019_MOS_estimation.pdf"}],"msr-author-ordering":[{"type":"text","value":"Christoph Hold","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Hannes Gamper","user_id":31943,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Hannes Gamper"},{"type":"text","value":"Ville Pulkki","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Nikunj Raghuvanshi","user_id":33106,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Nikunj Raghuvanshi"},{"type":"user_nicename","value":"Ivan Tashev","user_id":32127,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Ivan Tashev"}],"msr_impact_theme":[],"msr_research_lab":[199565],"msr_event":[],"msr_group":[144923],"msr_project":[212079],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":212079,"post_title":"Spatial Audio","post_name":"spatial-audio","post_type":"msr-project","post_date":"2015-12-01 18:14:03","post_modified":"2022-01-21 12:44:12","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/spatial-audio\/","post_excerpt":"Spatial audio, also known as 3D stereo sound, is about creating a 3D audio experience by using headphones. 
Applications of this technology include augmented and virtual reality, listening to music, and watching a movie on a tablet or PC.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/212079"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/582376","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":19,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/582376\/revisions"}],"predecessor-version":[{"id":589621,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/582376\/revisions\/589621"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=582376"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=582376"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=582376"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=582376"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=582376"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=582376"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=582376"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:
\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=582376"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=582376"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=582376"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=582376"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=582376"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=582376"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
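The abstract and figure caption describe tapering the SH coefficients of a binaurally rendered sound field with a Hann window before decoding. The minimal NumPy sketch below illustrates that idea under stated assumptions: the function names (`hann_taper_weights`, `apply_taper`), the `n_taper` parameterization, and the exact placement of the half-Hann fade over the highest orders are illustrative choices, not necessarily identical to eq. (8) of the paper; see the linked spaudiopy example (`SH_tapering.py`) for the authors' own code.

```python
import numpy as np

def hann_taper_weights(N, n_taper=2):
    """Per-SH-order taper weights for maximum order N.

    Low orders keep unit weight; the highest n_taper orders are faded
    out with the falling half of a Hann window. This is a sketch of the
    tapering idea; the paper's exact window definition may differ.
    """
    w = np.ones(N + 1)
    if n_taper > 0:
        # falling half of a (2*n_taper + 1)-point Hann window: 1 -> 0.
        # In this parameterization the top order is fully attenuated.
        w[-(n_taper + 1):] = np.hanning(2 * n_taper + 1)[n_taper:]
    return w

def apply_taper(anm, w):
    """Apply per-order weights to ACN-ordered SH coefficients.

    anm: array of shape (..., (N+1)**2); order n occupies the 2*n + 1
    consecutive channels n**2 .. (n+1)**2 - 1.
    """
    N = len(w) - 1
    per_channel = np.repeat(w, 2 * np.arange(N + 1) + 1)
    return anm * per_channel

# Example: taper a fifth-order coefficient vector (36 channels),
# matching the N = 5 reconstruction shown in figure b).
N = 5
anm = np.ones((N + 1) ** 2)
tapered = apply_taper(anm, hann_taper_weights(N, n_taper=2))
```

The same per-order weights would be applied identically to the SH-encoded scene or HRTF coefficients before summation, which is what smooths the side lobes of the reconstructed beam pattern in figure b).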