{"id":803650,"date":"2021-12-09T15:14:38","date_gmt":"2021-12-09T23:14:38","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&#038;p=803650"},"modified":"2021-12-09T15:18:54","modified_gmt":"2021-12-09T23:18:54","slug":"audio-based-spam-call-detection","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/audio-based-spam-call-detection\/","title":{"rendered":"Audio-Based SPAM Call Detection"},"content":{"rendered":"<p>Spam communications are organized attempts to make falsified claims for the purpose of marketing, spreading false information, and deceiving the end recipient. Phone spam is an international nuisance, and the US was among the most spammed countries in the world in 2020. Beyond the irritation these calls cause, criminal scams defraud subscribers of billions of dollars every year. Automated systems for identifying spam calls are therefore needed to minimize fraud and reduce the annoyance of receiving them. The call origin, call duration, and other Call Detail Records can be used to assess whether a call is fraudulent, but the actual audio content is overlooked. This work focuses on extracting acoustic features from voicemail recordings containing speech and using them to train Machine Learning models that identify spam calls. Both local and global feature descriptors are used, including Mel-Frequency Cepstral Coefficients and the Log-Mel Spectrum, and their efficacy for distinguishing spam from non-spam calls is explored. We demonstrate that a spam voice call can be detected using only the acoustic information of the call. 
A further analysis of the temporal and spectral features that are most informative for the task is also presented.<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-803689 aligncenter\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown2-300x195.jpg\" alt=\"dataset breakdown for spam calls voicemails\" width=\"755\" height=\"490\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown2-300x195.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown2-1024x666.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown2-240x156.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown2.jpg 1031w\" sizes=\"auto, (max-width: 755px) 100vw, 755px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-803683 aligncenter\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown3-300x182.jpg\" alt=\"dataset breakdown for spam calls voicemails\" width=\"767\" height=\"466\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown3-300x182.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown3-768x467.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown3-240x146.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_breakdown3.jpg 943w\" sizes=\"auto, (max-width: 767px) 100vw, 767px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-803716 aligncenter\" 
src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_Human-vs-Robocall-300x86.jpg\" alt=\"ASA_meeting_spam_2021_Human vs Robocall\" width=\"720\" height=\"206\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_Human-vs-Robocall-300x86.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_Human-vs-Robocall-1024x293.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_Human-vs-Robocall-768x220.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_Human-vs-Robocall-240x69.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_Human-vs-Robocall.jpg 1039w\" sizes=\"auto, (max-width: 720px) 100vw, 720px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-803704 aligncenter\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_correctRateSpam-300x158.jpg\" alt=\"chart, line chart\" width=\"693\" height=\"365\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_correctRateSpam-300x158.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_correctRateSpam-1024x540.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_correctRateSpam-768x405.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_correctRateSpam-240x127.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_meeting_spam_2021_correctRateSpam.jpg 1408w\" sizes=\"auto, (max-width: 693px) 100vw, 693px\" 
\/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Spam communications are organized attempts to make falsified claims for the purpose of marketing, spreading false information, and deceiving the end recipient. Phone spam is an international nuisance, and the US was among the most spammed countries in the world in 2020. Beyond the irritation these calls cause, criminal scams defraud subscribers of billions of [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"4","msr_journal":"The Journal of the Acoustical Society of America","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"A357","msr_page_range_end":"A357","msr_series":"","msr_volume":"150","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2021-12-1","msr_highlight_text":"","msr_notes":"Acoustical Society of 
America","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[243062],"msr-publication-type":[193715],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[249859,247741],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-803650","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-audio-acoustics","msr-locale-en_us","msr-field-of-study-acoustics","msr-field-of-study-audio-signal-processing"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2021-12-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"The Journal of the Acoustical Society of America","msr_volume":"150","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"4","msr_organization":"","msr_how_published":"","msr_notes":"Acoustical Society of 
America","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_Meeting_Spam_Detection_in_Voicemails_2021.pdf","id":"803668","title":"asa_meeting_spam_detection_in_voicemails_2021","label_id":"243109","label":0},{"type":"doi","viewUrl":"false","id":"false","title":"https:\/\/asa.scitation.org\/doi\/abs\/10.1121\/10.0008583","label_id":"243106","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[{"id":803668,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2021\/12\/ASA_Meeting_Spam_Detection_in_Voicemails_2021.pdf"}],"msr-author-ordering":[{"type":"text","value":"Benjamin M Elizalde","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Dimitra Emmanouilidou","user_id":37461,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Dimitra Emmanouilidou"}],"msr_impact_theme":[],"msr_research_lab":[199565],"msr_event":[],"msr_group":[144923],"msr_project":[559086],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"article","related_content":{"projects":[{"ID":559086,"post_title":"Audio Analytics","post_name":"audio-analytics","post_type":"msr-project","post_date":"2019-02-08 15:57:54","post_modified":"2023-01-13 13:28:08","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/audio-analytics\/","post_excerpt":"Audio analytics is about analyzing and 
understanding audio signals captured by digital devices, with numerous applications in enterprise, healthcare, productivity, and smart cities.","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/559086"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/803650","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/803650\/revisions"}],"predecessor-version":[{"id":803740,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/803650\/revisions\/803740"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=803650"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=803650"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=803650"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=803650"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=803650"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=803650"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=803650"},{"taxonomy":"msr-post-option","embeddable":true,"hr
ef":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=803650"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=803650"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=803650"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=803650"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=803650"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=803650"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}