{"id":925464,"date":"2023-03-07T16:30:26","date_gmt":"2023-03-08T00:30:26","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/"},"modified":"2025-07-01T06:36:31","modified_gmt":"2025-07-01T13:36:31","slug":"hard-meta-dataset-towards-understanding-few-shot-performance-on-difficult-tasks","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/hard-meta-dataset-towards-understanding-few-shot-performance-on-difficult-tasks\/","title":{"rendered":"Hard-Meta-Dataset++: Towards Understanding Few-Shot Performance on Difficult Tasks"},"content":{"rendered":"<p><span dir=\"ltr\" role=\"presentation\">Few-shot classification is the ability to adapt to any new classification task from <\/span><span dir=\"ltr\" role=\"presentation\">only a few training examples. The performance of current top-performing few-<\/span><span dir=\"ltr\" role=\"presentation\">shot classifiers varies widely across different tasks where they often fail on a sub-<\/span><span dir=\"ltr\" role=\"presentation\">set of \u2018difficult\u2019 tasks.<\/span> <span dir=\"ltr\" role=\"presentation\">This phenomenon has real-world consequences for de<\/span><span dir=\"ltr\" role=\"presentation\">ployed few-shot systems where safety and reliability are paramount, yet little has <\/span><span dir=\"ltr\" role=\"presentation\">been done to understand these failure cases. In this paper, we study these difficult <\/span><span dir=\"ltr\" role=\"presentation\">tasks to gain a more nuanced understanding of the limitations of current meth<\/span><span dir=\"ltr\" role=\"presentation\">ods. To this end, we develop a general and computationally efficient algorithm <\/span><span dir=\"ltr\" role=\"presentation\">called FastDiffSel <\/span><span dir=\"ltr\" role=\"presentation\">to extract difficult tasks from any large-scale vision dataset. <\/span><span dir=\"ltr\" role=\"presentation\">Notably, our algorithm can extract tasks at least 20x faster than existing methods <\/span><span dir=\"ltr\" role=\"presentation\">enabling its use on large-scale datasets. We use FastDiffSel <\/span><span dir=\"ltr\" role=\"presentation\">to extract difficult <\/span><span dir=\"ltr\" role=\"presentation\">tasks from<\/span> <span dir=\"ltr\" role=\"presentation\">Meta-Dataset<\/span><span dir=\"ltr\" role=\"presentation\">, a widely-used few-shot classification benchmark, and <\/span><span dir=\"ltr\" role=\"presentation\">other challenging large-scale vision datasets including<\/span> <span dir=\"ltr\" role=\"presentation\">ORBIT<\/span><span dir=\"ltr\" role=\"presentation\">,<\/span> <span dir=\"ltr\" role=\"presentation\">CURE<\/span><span dir=\"ltr\" role=\"presentation\">&#8211;<\/span><span dir=\"ltr\" role=\"presentation\">OR<\/span> <span dir=\"ltr\" role=\"presentation\">and<\/span> ObjectNet<span dir=\"ltr\" role=\"presentation\">.<\/span> <span dir=\"ltr\" role=\"presentation\">These tasks are curated into Hard-Meta-Dataset<\/span><span dir=\"ltr\" role=\"presentation\">++, a new few-shot classification <\/span><span dir=\"ltr\" role=\"presentation\">testing benchmark to promote the development of methods that are robust to <\/span><span dir=\"ltr\" role=\"presentation\">even the most difficult tasks. 
We use<\/span> <span dir=\"ltr\" role=\"presentation\">Hard-Meta-Dataset<\/span><span dir=\"ltr\" role=\"presentation\">++ to stress-test an <\/span><span dir=\"ltr\" role=\"presentation\">extensive suite of few-shot classification methods and show that state-of-the-art <\/span><span dir=\"ltr\" role=\"presentation\">approaches fail catastrophically on difficult tasks. We believe that our extraction <\/span><span dir=\"ltr\" role=\"presentation\">algorithm FastDiffSel <\/span><span dir=\"ltr\" role=\"presentation\">and<\/span> Hard-Meta-Dataset<span dir=\"ltr\" role=\"presentation\">++ will aid researchers in <\/span><span dir=\"ltr\" role=\"presentation\">further understanding failure modes of few-shot classification models.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Few-shot classification is the ability to adapt to any new classification task from only a few training examples. The performance of current top-performing few-shot classifiers varies widely across different tasks where they often fail on a sub-set of \u2018difficult\u2019 tasks. This phenomenon has real-world consequences for deployed few-shot systems where safety and reliability are paramount, [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"","msr_journal":"","msr_number":"","msr_organization":"IEEE","msr_pages_string":"","msr_page_range_start":"","msr_page_range_end":"","msr_series":"","msr_volume":"","msr_copyright":"","msr_conference_name":"","msr_doi":"","msr_arxiv_id":"","msr_s2_paper_id":"","msr_mag_id":"","msr_pubmed_id":"","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_original_fields_of_study":"","msr_release_tracker_id":"","msr_s2_match_type":"","msr_citation_count_updated":"","msr_published_date":"2023-5-1","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"https:\/\/iclr.cc\/","msr_journal_url":"","msr_s2_pdf_url":"","msr_year":0,"msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_match_confidence":0,"msr_microsoftintellectualproperty":true,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":null,"footnotes":""},"msr-research-highlight":[],"research-area":[13556,13562],"msr-publication-type":[193716],"msr-publisher":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246688,266091,246685],"msr-conference":[259120],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-925464","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-artificial-intelligence","msr-research-area-computer-vision","msr-locale-en_us","msr-field-of-study-computer-vision","msr-field-of-study-few-shot-learning","msr-field-of-study-machine-learning"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2023-5-1","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pa
ges_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"IEEE","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/openreview.net\/pdf?id=wq0luyH3m4","label_id":"243109","label":0}],"msr_related_uploader":"","msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"text","value":"Samyadeep Basu","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Megan Stanley","user_id":41482,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Megan Stanley"},{"type":"guest","value":"john-bronskill","user_id":774337,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=john-bronskill"},{"type":"text","value":"Soheil Feizi","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Daniela Massiceti","user_id":40408,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Daniela Massiceti"}],"msr_impact_theme":[],"msr_research_lab":[199561],"msr_event":[],"msr_group":[606351,1142579],"msr_project":[830104],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":830104,"post_title":"Teachable AI Experiences (Tai X)","post_name":"taix","post_type":"msr-project","post_date":"2022-03-31 06:56:26","post_modified":"2026-02-23 02:38:13","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/taix\/","post_excerpt":"The Teachable AI Experiences team (Tai X) aims to innovate teachable AI systems that allow people near or far from the norm to create meaningful personalized experiences for themselves. What we ALL have in common is that we are unique. Millions of people find that they&nbsp;do not fit&nbsp;into&nbsp;one of the&nbsp;coarse-grained buckets that have become the technical underpinning of our AI technologies of today (See Research Talk: Bucket of Me). 
While we can attempt to shoehorn&hellip;","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/830104"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/925464","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":4,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/925464\/revisions"}],"predecessor-version":[{"id":1143415,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/925464\/revisions\/1143415"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=925464"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=925464"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=925464"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=925464"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=925464"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=925464"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=925464"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=925464"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=925464"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=925464"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=925464"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=925464"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=925464"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}