{"id":1155409,"date":"2025-12-04T04:12:24","date_gmt":"2025-12-04T12:12:24","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=1155409"},"modified":"2025-12-17T01:13:44","modified_gmt":"2025-12-17T09:13:44","slug":"agentic-verifiers-provably-safe-test-time-scaling-for-reasoning-models","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/agentic-verifiers-provably-safe-test-time-scaling-for-reasoning-models\/","title":{"rendered":"Agentic Verifiers: Provably Safe Test-time scaling for Reasoning Models"},"content":{"rendered":"<section class=\"mb-3 moray-highlight\">\n\t<div class=\"card-img-overlay mx-lg-0\">\n\t\t<div class=\"card-background  has-background- card-background--full-bleed\">\n\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"721\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/11\/Agentic-Verifiers_-Provably-Safe-Test-time-scaling-for-Reasoning-Models_Banner-1920x721-1.jpg\" class=\"attachment-full size-full\" alt=\"background pattern\" style=\"\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/11\/Agentic-Verifiers_-Provably-Safe-Test-time-scaling-for-Reasoning-Models_Banner-1920x721-1.jpg 1920w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/11\/Agentic-Verifiers_-Provably-Safe-Test-time-scaling-for-Reasoning-Models_Banner-1920x721-1-300x113.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/11\/Agentic-Verifiers_-Provably-Safe-Test-time-scaling-for-Reasoning-Models_Banner-1920x721-1-1024x385.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/11\/Agentic-Verifiers_-Provably-Safe-Test-time-scaling-for-Reasoning-Models_Banner-1920x721-1-768x288.jpg 768w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/11\/Agentic-Verifiers_-Provably-Safe-Test-time-scaling-for-Reasoning-Models_Banner-1920x721-1-1536x577.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/11\/Agentic-Verifiers_-Provably-Safe-Test-time-scaling-for-Reasoning-Models_Banner-1920x721-1-1600x600.jpg 1600w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/11\/Agentic-Verifiers_-Provably-Safe-Test-time-scaling-for-Reasoning-Models_Banner-1920x721-1-240x90.jpg 240w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/>\t\t<\/div>\n\t\t<!-- Foreground -->\n\t\t<div class=\"card-foreground d-flex mt-md-n5 my-lg-5 px-g px-lg-0\">\n\t\t\t<!-- Container -->\n\t\t\t<div class=\"container d-flex mt-md-n5 my-lg-5 \">\n\t\t\t\t<!-- Card wrapper -->\n\t\t\t\t<div class=\"w-100 w-lg-col-5\">\n\t\t\t\t\t<!-- Card -->\n\t\t\t\t\t<div class=\"card material-md-card py-5 px-md-5\">\n\t\t\t\t\t\t<div class=\"card-body \">\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n<h1 class=\"wp-block-heading\" id=\"agentic-verifiers-provably-safe-test-time-scaling-for-reasoning-models\">Agentic Verifiers: Provably Safe Test-time scaling for Reasoning Models<\/h1>\n\n\n\n<p><\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n<p>This project introduces a novel architecture for agentic AI systems that ensures accuracy, efficiency, and safety during reasoning. It addresses two key challenges\u2014lack of steerability and absence of verifiable guarantees\u2014by developing verifiers that can interject at any point in a model\u2019s generation process. An auxiliary monitor model evaluates each reasoning step against predefined properties, rolling back and correcting errors in real time. 
The research spans commonsense, medical, and legal reasoning and aims to deliver publicly available verifier models and an open-source platform for integrating verification into agentic systems, paving the way for trustworthy AI in high-stakes domains.<br><br>This research is conducted via&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/academic-program\/agentic-ai-research-and-innovation\/\">The Agentic AI Research and Innovation<\/a>&nbsp;(AARI) Initiative, which focuses on the next frontier of agentic systems through&nbsp;<em>Grand Challenges<\/em>&nbsp;with the academic community and Microsoft Research.<\/p>\n\n\n","protected":false},"excerpt":{"rendered":"<p>This project introduces a novel architecture for agentic AI systems that ensures accuracy, efficiency, and safety during reasoning. It addresses two key challenges\u2014lack of steerability and absence of verifiable guarantees\u2014by developing verifiers that can interject at any point in a model\u2019s generation process. 
An auxiliary monitor model evaluates each reasoning step against predefined properties, rolling [&hellip;]<\/p>\n","protected":false},"featured_media":1155701,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1155409","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"guest","display_name":"Somak  Aditya","user_id":1157209,"people_section":"Section name 0","alias":""},{"type":"guest","display_name":"Sourangshu Bhattacharya","user_id":1157211,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Vineeth N Balasubramanian","user_id":44019,"people_section":"Section name 0","alias":"vineethn"},{"type":"user_nicename","display_name":"Nagarajan Natarajan","user_id":37311,"people_section":"Section name 0","alias":"nagarajn"},{"type":"guest","display_name":"Uma  Satya Ranjan","user_id":1157216,"people_section":"Section name 0","alias":""},{"type":"user_nicename","display_name":"Amit Sharma","user_id":30997,"people_section":"Section name 
0","alias":"amshar"}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1155409","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":11,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1155409\/revisions"}],"predecessor-version":[{"id":1158811,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/1155409\/revisions\/1158811"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1155701"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1155409"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1155409"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1155409"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1155409"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1155409"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}