{"id":1141870,"date":"2025-06-17T09:00:00","date_gmt":"2025-06-17T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1141870"},"modified":"2025-06-17T06:32:47","modified_gmt":"2025-06-17T13:32:47","slug":"new-methods-boost-reasoning-in-small-and-large-language-models","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/new-methods-boost-reasoning-in-small-and-large-language-models\/","title":{"rendered":"New methods boost reasoning in small and large language models"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1.jpg\" alt=\"The image shows a diagram illustrating the relationship between mathematical statements in natural language and formal language. On the left, there is a blue box labeled \"Mathematical statement in natural language.\" An arrow points from this box to a central section containing four smaller boxes arranged in two rows. 
The top row contains \"Formalization\" and \"Informalization,\" while the bottom row contains \"Symbolic Equivalence\" and \"Semantic Consistency.\" An arrow points from this central section to a purple box on the right labeled \"Mathematical statement in formal language.\" The background of the image transitions from blue on the left to purple on the right.\" class=\"wp-image-1142121\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1.jpg 1400w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<p>Artificial intelligence is advancing across a wide range of fields, with one of the most important developments being its growing capacity for reasoning. 
This capability could help AI become a reliable partner in critical domains like scientific research and healthcare.<\/p>\n\n\n\n<p>To support this progress, we\u2019ve identified three primary strategies to strengthen reasoning capabilities in both small and large language models: improve architectural design to boost performance in smaller models; incorporate mathematical reasoning techniques to increase reliability; and build stronger generalization capabilities to enable reasoning across a variety of fields.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"smarter-reasoning-in-smaller-models\">Smarter reasoning in smaller models<\/h2>\n\n\n\n<p>While language models trained on broad world knowledge hold great potential, they lack the ability to learn continuously and refine their understanding. This limitation becomes especially pronounced in smaller models, where limited capacity makes strong reasoning even harder.<\/p>\n\n\n\n<p>The problem stems from how current language models operate. They rely on fast, pattern recognition-based responses that break down in complex scenarios. In contrast, people use deliberate, step-by-step reasoning, test different approaches, and evaluate outcomes. To address this gap, we\u2019re building methods to enable stronger reasoning in smaller systems.<\/p>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/articles\/li-zhang-rstar-math\/\" target=\"_blank\" rel=\"noreferrer noopener\">rStar-Math<\/a> is a method that uses Monte Carlo Tree Search (MCTS) to simulate deeper, more methodical reasoning in smaller models. 
It uses a three-step, self-improving cycle:&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Problem decomposition<\/strong> breaks down complex mathematical problems into manageable steps, creating a thorough and accurate course of reasoning.<\/li>\n\n\n\n<li><strong>Process preference model (PPM)<\/strong> trains small models to predict reward labels for each step, improving process-level supervision.<\/li>\n\n\n\n<li><strong>Iterative refinement<\/strong> applies a four-round, self-improvement cycle in which updated strategy models and PPMs guide MCTS to improve performance.&nbsp;<\/li>\n<\/ul>\n\n\n\n<p>When tested on four small language models ranging from 1.5 billion to 7 billion parameters, rStar-Math achieved an average accuracy of 53% on the American Invitational Mathematics Examination (AIME)\u2014performance that places it among the top 20% of high school competitors in the US.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1049\" height=\"322\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-1.png\" alt=\"Figure 1: A three-part diagram illustrating the rStar-Math framework. (a) Shows an MCTS-driven reasoning tree with Q-values and answer verification using PPM or Python; correct and incorrect steps are marked. (b) Depicts how Q-value filtering constructs per-step preference pairs from partial to full solutions. 
(c) Outlines four rounds of self-evolution, alternating between SLM and PPM improvements using terminal-guided and PPM-augmented MCTS.\" class=\"wp-image-1141894\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-1.png 1049w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-1-300x92.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-1-1024x314.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-1-768x236.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-1-240x74.png 240w\" sizes=\"auto, (max-width: 1049px) 100vw, 1049px\" \/><figcaption class=\"wp-element-caption\">Figure 1. The rStar-Math framework <\/figcaption><\/figure>\n\n\n\n<p>Logic-RL is a reinforcement learning framework that strengthens logical reasoning through a practical system prompt and a structured reward function. By training models on logic puzzles, Logic-RL grants rewards only when both the reasoning process and the final answer meet strict formatting requirements. This prevents shortcuts and promotes analytical rigor.<\/p>\n\n\n\n<p>Language models trained with Logic-RL demonstrate strong performance beyond logic puzzles, generalizing effectively to mathematical competition problems. On the AIME and AMC (American Mathematics Competitions) datasets, 7-billion-parameter models improved accuracy by 125% and 38%, respectively, compared with baseline models.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"building-reliable-mathematical-reasoning\">Building reliable mathematical reasoning&nbsp;<\/h2>\n\n\n\n<p>Mathematics poses a unique challenge for language models, which often struggle to meet its precision and rigor using natural language. 
To address this, we\u2019re creating formal and symbolic methods to enable language models to adopt structured mathematical tools. The goal is to convert language model outputs into code based on the fundamental rules of arithmetic, like 1 + 1 = 2, allowing us to systematically verify accuracy.&nbsp;<\/p>\n\n\n\n<p><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/proving-olympiad-inequalities-by-synergizing-llms-and-symbolic-reasoning\/\" target=\"_blank\" rel=\"noreferrer noopener\">LIPS<\/a> (LLM-based Inequality Prover with Symbolic Reasoning) is a system that combines LLMs\u2019 pattern recognition capabilities with symbolic reasoning. LIPS draws on the strategies math competition participants use to distinguish between tasks best suited to symbolic solvers (e.g., scaling) and those better handled by language models (e.g., rewriting). On 161 Olympiad-level problems, LIPS achieved state-of-the-art results without additional training data.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1295\" height=\"426\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-2.png\" alt=\"Figure 2: A three-part diagram showing the LIPS framework for inequality proof generation. On the left, a current inequality problem is transformed into new inequality subproblems via tactic generation using symbolic-based and LLM-generated rewriting methods. In the center, these new goals are filtered and ranked using LLM and symbolic methods. 
On the right, a ranked sequence of inequalities forms a complete proof, applying named tactics like Cauchy-Schwarz, AM-GM, and LLM simplification, ending with the original inequality verified.\" class=\"wp-image-1141898\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-2.png 1295w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-2-300x99.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-2-1024x337.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-2-768x253.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-2-240x79.png 240w\" sizes=\"auto, (max-width: 1295px) 100vw, 1295px\" \/><figcaption class=\"wp-element-caption\">Figure 2. An overview of LIPS<\/figcaption><\/figure>\n\n\n\n<p>However, translating natural-language math problems into precise, machine-readable formats is a challenge. Our goal is to bridge the gap between the one-pass success rate, where the top-ranked generated result is correct, and the k-pass success rate, where at least one of the top <em>k<\/em> generated results is correct.<\/p>\n\n\n\n<p>We developed a <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/autoformalizing-mathematical-statements-by-symbolic-equivalence-and-semantic-consistency\/\">new framework<\/a> using two evaluation methods. 
<strong>Symbolic equivalence<\/strong> checks whether outputs are logically identical, while <strong>semantic consistency<\/strong> uses embedding similarity to detect subtle differences missed by symbolic checks.<\/p>\n\n\n\n<p>When we evaluated this approach on the MATH and miniF2F datasets, which include problems from various math competitions, it improved accuracy by up to 1.35 times over baseline methods.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1271\" height=\"377\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-3.png\" alt=\"Figure 3: A flowchart illustrating the autoformalization framework. On the left, a natural language math statement is converted into a formal language theorem via an \"Autoformalize\" step. In the center, formal statements undergo symbolic equivalence checks, while informalized versions are evaluated for semantic consistency. Arrows represent symbolic and semantic equivalence, informalization, and scoring. On the right, validated formal statements are output, demonstrating multiple logically equivalent formulations. 
A legend explains arrow types for formalization, equivalence, and output scoring.\" class=\"wp-image-1141897\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-3.png 1271w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-3-300x89.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-3-1024x304.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-3-768x228.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-3-240x71.png 240w\" sizes=\"auto, (max-width: 1271px) 100vw, 1271px\" \/><figcaption class=\"wp-element-caption\">Figure 3. An overview of the auto-formalization framework<\/figcaption><\/figure>\n\n\n\n<p>To address the shortage of high-quality training data, we developed a <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/neuro-symbolic-data-generation-for-math-reasoning\/\" target=\"_blank\" rel=\"noreferrer noopener\">neuro-symbolic framework<\/a> that automatically generates diverse, well-structured math problems. Symbolic solvers create the problems, while language models translate them into natural language. This approach not only broadens training resources but also supports more effective instruction and evaluation of mathematical reasoning in language models.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1255\" height=\"404\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-4.png\" alt=\"Figure 4: A flowchart illustrating the neuro-symbolic data generation framework. It begins with a natural language math problem about a sandbox's perimeter. 
This is formalized into symbolic assertions, then mutated while preserving structure. The formal problem is solved and informalized into a new natural language Q&A about a garden's dimensions. The process continues with further mutation to generate problems of varying difficulty\u2014examples include an easy question about a rectangle\u2019s width and a medium one involving expressions for area.\" class=\"wp-image-1142036\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-4.png 1255w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-4-300x97.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-4-1024x330.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-4-768x247.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-4-240x77.png 240w\" sizes=\"auto, (max-width: 1255px) 100vw, 1255px\" \/><figcaption class=\"wp-element-caption\">Figure 4. An overview of the neuro-symbolic data generation framework <\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"boosting-generalization-across-domains\">Boosting generalization across domains&nbsp;<\/h2>\n\n\n\n<p>A key indicator of advanced AI is its ability to generalize\u2014the ability to transfer reasoning skills across different domains. 
We found that training language models on math data significantly improved performance in coding, science, and other areas, revealing unexpected cross-domain benefits.&nbsp;<\/p>\n\n\n\n<p>This discovery motivated us to develop <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/chain-of-reasoning-towards-unified-mathematical-reasoning-in-large-language-models-via-a-multi-paradigm-perspective\/\">Chain-of-Reasoning<\/a> (CoR), an approach that unifies reasoning across natural language, code, and symbolic forms. CoR lets models blend these formats using natural language to frame context, code for precise calculations, and symbolic representations for abstraction. By adjusting prompts, CoR adapts both reasoning depth and paradigm diversity to match specific problem requirements.&nbsp;<\/p>\n\n\n\n<p>Tests of CoR across five math datasets showed its ability to tackle both computational and proof-based problems, demonstrating strong general mathematical problem-solving skills.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1183\" height=\"493\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-5.png\" alt=\"Figure 5: Diagram illustrating three reasoning paradigms: (a) Single-paradigm reasoning, where all reasoning steps use the same medium (e.g., natural language, algorithms, or symbols); (b) Tool-integrated single-paradigm reasoning, where natural language drives reasoning, but code is used to solve specific sub-problems, with results reintegrated into the language-based reasoning; (c) CoR (multi-paradigm) reasoning framework, which enables reasoning across different paradigms with varying depths to handle diverse problem types, supported by examples.\" class=\"wp-image-1141896\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-5.png 1183w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-5-300x125.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-5-1024x427.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-5-768x320.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-5-240x100.png 240w\" sizes=\"auto, (max-width: 1183px) 100vw, 1183px\" \/><figcaption class=\"wp-element-caption\">Figure 5. CoR\u2019s reasoning process under different types of methods<\/figcaption><\/figure>\n\n\n\n<p>Current language models often rely on domain-specific solutions, limiting their flexibility across different types of problems. To move beyond this constraint, we developed <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/articles\/cpl\/\" target=\"_blank\" rel=\"noreferrer noopener\">Critical Plan Step Learning<\/a> (CPL), an approach focused on high-level abstract planning that teaches models to identify key knowledge, break down problems, and make strategic decisions.&nbsp;<\/p>\n\n\n\n<p>The technique draws on how people solve problems, by breaking them down, identifying key information, and recalling relevant knowledge\u2014strategies we want language models to learn.&nbsp;<\/p>\n\n\n\n<p>CPL combines two key components: <strong>plan-based MCTS<\/strong>, which searches multi-step solution paths and constructs planning trees, and <strong>step-APO<\/strong>, which learns preferences for strong intermediate steps while filtering out weak ones. 
This combination enhances reasoning and improves generalization across tasks, moving AI systems closer to the flexible thinking that characterizes human intelligence.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1238\" height=\"585\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-6.png\" alt=\"Figure 6: Illustration of CPL. Left: Plans represent abstract thinking for problem-solving, which allows for better generalization, whereas task-specific solutions often limit it. Right: CPL searches within the action space on high-level abstract plans using MCTS and obtains advantage estimates for step-level preferences. CPL can then identify and learn critical steps that provide a distinct advantage over others.\" class=\"wp-image-1141895\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-6.png 1238w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-6-300x142.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-6-1024x484.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-6-768x363.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/enhancing-llm-reasoning-abilities-6-240x113.png 240w\" sizes=\"auto, (max-width: 1238px) 100vw, 1238px\" \/><figcaption class=\"wp-element-caption\">Figure 6. Overview of the CPL framework<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"looking-ahead-next-steps-in-ai-reasoning\">Looking ahead: Next steps in AI reasoning<\/h2>\n\n\n\n<p>From building reliable math solvers to unifying reasoning approaches, researchers are redefining how language models approach complex tasks. 
Their work sets the stage for more capable and versatile AI systems\u2014applicable to education, science, healthcare, and beyond. Despite these advances, hallucinations and imprecise logic continue to pose risks in critical fields like medicine and scientific research, where accuracy is essential.<\/p>\n\n\n\n<p>These challenges are driving the team\u2019s exploration of additional tools and frameworks to improve language model reasoning. This includes <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/autoverus-automated-proof-generation-for-rust-code\/\">AutoVerus<\/a> for automated proof generation in Rust code, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/automated-proof-generation-for-rust-code-via-self-evolution\/\">SAFE<\/a> for addressing data scarcity in Rust formal verification, and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/alchemy-amplifying-theorem-proving-capability-through-symbolic-mutation\/\">Alchemy<\/a>, which uses symbolic mutation to improve neural theorem proving.<\/p>\n\n\n\n<p>Together, these technologies represent important progress toward building trustworthy, high-performing reasoning models and signal a broader shift toward addressing some of AI&#8217;s current limitations.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>New techniques are reimagining how LLMs reason. 
By combining symbolic logic, mathematical rigor, and adaptive planning, these methods enable models to tackle complex, real-world problems across a variety of fields.<\/p>\n","protected":false},"author":43868,"featured_media":1142121,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Li Lyna Zhang","user_id":"38121"},{"type":"user_nicename","value":"Xian Zhang","user_id":"37869"},{"type":"user_nicename","value":"Xueting Han","user_id":"43900"},{"type":"user_nicename","value":"Dongdong Zhang","user_id":"31677"}],"msr_hide_image_in_river":null,"footnotes":""},"categories":[1],"tags":[],"research-area":[13556],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[269148,243984,269142],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1141870","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-post-option-approved-for-river","msr-post-option-blog-homepage-featured","msr-post-option-include-in-river"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199560],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Xian Zhang","user_id":37869,"display_name":"Xian Zhang","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/zhxian\/\" aria-label=\"Visit the profile page for Xian Zhang\">Xian Zhang<\/a>","is_active":false,"last_first":"Zhang, 
Xian","people_section":0,"alias":"zhxian"},{"type":"user_nicename","value":"Xueting Han","user_id":43900,"display_name":"Xueting Han","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/chrihan\/\" aria-label=\"Visit the profile page for Xueting Han\">Xueting Han<\/a>","is_active":false,"last_first":"Han, Xueting","people_section":0,"alias":"chrihan"},{"type":"user_nicename","value":"Dongdong Zhang","user_id":31677,"display_name":"Dongdong Zhang","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/dozhang\/\" aria-label=\"Visit the profile page for Dongdong Zhang\">Dongdong Zhang<\/a>","is_active":false,"last_first":"Zhang, Dongdong","people_section":0,"alias":"dozhang"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-960x540.jpg\" class=\"img-object-cover\" alt=\"The image shows a diagram illustrating the relationship between mathematical statements in natural language and formal language. On the left, there is a blue box labeled &quot;Mathematical statement in natural language.&quot; An arrow points from this box to a central section containing four smaller boxes arranged in two rows. 
The top row contains &quot;Formalization&quot; and &quot;Informalization,&quot; while the bottom row contains &quot;Symbolic Equivalence&quot; and &quot;Semantic Consistency.&quot; An arrow points from this central section to a purple box on the right labeled &quot;Mathematical statement in formal language.&quot; The background of the image transitions from blue on the left to purple on the right.\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/06\/NewMethods-BlogHeroFeature-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"Li Lyna Zhang, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/zhxian\/\" title=\"Go to researcher profile for Xian Zhang\" aria-label=\"Go to 
researcher profile for Xian Zhang\" data-bi-type=\"byline author\" data-bi-cN=\"Xian Zhang\">Xian Zhang<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/chrihan\/\" title=\"Go to researcher profile for Xueting Han\" aria-label=\"Go to researcher profile for Xueting Han\" data-bi-type=\"byline author\" data-bi-cN=\"Xueting Han\">Xueting Han<\/a>, and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/dozhang\/\" title=\"Go to researcher profile for Dongdong Zhang\" aria-label=\"Go to researcher profile for Dongdong Zhang\" data-bi-type=\"byline author\" data-bi-cN=\"Dongdong Zhang\">Dongdong Zhang<\/a>","formattedDate":"June 17, 2025","formattedExcerpt":"New techniques are reimagining how LLMs reason. By combining symbolic logic, mathematical rigor, and adaptive planning, these methods enable models to tackle complex, real-world problems across a variety of fields.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1141870","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/43868"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1141870"}],"version-history":[{"count":37,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1141870\/revisions"}],"predecessor-version":[{"id":1142250,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1141870\/revisions\/1142250"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1142121"}],"wp:attachment":[{"href":"https:\/\
/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1141870"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1141870"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1141870"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1141870"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1141870"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1141870"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1141870"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1141870"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1141870"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1141870"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1141870"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}