{"id":1149210,"date":"2025-09-11T09:00:00","date_gmt":"2025-09-11T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1149210"},"modified":"2025-11-26T14:37:59","modified_gmt":"2025-11-26T22:37:59","slug":"tool-space-interference-in-the-mcp-era-designing-for-agent-compatibility-at-scale","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/tool-space-interference-in-the-mcp-era-designing-for-agent-compatibility-at-scale\/","title":{"rendered":"Tool-space interference in the MCP era: Designing for agent compatibility at scale"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-1024x576.jpg\" alt=\"Three white icons on a gradient background transitioning from blue to purple to pink. From left to right: a globe with a magnifying glass representing internet search, a central circle connected to smaller circles symbolizing network connectivity, and a checklist with two checkmarks and one empty box indicating task management.\" class=\"wp-image-1149369\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-655x368.jpg 655w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>This year&nbsp;we\u2019ve&nbsp;seen&nbsp;remarkable&nbsp;advances in agentic AI, including&nbsp;systems that conduct deep research,&nbsp;operate&nbsp;computers, complete substantial software engineering tasks, and tackle a range of other complex,&nbsp;multi-step goals. In each case,&nbsp;the industry relied&nbsp;on careful vertical integration: tools and agents were co-designed, co-trained, and tested together&nbsp;for peak&nbsp;performance. For example,&nbsp;OpenAI&#8217;s&nbsp;recent models&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/openai\/gpt-oss?tab=readme-ov-file#tools\" target=\"_blank\" rel=\"noopener noreferrer\">presume&nbsp;the&nbsp;availability&nbsp;of web search and document retrieval&nbsp;tools<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. 
Likewise,&nbsp;the prompts and actions&nbsp;of&nbsp;<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/articles\/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks\/\" target=\"_blank\" rel=\"noreferrer noopener\">Magentic-One<\/a>&nbsp;are&nbsp;set up to make hand-offs easy, for example by allowing the WebSurfer agent to pass downloaded files to the Coder agent.&nbsp;But as agents proliferate, we anticipate that strategies relying heavily on vertical integration will not age well.&nbsp;Agents&nbsp;from&nbsp;different&nbsp;developers&nbsp;or companies will&nbsp;increasingly&nbsp;encounter&nbsp;each other and&nbsp;must&nbsp;work together to complete tasks, in what we refer to as a&nbsp;<em>society of agents<\/em>.&nbsp;These systems can vary in how coordinated they are, how aligned their goals are, and how much information they share. Can heterogeneous agents and tools cooperate&nbsp;in this&nbsp;setting, or will they hinder one another and slow progress?<\/p>\n\n\n\n<p>Early clues have&nbsp;emerged&nbsp;from an&nbsp;unexpected&nbsp;source:&nbsp;the&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/modelcontextprotocol.io\/\" target=\"_blank\" rel=\"noopener noreferrer\">Model Context Protocol<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;(MCP). 
Since January 2025, MCP has grown from a&nbsp;promising spec to a&nbsp;thriving&nbsp;market&nbsp;of&nbsp;tool&nbsp;servers.&nbsp;As an example, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/docs.zapier.com\/mcp\/home\" target=\"_blank\" rel=\"noopener noreferrer\">Zapier boasts a catalog of 30,000 tools<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;across 7,000 services.&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/mcp.composio.dev\/\" target=\"_blank\" rel=\"noopener noreferrer\">Composio&nbsp;provides over 100 managed MCP servers<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, surfacing hundreds of tools. Hugging&nbsp;Face is now serving&nbsp;many&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/huggingface.co\/spaces?filter=mcp-server\" target=\"_blank\" rel=\"noopener noreferrer\">Spaces&nbsp;apps over MCP<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/shopify.dev\/docs\/apps\/build\/storefront-mcp\/servers\/storefront\" target=\"_blank\" rel=\"noopener noreferrer\">Shopify has enabled MCP for millions of storefronts<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.&nbsp;A&nbsp;society of&nbsp;<em>tools<\/em>&nbsp;is already here, and it promises to&nbsp;extend&nbsp;agent capabilities through&nbsp;cross-provider&nbsp;horizontal integration.&nbsp;<\/p>\n\n\n\n<p>So,&nbsp;what does MCP have to say about&nbsp;horizontal integration? 
As catalogs grow,&nbsp;we expect some new failure modes to surface.&nbsp;This&nbsp;blog&nbsp;post introduces&nbsp;these&nbsp;as\u202f<em>tool-space interference<\/em>, and sketches both early observations\u202fand some pragmatic interventions to keep the society&nbsp;we\u2019re&nbsp;building&nbsp;from stepping on its own feet.&nbsp;<\/p>\n\n\n\n<p>Tool-space interference describes situations where otherwise reasonable tools or agents, when co-present, reduce end-to-end effectiveness. This can look like longer action sequences, higher token cost, brittle recovery from errors, or, in some cases, task failure.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"a-framing-example\">A framing example<\/h2>\n\n\n\n<p>Consider MCP as a means for extending <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/magentic-one-a-generalist-multi-agent-system-for-solving-complex-tasks\/\">Magentic-One<\/a>, a generalist multi-agent system we released last year, to cover more software engineering tasks. Magentic-One ships with agents to write code, interact with the computer terminal, browse the web, and access local files. To help Magentic-One navigate version control, find issues to solve, and make pull requests, we could add an agent equipped with the GitHub MCP Server. However, now each time the team encounters a task involving GitHub, it must choose whether to visit github.com in the browser, execute a git command at the command line, or engage the GitHub MCP server. 
As the task progresses, the agents\u2019 understanding of state can also diverge: changing the branch in the browser won\u2019t change the branch in the terminal, and an authorized MCP tool does not imply authorization in the browser.&nbsp;Thus, while any single agent might complete the task efficiently, the larger set of agents might misunderstand or interfere with one another, leading to additional rounds of debugging, or even complete task failure.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1021\" height=\"410\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image.png\" alt=\"Diagram depicting Magentic-One's multi-agent architecture. An Orchestrator agent has access to 4 specialized sub-agents: a Coder agent that can write code and reason to solve tasks, a Computer Terminal agent that can execute code written by the Coder agent, a WebSurfer agent that can browse the internet (navigate pages, fill forms, etc.), and a FileSurfer agent that can navigate files (e.g. PDFs, PPTx, etc.). The diagram is annotated to show that for any incoming git-related task, the Orchestrator agent has to decide at every orchestration step whether to access the Git CLI via ComputerTerminal, visit the GitHub site via WebSurfer, or directly access GitHub\u2019s MCP server.\" class=\"wp-image-1149211\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image.png 1021w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-300x120.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-768x308.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-240x96.png 240w\" sizes=\"auto, (max-width: 1021px) 100vw, 1021px\" \/><figcaption class=\"wp-element-caption\">Figure 1: We can extend&nbsp;Magentic-One by adding an agent equipped with the GitHub MCP server. 
However, on every turn involving a git-related task, the orchestrator will need to decide between messaging the Computer Terminal agent (with access to the git command line interface), WebSurfer agent (with access to github.com), and the agent with the GitHub MCP server. This overlap raises the possibility that they will interfere with one another.&nbsp;&nbsp;<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"tool-space-interference-through-the-lens-of-mcp\">Tool-space interference, through the lens of MCP<\/h2>\n\n\n\n<p>To better understand the potential interference patterns and the current state of the MCP ecosystem, we conducted a survey of MCP servers listed on two registries: <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/smithery.ai\/\">smithery.ai<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/hub.docker.com\/mcp\">Docker MCP Hub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Smithery is an MCP Server registry with over 7,000 first-party and community-contributed servers, which we sampled from the Smithery API. Likewise, Docker MCP Hub is a registry that distributes MCP servers as Docker images, and we manually collected popular entries. We then launched each server for inspection. 
After excluding servers that were empty or failed to launch, and deduplicating servers with identical features, 1,470 servers remained in our catalog.<\/p>\n\n\n\n<p>To&nbsp;automate the&nbsp;inspection&nbsp;of&nbsp;running MCP servers,&nbsp;we developed an&nbsp;MCP&nbsp;Interviewer&nbsp;tool.&nbsp;The MCP&nbsp;Interviewer&nbsp;begins by cataloging the server\u2019s tools, prompts, resources, resource templates, and capabilities.&nbsp;From&nbsp;this catalog we can compute&nbsp;descriptive statistics&nbsp;such as the number of tools, or the depth of the parameter&nbsp;schemas.&nbsp;&nbsp;Then, given the list of available tools, the interviewer uses&nbsp;an LLM (in our case,&nbsp;OpenAI&#8217;s GPT-4.1)&nbsp;to construct a functional testing&nbsp;plan&nbsp;that&nbsp;calls each tool at least once, collecting outputs, errors, and statistics along the way. Finally,&nbsp;the&nbsp;interviewer&nbsp;can&nbsp;also&nbsp;grade&nbsp;more qualitative&nbsp;criteria&nbsp;by&nbsp;using&nbsp;an LLM&nbsp;to&nbsp;apply purpose-built rubrics&nbsp;to&nbsp;tool&nbsp;schemas&nbsp;and&nbsp;tool call outputs.&nbsp;&nbsp;We are excited to&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/mcp-interviewer\" target=\"_blank\" rel=\"noopener noreferrer\">release the MCP Interviewer&nbsp;as an open-source CLI&nbsp;tool<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, so server developers can automatically evaluate their MCP servers with agent usability in mind,&nbsp;and users can&nbsp;validate&nbsp;new servers.&nbsp;<\/p>\n\n\n\n<p>While our survey provides informative initial results, it also faces significant limitations, the most obvious of which is authorization: many of the most popular MCP servers provide access to services that require authorization to use, hindering automated analysis. 
We are often still able to collect static features from these servers but are limited in the functional testing that can be done.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"one-size-fits-all-but-some-more-than-others\">One size fits all (but some more than others)<\/h3>\n\n\n\n<p>So, what does our survey of MCP servers tell us about the MCP ecosystem? We will get into the numbers in a moment, but as we contemplate the statistics, there is one overarching theme to keep in mind: MCP servers do not know which clients or models they are working with, and present one common set of tools, prompts, and resources to everyone. However, some models handle long contexts and large tool spaces better than others (with diverging hard limits), and respond quite differently to common prompting patterns. For example, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/platform.openai.com\/docs\/guides\/function-calling#best-practices-for-defining-functions\">OpenAI\u2019s guide on function calling<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> advises developers to:<\/p>\n\n\n\n<p>\u201c<em>Include examples and edge cases, especially to rectify any recurring failures. (Note: Adding examples may hurt performance for reasoning models).<\/em>\u201d<\/p>\n\n\n\n<p>Advice that helps one class of models can therefore hurt another, which already places MCP at a disadvantage relative to vertical integrations that optimize for the operating environment. And with that, let\u2019s dive into more numbers.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"tool-count\">Tool count<\/h3>\n\n\n\n<p>While models vary in their proficiency at tool calling, the general trend has been that performance drops as the number of tools increases. 
For example, OpenAI limits developers to 128 tools, but <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/platform.openai.com\/docs\/guides\/function-calling#best-practices-for-defining-functions\">recommends<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> that developers:<\/p>\n\n\n\n<p>\u201c<em>Keep the number of functions small for higher accuracy. Evaluate your performance with different numbers of functions. Aim for fewer than 20 functions at any one time, though this is just a soft suggestion.<\/em>\u201d<\/p>\n\n\n\n<p>While we expect this to improve with each new model generation, at present, large tool spaces can <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2505.10570v1\">lower performance by up to 85% for some models<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. Thankfully, the majority of servers in our survey contain four or fewer tools. But there are outliers: the largest MCP server we cataloged adds 256 distinct tools, while the 10 next-largest servers add more than 100 tools each. 
Further down the list we find popular servers like <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/microsoft\/playwright-mcp\">Playwright-MCP<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> (29 tools, at the time of this writing), and GitHub MCP (91 tools, with subsets available at alternative endpoint URLs), which might be too large for some models.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"1024\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-1024x1024.png\" alt=\"chart\" class=\"wp-image-1149361\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-1024x1024.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-300x300.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-150x150.png 150w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-768x768.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-1536x1536.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-2048x2048.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-180x180.png 180w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/tool-counts-per-server-360x360.png 360w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Figure 2: The number of tools listed by each catalogued server directly after initialization. 
Note: servers can change the tools they list at any time, but only 226 servers in our catalog declare this capability.<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"response-length\">Response length<\/h3>\n\n\n\n<p>Tools are generally called in agentic loops, where the output is then fed back into the model as input context. Models have hard limits on input context, but even within these limits, large contexts can drive costs up and performance down, so <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/research.trychroma.com\/context-rot\">practical limits can be much lower<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. MCP offers no guidance on how many tokens a tool call can produce, and the size of some responses can come as a surprise. In our analysis, we consider the 2,443 tool calls across 1,312 unique tools that the MCP Interviewer was able to call successfully during the active testing phase of server inspection. While a majority of tools produced 98 or fewer tokens <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/openai\/tiktoken\"><span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, some tools are extraordinarily heavyweight: the top tool returned an average of 557,766 tokens, which is enough to swamp the context windows of many popular models like GPT-5. Further down the list, we find that 16 tools produce more than 128,000 tokens, swamping GPT-4o and other popular models. 
Even when responses fit into the context window length, overly long responses can significantly degrade performance (<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/arxiv.org\/abs\/2505.10570v1\">up to 91% in one study<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>), and limit the number of future calls that can be made. Of course, agents are free to implement their own context management strategies, but this behavior is left undefined in the MCP specification and server developers cannot count on any particular client behavior or strategy.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><\/td><td><\/td><td colspan=\"4\"><strong># of tools that would overflow context in<\/strong><\/td><\/tr><tr><td><strong>Model<\/strong><\/td><td><strong>Context Window<\/strong><\/td><td><strong>1 call<\/strong><\/td><td><strong>2 calls<\/strong><\/td><td><strong>3-5 calls<\/strong><\/td><td><strong>6-10 calls<\/strong><\/td><\/tr><tr><td>GPT-4.1<\/td><td>1,000,000<\/td><td>0<\/td><td>1<\/td><td>7<\/td><td>11<\/td><\/tr><tr><td>GPT-5<\/td><td>400,000<\/td><td>1<\/td><td>7<\/td><td>15<\/td><td>25<\/td><\/tr><tr><td>GPT-4o, Llama 3.1<\/td><td>128,000<\/td><td>16<\/td><td>15<\/td><td>33<\/td><td>40<\/td><\/tr><tr><td>Qwen 3<\/td><td>32,000<\/td><td>56<\/td><td>37<\/td><td>86<\/td><td>90<\/td><\/tr><tr><td>Phi-4<\/td><td>16,000<\/td><td>93<\/td><td>60<\/td><td>116<\/td><td>109<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"936\" height=\"935\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-1.png\" alt=\"Chart showing the average tool call output lengths (in tokens) for 1,312 tools, as observed by the MCP Interviewer\u2019s functional test plan. 
The x-axis represents individual tools (sorted by index), and the y-axis displays the average output length on a logarithmic scale. Horizontal dashed lines indicate context window limits for GPT-4o (128k tokens) and GPT-5 (400k tokens). A pink annotation box summarizes statistics: total tools (1,312), mean (4,431 tokens), median (98 tokens), minimum (0 tokens), and maximum (557,766 tokens).\" class=\"wp-image-1149213\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-1.png 936w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-1-300x300.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-1-150x150.png 150w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-1-768x767.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-1-180x180.png 180w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/image-1-360x360.png 360w\" sizes=\"auto, (max-width: 936px) 100vw, 936px\" \/><figcaption class=\"wp-element-caption\">Figure 3: Tool call response length averages, in tokens, as&nbsp;observed&nbsp;by the MCP Interviewer\u2019s functional test plan. Only successful tool calls are considered. 
Horizontal lines&nbsp;indicate&nbsp;context window limits for GPT-4o and GPT-5.<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"tool-parameter-complexity\">Tool parameter complexity<\/h3>\n\n\n\n<p>Mirroring the challenges from increasing&nbsp;the&nbsp;number of tools,&nbsp;increasing the complexity of a tool\u2019s parameter space can also lead to degradation.&nbsp;For example, while MCP tools can take complex object types and structures as parameters,&nbsp;<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/composio.dev\/blog\/gpt-4-function-calling-example\" target=\"_blank\" rel=\"noopener noreferrer\">Composio<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>&nbsp;found that&nbsp;flattening the parameter space could improve tool-calling performance&nbsp;by 47%&nbsp;compared to baseline performance.&nbsp;In our analysis, we&nbsp;find&nbsp;numerous examples of deeply nested structures, in&nbsp;one&nbsp;case going&nbsp;20&nbsp;levels deep.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"2560\" height=\"2560\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-scaled.png\" alt=\"Chart showing the maximum depth of each tool\u2019s input properties schema. The x-axis represents individual tools (sorted by index), and the y-axis shows the maximum property schema depth. Most tools have a depth of 2 (named and annotated properties). A pink annotation box summarizes statistics: total tools (12,643), mean (2.24), median (2.00), standard deviation (1.38), minimum (0.00), and maximum (20.00). 
\" class=\"wp-image-1149365\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-scaled.png 2560w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-300x300.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-1024x1024.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-150x150.png 150w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-768x768.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-1536x1536.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-2048x2048.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-180x180.png 180w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/input_schema_depth-360x360.png 360w\" sizes=\"auto, (max-width: 2560px) 100vw, 2560px\" \/><figcaption class=\"wp-element-caption\">Figure 4: The maximum depth of each tool\u2019s input properties schema. A depth of 0&nbsp;indicates&nbsp;a tool with no properties. A depth of 1&nbsp;indicates&nbsp;a tool with named properties but no annotations (e.g., no description or type). 
A depth of 2&nbsp;indicates&nbsp;a tool with named and annotated properties.&nbsp;&nbsp;A depth of 3+&nbsp;indicates&nbsp;a tool with structured properties that have&nbsp;additional&nbsp;nested annotations.&nbsp;<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"namespacing-issues-and-naming-ambiguity\">Namespacing issues and naming ambiguity<\/h3>\n\n\n\n<p>Another often-cited issue with the current MCP specification is the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/modelcontextprotocol\/modelcontextprotocol\/discussions\/128\">lack of a formal namespace mechanism<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. If two servers are registered to the same agent or application, and the servers have tool names in common, then disambiguation becomes impossible. Libraries like the OpenAI Agents SDK <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/github.com\/openai\/openai-agents-python\/issues\/464\">raise an error<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> under this circumstance. Clients, like Claude Code, prefix tool names with unique identifiers to work around this issue. In our analysis of MCP servers, we found name collisions between 775 tools. The most common collision was \u201csearch\u201d, which appears across 32 distinct MCP servers. 
The following table lists the top 10 collisions.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Tool Name<\/strong><\/td><td><strong>Number of Instances<\/strong><\/td><\/tr><tr><td><strong>search<\/strong><\/td><td>32<\/td><\/tr><tr><td><strong>get_user<\/strong><\/td><td>11<\/td><\/tr><tr><td><strong>execute_query<\/strong><\/td><td>11<\/td><\/tr><tr><td><strong>list_tables<\/strong><\/td><td>10<\/td><\/tr><tr><td><strong>update_task<\/strong><\/td><td>9<\/td><\/tr><tr><td><strong>generate_image<\/strong><\/td><td>9<\/td><\/tr><tr><td><strong>send_message<\/strong><\/td><td>9<\/td><\/tr><tr><td><strong>execute_command<\/strong><\/td><td>8<\/td><\/tr><tr><td><strong>list_tasks<\/strong><\/td><td>8<\/td><\/tr><tr><td><strong>search_files<\/strong><\/td><td>8<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Even when names are unique, they can be semantically similar. If these tools behave similarly, then the redundancy may not be immediately problematic, but if you are expecting to call a particular tool then the name similarities raise the potential for confusion. 
The following table lists some examples of semantically similar tool names relating to web search:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>websearch<\/td><td>brave_web_search<\/td><\/tr><tr><td>search-web<\/td><td>tavily_web_search<\/td><\/tr><tr><td>web_search<\/td><td>google_news_search<\/td><\/tr><tr><td>search_web<\/td><td>google-play-search<\/td><\/tr><tr><td>search_webkr<\/td><td>google_search_parsed<\/td><\/tr><tr><td>google_search<\/td><td>search_google_images<\/td><\/tr><tr><td>search_google<\/td><td>get_webset_search_exa<\/td><\/tr><tr><td>ai_web_search<\/td><td>search_google_scholar<\/td><\/tr><tr><td>web_search_exa<\/td><td>duckduckgo_web_search<\/td><\/tr><tr><td>search_web_tool<\/td><td>google_search_scraper<\/td><\/tr><tr><td>web_search_agent<\/td><td>answer_query_websearch<\/td><\/tr><tr><td>batch-web-search<\/td><td>&nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"errors-and-error-messages\">Errors and error messages<\/h3>\n\n\n\n<p>Like any software, MCP servers will occasionally encounter error conditions. In these cases, it is important to provide sufficient information for the agent to handle the error and plan next steps. In our analysis, we found this was not always the case. While MCP provides an \u201cisError\u201d flag to signal errors, we found that it was common for servers to handle errors by returning strings while leaving this flag set to false, signaling a normal exit. Out of 5,983 tool call results with no error flag, GPT-4.1 judged that 3,536 indicated errors in their content. More worrisome: the error messages were often of low quality. 
For instance, one tool providing web search capabilities failed with the string \u201cerror: job,\u201d while another tool providing academic search returned \u201cPlease retry with 0 or fewer IDs.\u201d<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"resource-sharing-conventions\">Resource sharing conventions<\/h3>\n\n\n\n<p>Finally, in addition to tools, MCP allows servers to share resources and resource templates with clients. In our survey, only 112 (7.6%) servers reported any resources, while 74 (5%) provided templates. One potential reason for low adoption is that the current MCP specification provides limited guidance for when resources are retrieved, or how they are incorporated into context. One clear-cut situation where a client might retrieve a resource is in response to a tool returning a <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/modelcontextprotocol.io\/specification\/2025-06-18\/server\/tools#resource-links\">resource_link<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> as a result, but only 4 tools exhibited this behavior in our survey (arguably, this would be the ideal behavior for tools that return very long, document-like responses, as outlined earlier).<\/p>\n\n\n\n<p>Conversely, a whole different set of issues arises when there is a need to share resources from the client to the server. Consider for example a tool that provides some analysis of a <em>local<\/em> PDF file. In the case of a local MCP server utilizing STDIO transport, a local file path can be provided as an argument to the tool, but no similar conventions exist for delivering a local file to a remote MCP server. These issues are challenging enough when implementing a single server. 
When multiple tools or servers need to interact within the same system, the risk of interoperability errors compounds.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"recommendations\">Recommendations<\/h2>\n\n\n\n<p>On balance, along any given dimension, the average MCP server is quite reasonable\u2014but, as we have seen, outliers and diverging assumptions can introduce trouble. While we expect many of these challenges to improve with time, we are comfortable making small recommendations that we feel are evergreen. We organize them below by audience.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"protocol-developers\">Protocol developers<\/h3>\n\n\n\n<p>We recognize the advantages of keeping MCP relatively lightweight, avoiding being overly prescriptive in an environment where AI models and use cases are rapidly changing. However, a few small recommendations are warranted. First, we believe MCP should be extended to include a specification for client-provided resources so that tools on remote servers have a mechanism for operating on specified local files or documents. This would more effectively position MCP as a clearinghouse for resources passed between steps of agentic workflows. 
The MCP specification would also benefit from taking a more opinionated stance on when resources should be retrieved and how they should be used.<\/p>\n\n\n\n<p>Likewise, we believe MCP should quickly move to provide formal namespaces to eliminate tool name collisions. If namespaces are hierarchical, then this also provides a way of organizing large catalogs of functions into thematically related tool sets. Tool sets, as an organizing principle, are already showing some promise in GitHub MCP Server\u2019s <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/github\/github-mcp-server?tab=readme-ov-file#dynamic-tool-discovery\" target=\"_blank\" rel=\"noopener noreferrer\">dynamic tool discovery<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and VS Code\u2019s <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/code.visualstudio.com\/updates\/v1_103#_tool-grouping-experimental\" target=\"_blank\" rel=\"noopener noreferrer\">tool grouping (with virtual tools)<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, where agents or users can enable and disable tools as needed. In the future, a standardized mechanism for grouping tools would allow <em>clients<\/em> to engage in hierarchical tool-calling, where they first select a category, then select a tool, without needing to keep all possible tools in context.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"server-developers\">Server developers<\/h3>\n\n\n\n<p>While our MCP Interviewer tool can catalog many outward-facing properties of MCP servers, developers are often in a much better position to characterize the nature of their tools. 
To this end, we believe developers should publish an MCP Server card alongside their servers or services, clearly outlining the runtime characteristics of the tools (e.g., the expected number of tokens generated, or the expected latency of a tool call). Ideally, developers should also indicate which models, agents, and clients the server was tested with, describe how the tools were tested (e.g., by providing sample tasks), list any known incompatibilities, and remain mindful of the limitations of various models throughout development.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"client-developers\">Client developers<\/h3>\n\n\n\n<p>Client developers have the opportunity to experiment with various mitigations or optimizations that might help the average MCP server work better for a given system or environment. For example, clients could cache tool schemas and use them as targets for prompt optimization or as an index for RAG-like tool selection. Along these lines, <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.anthropic.com\/engineering\/multi-agent-research-system\">Anthropic recently reported using a tool testing agent<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> to rewrite the prompts of defective MCP servers, reducing task completion time by 40%. Likewise, rather than waiting for the protocol to evolve, clients could take proactive steps to resolve name collisions (for example, by generating namespaces from server names) and could reduce token outputs by summarizing or paginating long tool results.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"market-developers\">Market developers<\/h3>\n\n\n\n<p>Finally, we see an opportunity for marketplaces to codify best practices, spot compatibility issues at a global level, and perhaps centralize the generation and serving of model- or agent-specific optimizations. 
Much as a package index like PyPI <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/packaging.python.org\/en\/latest\/specifications\/platform-compatibility-tags\/\">distributes Python wheels matched to a developer\u2019s operating system or processor<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, an MCP marketplace could serve tool schemas optimized for a developer\u2019s chosen LLM, agent, or client library. We are already seeing small steps in this direction, with registries like Smithery providing customized launch configurations to match users\u2019 clients.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"conclusion\">Conclusion<\/h2>\n\n\n\n<p>In summary, the MCP ecosystem offers significant value for AI agent development, despite some early growing pains. Drawing on insights from the <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/mcp-interviewer\" target=\"_blank\" rel=\"noopener noreferrer\">MCP Interviewer<span class=\"sr-only\"> (opens in new tab)<\/span><\/a> and our survey of live servers, we find that horizontal integration is expanding capability, yet it also exposes forms of tool-space interference that can erode end-to-end effectiveness. Anticipating rapid advances in model capability and growing architectural diversity, the recommendations provided here aim to ensure that protocol, server, client, and marketplace developers are well positioned to adapt and thrive. 
Key steps include implementing formal namespaces to&nbsp;eliminate&nbsp;collisions, enhancing protocol support for&nbsp;client provided&nbsp;resources, and encouraging transparent server documentation to foster interoperability and robust development practices across the ecosystem.&nbsp;<\/p>\n\n\n\n<p>By embracing these evergreen recommendations and proactively addressing compatibility, usability, and optimization issues, the AI agent community can create a more reliable, scalable, and efficient infrastructure that benefits both developers and end users. The future of MCP is bright, with ample opportunities for experimentation, standardization, and collective progress.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As agentic AI ushers in a new era marked by tool expansion, systems are converging, and complexity is rising. Microsoft Research explores the Model Context Protocol (MCP) as a new standard for agent collaboration across fragmented tool ecosystems.<\/p>\n","protected":false},"author":43518,"featured_media":1149369,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Adam Fourney","user_id":"30820"},{"type":"user_nicename","value":"Tyler Payne","user_id":"43967"},{"type":"user_nicename","value":"Maya Murad","user_id":"43879"},{"type":"user_nicename","value":"Saleema 
Amershi","user_id":"33505"}],"msr_hide_image_in_river":null,"footnotes":""},"categories":[1],"tags":[],"research-area":[13556],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[269148,243984,269142],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1149210","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-post-option-approved-for-river","msr-post-option-blog-homepage-featured","msr-post-option-include-in-river"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[992148],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Adam Fourney","user_id":30820,"display_name":"Adam Fourney","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/adamfo\/\" aria-label=\"Visit the profile page for Adam Fourney\">Adam Fourney<\/a>","is_active":false,"last_first":"Fourney, Adam","people_section":0,"alias":"adamfo"},{"type":"user_nicename","value":"Tyler Payne","user_id":43967,"display_name":"Tyler Payne","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tylerpayne\/\" aria-label=\"Visit the profile page for Tyler Payne\">Tyler Payne<\/a>","is_active":false,"last_first":"Payne, Tyler","people_section":0,"alias":"tylerpayne"},{"type":"user_nicename","value":"Maya Murad","user_id":43879,"display_name":"Maya Murad","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/mayamurad\/\" aria-label=\"Visit the profile page for Maya Murad\">Maya Murad<\/a>","is_active":false,"last_first":"Murad, Maya","people_section":0,"alias":"mayamurad"},{"type":"user_nicename","value":"Saleema 
Amershi","user_id":33505,"display_name":"Saleema Amershi","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/samershi\/\" aria-label=\"Visit the profile page for Saleema Amershi\">Saleema Amershi<\/a>","is_active":false,"last_first":"Amershi, Saleema","people_section":0,"alias":"samershi"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-960x540.jpg\" class=\"img-object-cover\" alt=\"Three white icons on a gradient background transitioning from blue to purple to pink. From left to right: a globe with a magnifying glass representing internet search, a central circle connected to smaller circles symbolizing network connectivity, and a checklist with two checkmarks and one empty box indicating task management.\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-240x135.jpg 240w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2025\/09\/ToolSpaceInterference-BlogHeroFeature-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/adamfo\/\" title=\"Go to researcher profile for Adam Fourney\" aria-label=\"Go to researcher profile for Adam Fourney\" data-bi-type=\"byline author\" data-bi-cN=\"Adam Fourney\">Adam Fourney<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/tylerpayne\/\" title=\"Go to researcher profile for Tyler Payne\" aria-label=\"Go to researcher profile for Tyler Payne\" data-bi-type=\"byline author\" data-bi-cN=\"Tyler Payne\">Tyler Payne<\/a>, <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/mayamurad\/\" title=\"Go to researcher profile for Maya Murad\" aria-label=\"Go to researcher profile for Maya Murad\" data-bi-type=\"byline author\" data-bi-cN=\"Maya Murad\">Maya Murad<\/a>, and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/samershi\/\" title=\"Go to researcher profile for Saleema Amershi\" aria-label=\"Go to researcher profile for Saleema Amershi\" data-bi-type=\"byline author\" data-bi-cN=\"Saleema Amershi\">Saleema Amershi<\/a>","formattedDate":"September 11, 2025","formattedExcerpt":"As agentic AI ushers in a new era marked by tool expansion, systems are converging, and complexity is rising. 
Microsoft Research explores the Model Context Protocol (MCP) as a new standard for agent collaboration across fragmented tool ecosystems.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1149210","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/43518"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1149210"}],"version-history":[{"count":22,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1149210\/revisions"}],"predecessor-version":[{"id":1149620,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/1149210\/revisions\/1149620"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1149369"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1149210"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1149210"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1149210"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1149210"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1149210"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-ev
ent-type?post=1149210"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1149210"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1149210"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1149210"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1149210"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1149210"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}