Protecting AI conversations at Microsoft with Model Context Protocol security and governance


We’re streamlining MCP governance through secure-by-default architecture, automation, and inventory to deliver a faster, safer agent development environment at Microsoft.

When we gave our Microsoft 365 Copilot agents a simple way to connect to tools and data with Model Context Protocol (MCP), the work spoke for itself.

Answers got sharper. Delivery sped up. New development patterns emerged across teams working with Copilot agents.

That ease of communication, however, comes with a responsibility: Protect the conversation.

Questions came up: Who’s allowed to speak? What can they say? And what should never leave the room?

Microsoft Digital, the company’s IT organization, and the Chief Information Security Officer (CISO) team, our internal security organization, are using those questions to shape our internal MCP strategy and tooling at Microsoft.

A photo of Kumar.

“With MCP, the problem is not the inherent design; it’s that every improper server implementation becomes a potential vulnerability. Even one misconfigured server can give the AI the keys to your data.”

Swetha Kumar, security assurance engineer, Microsoft CISO

Our approach is intentionally straightforward.

Start secure by default. Use trusted servers. Keep a living catalog so we always know which voices are in the room. Shape how agents communicate by requiring consent before making changes.

We minimize what’s shared outside our walls, watch for drift, and act when something looks off. Our goal is practical governance that lets builders move fast while keeping our data safe.

That’s the risk we design for, and it’s why our controls prioritize clear ownership, simple choices, and visible guardrails.

“With MCP, the problem is not the inherent design; it’s that every improper server implementation becomes a potential vulnerability,” says Swetha Kumar, a security assurance engineer in the Microsoft CISO organization. “Even one misconfigured server can give the AI the keys to your data.”

Understanding MCP and the need for security

MCP is a simple standard that lets AI systems “talk” to the right tools and data without custom integration work. Think of it like USB‑C for AI. Instead of building a new connection every time, teams plug into a common pattern. That standardization delivers speed and flexibility—but it also changes the security equation.
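
To make the common pattern concrete, here is a minimal sketch of what a tool declaration in this style looks like: a name, a description the model reads, and a schema for the tool’s inputs. The TypeScript shape and the server’s tool below are illustrative assumptions, not the full MCP specification.

    // Illustrative only: a simplified shape for an MCP-style tool declaration.
    // Field names are representative of the pattern, not the full specification.
    interface ToolDeclaration {
        name: string;        // what the agent calls
        description: string; // what the model reads to decide when to use the tool
        inputSchema: object; // JSON Schema describing the expected arguments
    }

    // A hypothetical server advertising one tool through the common pattern.
    const lookupCaseHistory: ToolDeclaration = {
        name: "lookup_case_history",
        description: "Returns the support case history for a given customer ID.",
        inputSchema: {
            type: "object",
            properties: { customerId: { type: "string" } },
            required: ["customerId"],
        },
    };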

Before MCP, every integration was its own isolated conversation.

“Now, one pattern can unlock many systems,” Kumar says. “It’s a win and a risk. When AI can reach more systems with less effort, we must be precise about who’s allowed to speak, what they can say, and how much gets shared.”

We frame this as communications security.

The question isn’t just, “Is this API secure?” It’s “Is this a conversation we trust?” We want to know which servers are in the room, what actions they’re permitted to take, and how we’ll notice if something changes. At the same time, we keep the cognitive load low for builders. They choose from trusted options, see clear prompts before an agent makes edits, and move on. Simple choices lead to safer outcomes.

“MCP enables granular control over the tools and resources exposed to the Large Language Model,” Kumar says. “But that means the developer is responsible for configuring it correctly—which tools an agent can see, what actions a server can take, and what context is shared.”

This approach helps both sides.

Product teams get a consistent way to extend their agents while security teams get consistent places to add guardrails—at discovery, access, and throughout the flow of requests and responses. Everyone operates from the same playbook.

When we treat MCP this way, we protect the conversation without slowing it down. We know who’s speaking. We know what they can do. And we can prove it.

Assessing MCP security across four layers

Every MCP session creates a conversation graph. An agent discovers a server, ingests its tool descriptions, adds credentials and context, and starts sending requests. Each step—metadata, identity, content, and code—introduces potential risk.

We evaluate those risks across four layers so we can catch failures early, contain blast radius, and keep conversations in bounds.

However, the big picture is just as important as the details.

“We take a holistic view of MCP security: start with the ecosystem, then specify controls across the four layers,” Kumar says. “The layers make the work concrete, but the goal stays the same—unified governance, shared education, and faster detect-and-mitigate when a server is at risk.”

Applications and agents layer

This is where user intent meets execution. Agents parse prompts, discover tools, select actions, and request changes. MCP clients live here, deciding which servers to trust and when to ask for user consent.

  • What can go wrong
    • Tool poisoning or shadowing. A server advertises safe‑looking actions but performs something else.
    • Silent swaps. A tool’s metadata changes and the client keeps trusting an altered “voice.”
    • No sandbox. The agent can request edits or run code without strong guardrails.
  • What we watch for
    • Unexpected tool descriptions or capabilities at connect time.
    • Edit attempts on critical resources without explicit user consent.
    • Abnormal tool‑selection patterns across sessions.

AI platform layer

The AI platform layer includes the AI models and runtimes that interpret prompts and call tools, along with orchestration logic and safety features.

  • What can go wrong
    • Model supply‑chain drift. Unvetted models, unsafe updates, or compromised fine‑tunes change behavior.
    • Prompt injection via tool text. Descriptions and responses steer the model toward unsafe actions.
  • What we watch for
    • Model provenance and update cadence tied to agent behavior changes.
    • Signals of jailbreaks or instruction overrides in prompts and intermediate messages.
    • Output drift linked to specific tools or servers.

Data layer

This layer covers business data, files, and secrets the conversation can touch.

  • What can go wrong
    • Context oversharing. Session data, files, or secrets get packed into the model’s context and leak to a third‑party server.
    • Over‑scoped credentials. Long‑lived tokens, broad scopes, or wrong audience claims enable lateral movement.
  • What we watch for
    • Size and sensitivity of context passed to tools.
    • Token hygiene, including short lifetimes, least‑privilege scopes, and correct audience claims.
    • Data egress patterns that don’t match a tool’s declared purpose.

Infrastructure layer

The infrastructure layer includes compute, network, and runtime environments.

  • What can go wrong
    • Local servers with too much reach. Excessive access to environment variables, file systems, or system processes.
    • Cloud endpoints without a gateway. No TLS enforcement, rate limiting, or centralized logging.
    • Open egress. Servers call out to the internet where they shouldn’t.
  • What we watch for
    • All remote MCP servers registered behind the API gateway.
    • Runtime signals, such as authentication failures, burst traffic, or unusual geographies.
    • Network policies that restrict outbound calls to certain targets.

Across all four layers, the throughline is AI communications security. We decide who can speak and verify what was said—and keep listening for change.

Establishing a secure-by-default strategy

We start by closing the front door. We recommend every remote MCP server sit behind our API gateway, giving us a single place to authenticate, authorize, rate‑limit, and log. There are no direct calls and no blind spots.

A photo of Enjeti

“Everything we do starts with securing the MCP server by default and that begins by registering it in API Center for easier discovery. We rely solely on vetted and attested MCP servers, ensuring every call comes from a trusted footprint.”

Prathiba Enjeti, principal PM manager, Microsoft CISO

Next, we decide who gets a voice.

Teams choose from a vetted list of MCP servers. If someone connects to an unapproved endpoint, they receive a friendly nudge and a clear path to register it. No shaming—just fast correction and a better inventory the next time around.

Identity comes next. Servers expect short‑lived, least‑privilege tokens with the right scopes and audience. Admin paths require strong authentication, and where possible, we use proof‑of‑possession to bind tokens to the client and reduce replay risk. Secrets don’t live in code, keys rotate, and audit trails are in place.
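
As a rough sketch of what those identity checks mean in practice, the TypeScript below validates lifetime, audience, and scope before a request is honored. The claim names, lifetime limit, and structure are assumptions for illustration, not our production token service.

    interface TokenClaims {
        aud: string;   // audience the token was minted for
        scp: string[]; // granted scopes
        iat: number;   // issued-at, seconds since epoch
        exp: number;   // expiry, seconds since epoch
    }

    const MAX_LIFETIME_SECONDS = 60 * 60; // assumed policy value, not our real limit

    function isTokenAcceptable(
        claims: TokenClaims,
        expectedAudience: string,
        allowedScopes: Set<string>,
    ): boolean {
        const now = Math.floor(Date.now() / 1000);
        if (claims.exp <= now) return false;                              // expired
        if (claims.exp - claims.iat > MAX_LIFETIME_SECONDS) return false; // not short-lived
        if (claims.aud !== expectedAudience) return false;                // wrong audience
        // Least privilege: every granted scope must be one the server actually allows.
        return claims.scp.every((scope) => allowedScopes.has(scope));
    }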

“Everything we do starts with making the MCP server secure by default and that begins by registering it in API Center for easier discovery,” says Prathiba Enjeti, a principal product manager in the Microsoft CISO organization. “We only use vetted and attested MCP servers. That’s how we keep the conversation safe without slowing it down.”

On the client side, we slow agents at the right moments. Agents can’t touch high‑risk tools without explicit consent. Tool descriptions are verified on connection and compared to approved contracts. If a tool’s “voice” drifts, we block the call.
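
One way to implement that verification is to pin a fingerprint of each tool’s metadata at approval time and compare it on every connection. The sketch below assumes a SHA‑256 hash over the advertised fields; it illustrates the pattern rather than our actual client code.

    // A minimal sketch, not our client code: pin a hash of each tool's advertised
    // metadata at approval time and compare it on every connect.
    import { createHash } from "node:crypto";

    interface AdvertisedTool {
        name: string;
        description: string;
        inputSchema: object;
    }

    function fingerprint(tool: AdvertisedTool): string {
        // A production version would canonicalize key order before hashing.
        const canonical = JSON.stringify({
            name: tool.name,
            description: tool.description,
            inputSchema: tool.inputSchema,
        });
        return createHash("sha256").update(canonical).digest("hex");
    }

    // approvedFingerprints would come from the registry entry created at vetting time.
    function hasDrifted(
        tool: AdvertisedTool,
        approvedFingerprints: Map<string, string>,
    ): boolean {
        const approved = approvedFingerprints.get(tool.name);
        return approved === undefined || approved !== fingerprint(tool);
    }
    // If hasDrifted() returns true, the client blocks the call and routes the
    // server for re-review instead of trusting the altered "voice."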

We also minimize what’s shared.

Context is trimmed to what the task requires. Sensitive data isn’t included by default, and third‑party servers get only what they need—not the whole transcript. Output filters and prompt shields sit alongside the model to prevent risky inputs from becoming risky actions.
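
A field allowlist is one simple way to express that trimming. The sketch below assumes each tool declares the fields it needs; the field names and values are hypothetical.

    // A minimal sketch, assuming a simple field-allowlist model: only the fields a
    // tool declares that it needs leave our boundary; everything else is dropped.
    type Context = Record<string, unknown>;

    function minimizeContext(fullContext: Context, fieldsNeededByTool: string[]): Context {
        const trimmed: Context = {};
        for (const field of fieldsNeededByTool) {
            if (field in fullContext) trimmed[field] = fullContext[field];
        }
        return trimmed; // the third-party server sees these fields, not the whole transcript
    }

    // Example with hypothetical data: a case-lookup tool only needs two fields.
    const outbound = minimizeContext(
        { caseId: "12345", productArea: "Copilot", transcript: "(full chat)", userEmail: "(user)" },
        ["caseId", "productArea"],
    );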

Isolation completes the design. Local servers run in containers with tight file and network permissions. Hosted servers allow only the outbound calls they need, and inbound traffic flows through the gateway, with TLS and logging enforced.

Simple rules with visible guardrails.

“We only use vetted MCP servers,” Enjeti says. “That’s how we keep the conversation safe without slowing it down.”

How we run MCP at scale: architecture, vetting, and inventory

We keep MCP safe by making three things intentionally boring: architecture, vetting, and inventory. One defined path. One vetting flow. One living catalog.

Architecture

We recommend remote MCP servers sit behind an API gateway, giving us a single place to authenticate, authorize, validate, rate‑limit, and log. Transport Layer Security (TLS) is required by default, and for sensitive endpoints, we can require mutual TLS. Outbound egress is pinned to approved destinations using private endpoints and firewall rules, so servers can’t “call anywhere.” Runtime protection continuously watches for credential abuse, injection patterns, burst traffic, and odd geographies.
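
In code terms, the egress pinning described above can be as simple as a host allowlist checked before any outbound call leaves the server. The destinations below are placeholders, not our firewall configuration.

    // Illustrative policy, not our firewall rules: outbound calls from a hosted
    // MCP server are allowed only to destinations approved at vetting time.
    const approvedEgress = new Set<string>([
        "internal-api.example.net", // hypothetical approved destination
        "graph.microsoft.com",
    ]);

    function isEgressAllowed(targetUrl: string): boolean {
        const host = new URL(targetUrl).hostname;
        return approvedEgress.has(host);
    }
    // Anything else is dropped and logged, so a compromised server can't "call anywhere."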

Identity is established up front. We issue short‑lived, least‑privilege tokens with the correct audience and scopes, and admin paths require strong authentication. Where supported, tokens are bound to the client to reduce replay risk. Services use managed identities or signed credentials; secrets don’t live in code, and keys rotate on schedule.

Model‑side safety travels with every conversation. Content safety and prompt shields help models ignore risky inputs, while orchestration enforces a per‑tool allowlist, so an agent can’t call tools that aren’t in policy—even if the model suggests it. We also track model versions, allowing behavior changes to be correlated with updates.

Clients enforce consent at the edge. “Ask before edits” is enabled by default for write, delete, and configuration changes. When an agent connects, it verifies tool descriptions against the approved contract.
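
Taken together, the previous two paragraphs amount to a small client-side pipeline: refuse tools that aren’t in policy, then ask before anything that writes, deletes, or reconfigures. The sketch below uses simplified, assumed shapes for calls and policy.

    // A minimal sketch of the two client-side gates, with assumed shapes: refuse
    // tools that aren't in policy, then ask before anything that has side effects.
    interface ToolCall {
        tool: string;
        action: "read" | "write" | "delete" | "configure";
    }

    // The allowlist would come from policy; these names are hypothetical.
    const allowedTools = new Set(["lookup_case_history", "update_case_notes"]);

    async function guardToolCall(
        call: ToolCall,
        askUser: (prompt: string) => Promise<boolean>,
    ): Promise<boolean> {
        // Gate 1: the orchestrator refuses tools outside policy, even if the model suggests them.
        if (!allowedTools.has(call.tool)) return false;
        // Gate 2: side-effecting actions require explicit consent before they run.
        if (call.action !== "read") {
            return askUser(`Allow ${call.tool} to perform a ${call.action}?`);
        }
        return true;
    }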

Observability ties it all together. We’re working toward logging tool calls, resource access, and authorization decisions end‑to‑end with correlation IDs. Detections flag abnormal tool selection, unexpected data egress, or edits without consent. Every server has an owner, a contract, and an approval record, and metadata changes automatically trigger re‑review. Kill switches live at both the client and the gateway when we need them.

Vetting

We don’t “connect and hope.”

Before any MCP server can speak in our environment, it earns trust. Owners declare what the server does (tools and actions), what it touches (data categories and exports), how callers authenticate (scopes and audience), and where it runs (runtime and on‑call ownership).

We start with static checks: manifests must match the contract, side‑effecting actions must be consent‑gated, tokens must be short‑lived and properly scoped. An SBOM (Software Bill of Materials) must be present, dependencies must be current, and no credentials can be embedded in code.
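
The sketch below shows what a static gate of that kind could look like, with assumed shapes for the contract and manifest and a deliberately crude credential pattern; it illustrates the checks rather than our vetting tooling.

    // A minimal sketch of a static gate, with assumed shapes for the contract and
    // manifest; it illustrates the checks rather than our vetting tooling.
    interface Contract {
        tools: string[];
        consentGatedActions: string[];
    }
    interface Manifest {
        tools: string[];
        sideEffectingActions: string[];
        sbomPresent: boolean;
    }

    function staticCheckFailures(contract: Contract, manifest: Manifest, sourceText: string): string[] {
        const failures: string[] = [];
        // The manifest must not advertise tools the contract never declared.
        for (const tool of manifest.tools) {
            if (!contract.tools.includes(tool)) failures.push(`undeclared tool: ${tool}`);
        }
        // Every side-effecting action must be consent-gated in the contract.
        for (const action of manifest.sideEffectingActions) {
            if (!contract.consentGatedActions.includes(action)) failures.push(`ungated action: ${action}`);
        }
        if (!manifest.sbomPresent) failures.push("missing SBOM");
        // A deliberately crude stand-in for the "no credentials in code" scan.
        if (/(api[_-]?key|secret)\s*[:=]\s*["'][^"']+["']/i.test(sourceText)) {
            failures.push("possible embedded credential");
        }
        return failures; // an empty array means the static gate passes
    }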

Then we test like a client would. We snapshot tool metadata on connect and compare it to the approved contract, probe for prompt‑injection and tool‑poisoning, and verify that “ask before edits” triggers for destructive actions.

We also confirm context minimization, validate that egress is pinned to approved hosts, and test resilience under load, including health checks, retry behavior, and isolation using containers with least‑privilege file and network access. Servers are published only when security, privacy, and responsible AI reviews are complete, runbooks and on‑call are in place, and the registry entry is created and pinned.

Inventory

A photo of Janardhanan

“Inventory is the foundation—if we miss a server, we miss the conversation. Every server, regardless of where it’s running or how it’s deployed, must be accounted for in our system.”

Priya Janardhanan, principal security assurance engineering manager, Microsoft CISO

You can’t govern what you can’t see, and MCP shows up in more places than a single system of record. To solve that, we’re building the map from signals and stitching them into one catalog.

“Inventory is the foundation—if we miss a server, we miss the conversation,” says Priya Janardhanan, a principal security assurance engineering manager at Microsoft CISO Operations. “Every server, regardless of where it’s running or how it’s deployed, must be accounted for in our system. Without a complete inventory, we lose visibility into critical operations, risk exposing sensitive data, and undermine our ability to ensure compliance and security.”

Our goal state is that endpoint telemetry catches developer‑run servers on laptops and workstations. Repos and CI pipelines reveal intent before anything ships. IDEs (Integrated Development Environments) surface local extensions and configured endpoints. The gateway and our registries anchor what’s approved for business data, while low‑code environments tell us which connectors are in use and where they point.

We normalize and correlate those signals with stable IDs for servers, tools, and owners. Ownership is proven through repositories, gateway services, and environment administrators—on‑call contacts included. Exposure is scored based on data touches, scopes requested, egress rules, and change history, so high‑risk items rise to the top of the queue.
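
As an illustration of that scoring, the sketch below combines a few catalog signals into a single number so the riskiest entries surface first. The weights, caps, and signals are made up for the example and are not our real risk model.

    // Illustrative scoring only: the weights, caps, and signals below are made up
    // for the example and are not our real risk model.
    interface CatalogEntry {
        touchesSensitiveData: boolean;
        requestedScopes: number;        // count of scopes requested
        egressDestinations: number;     // count of approved outbound destinations
        changesSinceLastReview: number;
    }

    function exposureScore(entry: CatalogEntry): number {
        let score = 0;
        if (entry.touchesSensitiveData) score += 40;
        score += Math.min(entry.requestedScopes * 5, 25);
        score += Math.min(entry.egressDestinations * 10, 20);
        score += Math.min(entry.changesSinceLastReview * 5, 15);
        return score; // 0 to 100; higher scores rise to the top of the review queue
    }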

Freshness is tracked with last‑seen timestamps, and stale entries are retired over time. Builders can discover and reuse approved servers; reviewers can see what changed since the last approval, and admins get instant visibility into coverage and hotspots.

We’re working toward automated identification and notification for unknown servers. In the ideal state, a registration stub is created when we detect an unknown server on an endpoint. Then, the likely owner is notified, and direct calls are blocked until the server is vetted through an automated process. If tool metadata changes after approval, high-risk actions are paused and routed for re-review, then auto-resumed once approved.

“It all revolves around inventory as the foundation,” Janardhanan says. “If we miss a server, we miss the conversation.”

A photo of Hasan

“Agent 365 tooling servers will allow centralized governance for IT admins. That means a single pane where they can see what’s approved, who owns it, what data it touches, and then apply policy.”

Aisha Hasan, principal product manager, Microsoft Digital

Architecture gives us stable choke points. Vetting keeps weak servers out. Inventory keeps our map current. It’s a single pattern for builders and a unified playbook for security.

Governing agents in low‑code and pro-code scenarios

Makers move fast—that’s the point. A Customer Support team needed a Copilot action to pull case history, so they opened Copilot Studio, selected an approved MCP connector, and shipped a first version before lunch. No tickets. No detours. Governance showed up in the flow, not as a blocker.

“Agent 365 tooling servers will allow centralized governance for IT admins,” says Aisha Hasan, a principal product manager at Microsoft Digital. “That means a single pane where they can see what’s approved, who owns it, what data it touches, and then apply policy. We’re moving toward that consolidation so innovation continues while governance gets simpler and more consistent.”

We place guardrails where makers already work. In Copilot Studio, trusted and verified first-party MCP servers are allowed in developer environments to accelerate innovation and encourage experimentation. Riskier or more complex MCP integrations are available in Copilot Studio custom environments and other pro-code tools such as the Microsoft 365 Agent Toolkit in VS Code and Microsoft Foundry, but only with clear checks: service ownership, security and privacy review, responsible AI assessment, and consent gating for high‑impact actions.

The allowlist is our north star.

Approved MCP servers and connectors live in one catalog with documented owners, scopes, and data boundaries. Makers choose from that shelf. If an MCP server uses an unverified tool, we enforce endpoint filtering. If there is a misconfiguration, we open a task for the owner and help them build securely.

Permissions stay tight without adding cognitive load. Tokens are short‑lived and scoped to the task. Context is trimmed so only the necessary fields flow to the tool. Third‑party servers never get the full transcript. If a connector’s capabilities change, the runtime compares the new “voice” to what we approved. MCP clients should pause risky actions, notify the owner, and resume automatically once reviewed.

With agent inventory in the Power Platform Admin Center and a registry in Agent 365, admins get a clear view of which connectors are active, who owns them, what data they touch, and how often they’re called. Organization policies such as DLP and MIP can be enforced in a unified way, with a re‑review when capabilities change. The goal is simple: let builders innovate confidently while maintaining security and compliance.

“MCP servers are powerful AI tools that enable agents to seamlessly integrate and interact with enterprise data and transform business workflows,” Hasan says. “That means the same enterprise data and governance principles are applied equally to MCP servers and other connectors. A robust inventory, an agile policy framework, and an automated workflow for enforcement are cornerstones for successfully governing agents at scale.”

Securing MCP at scale: Operating, monitoring, and enabling

Our work doesn’t stop at go‑live. Once an MCP server is in the catalog, we operate the conversation like a service: measurable, observable, and responsive. Identity and policy guard the front door, but runtime is where we prove the controls work without slowing anyone down.

In practice, operating MCP at scale comes down to four motions:

Observe every tool call end to end. Every tool call carries a correlation ID from client to gateway to server and back. Prompts, tool selections, authorization decisions, and resource access should be logged with consistent schemas. Golden signals—latency, errors, saturation—sit alongside safety signals like unexpected egress or edits without consent. Owners and security teams see the same dashboards.
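
A consistent schema is what makes that correlation work. The sketch below shows one possible shape, with a correlation ID minted once and reused at every hop; the field names are assumptions, not our logging pipeline.

    // A sketch of one possible schema; the field names are assumptions, not our
    // logging pipeline. The correlation ID is minted once and reused at every hop.
    import { randomUUID } from "node:crypto";

    interface ToolCallLog {
        correlationId: string;
        hop: "client" | "gateway" | "server";
        tool: string;
        decision: "allowed" | "blocked" | "consent-requested";
        latencyMs: number;
        timestamp: string;
    }

    function logToolCall(entry: ToolCallLog): void {
        console.log(JSON.stringify(entry)); // same shape at every hop, so traces join cleanly
    }

    const correlationId = randomUUID();
    logToolCall({
        correlationId,
        hop: "client",
        tool: "lookup_case_history", // hypothetical tool
        decision: "allowed",
        latencyMs: 12,
        timestamp: new Date().toISOString(),
    });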

Detect drift and abnormal behavior early. Detection lives close to the work. We flag abnormal tool patterns, spikes in write operations, burst traffic from new geographies, and context sizes that don’t fit a task. We continuously compare a tool’s “voice” at connect time to the approved version; drift automatically pauses risky actions and pings the owner. Cost controls double as guardrails, using rate limits and budgets to cap blast radius and surface runaway loops early.

Respond with precision instead of blunt shutdowns. Response is graded, not binary. We can block destructive actions and allow reads, or throttle a noisy client without killing the session. Kill switches exist at both the client and the gateway. Playbooks are pre‑approved and integrated into the consoles owners already use, and dry runs are part of muscle memory, so the first switch flip doesn’t happen during an incident.

We treat model behavior as part of operations. Content safety and prompt shields run in production, not just in tests. We pin model versions and watch for output drift after updates. If a model starts suggesting tools out of character, the owner gets paged with the exact prompts and calls that triggered it.

Telemetry respects privacy. Logs avoid sensitive payloads by default and mask what must pass through for forensics. Access is role‑based, retention follows policy, and audit readiness is designed in on day one.

Enable builders through templates, education, and reuse. Adoption and education run in parallel. Builders get templates that enable best practices: sample manifests with consent gates, CI checks for token scope and SBOMs, and gateway stubs with sane defaults. A “ten‑minute preflight” runs locally to verify contracts, test consent flows, and check egress before a pull request is opened. IDE lint rules catch common issues early.

“This is how we operate MCP at scale,” says Janardhanan. “Observe the conversation, detect drift early, respond with precision, and teach habits that make the right path the easy path. We run it like a product because that’s what it is.”

Measuring results and moving forward

This program has changed how we build. Reviews move faster because every server follows the same path. Drift is caught early because clients compare a tool’s “voice” on connection. Shadow servers decline as inventory fills in from endpoint, repo, IDE, and gateway signals. Reuse increases because teams can discover trusted servers instead of creating new ones. Incidents resolve faster with correlation IDs across the conversation and kill switches at both the client and the gateway.

It’s also changed how our admins work. One gateway means one perimeter to manage. Policies land once and apply everywhere. Owners see the same telemetry security sees, so fixes happen where the work happens.

Going forward, we’re focused on more consolidation and automation. We’re moving toward a single pane for MCP governance—approve, monitor, and pause from one place. Policy-as-code will keep allowlists, consent rules, and egress boundaries versioned and testable in CI.
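
Policy-as-code can be as plain as a typed policy file plus a CI test that fails when a change drops a required guardrail. The sketch below uses hypothetical server names and a single sanity rule to show the idea; it isn’t our policy format.

    // A sketch of policy-as-code with hypothetical server names: the policy lives
    // in a versioned file and a CI test fails if a change drops a required guardrail.
    interface Policy {
        allowedServers: string[];
        consentRequiredActions: string[];
        approvedEgress: string[];
    }

    const policy: Policy = {
        allowedServers: ["case-history-mcp", "kb-search-mcp"],
        consentRequiredActions: ["write", "delete", "configure"],
        approvedEgress: ["internal-api.example.net"],
    };

    // One simple invariant a CI check could enforce as the policy evolves.
    function destructiveActionsStayGated(p: Policy): boolean {
        return ["write", "delete", "configure"].every((a) => p.consentRequiredActions.includes(a));
    }

    if (!destructiveActionsStayGated(policy)) {
        throw new Error("Policy regression: destructive actions must stay consent-gated");
    }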

Our preflight checks will get smarter, with stronger injection tests, automatic egress validation, and environment‑aware templates. We’ll expand consent patterns so high‑impact actions remain explicit and auditable, even across multi‑tool chains. And we’ll keep shrinking re‑review time, so drift is measured in minutes, not days.

AI conversations are now part of how we build every day. MCP standardizes how agents talk to tools and data. Secure‑by‑default architecture, rigorous vetting, and a living inventory ensure the right voices stay in the room, only what’s needed is shared, and drift is caught early.

The result is simple: teams ship faster with fewer surprises, and governance stays visible without getting in the way. We’ll keep tightening the loop, so saying yes remains both easy and safe.

Key takeaways

If you’re implementing MCP security, consider these key actions to ensure secure, efficient adoption in your organization:

  • Build governance into the maker flow. Embed security, consent, and responsible AI checks directly where teams build—so protection shows up by default, not as an afterthought.
  • Maintain a single allowlist and catalog. Centralize approved MCP servers and connectors with clear ownership, scope, and data boundaries.
  • Enforce scoped, short-lived permissions by default. Automatically limit token scope and duration to minimize risk and exposure.
  • Monitor continuously and detect drift early. Observe activity, flag deviations, and pause risky actions until reviewed and approved by owners.
  • Automate incident response and controls. Leverage pre-approved playbooks, kill switches, and rate limits for fast, precise action.
  • Design for privacy and auditability from day one. Mask sensitive data, restrict log access by role, and ensure audit readiness.
  • Promote education and reuse. Provide templates, training, and feedback loops to encourage safe development and adoption of trusted servers.
