Read our seven tips for shifting to a ‘cloud native’ device management strategy

We’re using Microsoft Intune, Microsoft 365 Copilot and other AI tools to modernize device management internally here at Microsoft, a shift that is enabling us to reduce our workloads and to speed remediation at scale.

At Microsoft, we manage a large, diverse device estate, with more than 1 million devices in use by employees and teams across our global corporate network.

For years, we stitched together insights across multiple tools, wrote custom queries, and maintained fragile reports just to answer basic questions. This approach slowed investigations and delayed patch targeting.

We needed a faster, stronger, cloud-native path.

The advent of generative AI changed the way we manage our devices. Not only were we able to ask better questions and get targeted help right from the start, we also got faster and more relevant answers from across our entire device management estate.

It’s simpler. It’s faster. It scales with our environment. And we’re doing it natively in the cloud.

“We’re investing in AI-powered predictive maintenance and intelligent troubleshooting to reduce friction in device management,” says Daniel Manalo, a principal service engineer in Microsoft Digital, the company’s IT organization.

AI and machine learning help us find errors faster and, in many cases, fix them autonomously. They reduce our downtime, prolong the lifespans of our devices, and ensure our employees have a consistent and productive experience with their devices.

Today, we’re applying this approach to everyday operations: Speeding investigations, simplifying updates, and tightening the loop from detection to remediation. The overarching goal remains consistent: reduce workloads, improve clarity, and move our discoveries earlier in the risk window.

The role of Customer Zero in evolving modern device management

We serve as the company’s Customer Zero for our products here in Microsoft Digital. We run early capabilities in our own tenant, pressure‑test them at Microsoft scale, and feed what we learn straight back to engineering. The goal is simple: Turn good ideas into reliable features that any enterprise can use.

Our Microsoft Digital teams work side-by-side with the Intune product group to modernize our device management approach. The Intune group builds and operates the platform, while we bring real‑world scenarios, signals, and guardrails. Together, we help develop, test, and deploy a better cloud-native product for our customers.

“We use our collective learnings from our internal deployments to improve our products, which makes them better for our employees and for our customers,” says Senthil Selvaraj, a principal group product manager in Microsoft Digital.

For the same reasons, we work hard to make sure that we deploy our tools and services in the same way our customers do.

“That enables everyone at the company to have good visibility into the experiences our customers will have when our products get to them,” Selvaraj says. “This makes us more accountable to our customers and helps us move quickly when improvements are needed.”

Customer Zero for device management spans more than Intune.

We partner across teams responsible for Microsoft Purview, Microsoft 365 Copilot, Microsoft Defender, Windows (Autopatch and Hotpatch), GitHub, and Microsoft Azure to produce comprehensive device management capabilities. These are the surfaces where we test, learn, and refine the end‑to‑end device management experience.

The loop is tight. We identify a need, prototype a solution with the product groups, roll it out to targeted rings, measure impact, and iterate. Those learnings inform what ships in Intune—from data-driven insights to built‑in prompts that surface device health data as a conversation, rather than a simple query.

The result is a safer, faster path to value with AI-driven device management, including clear ownership, faster remediation, and features that arrive tested against operational reality.

We’ve learned a lot as Customer Zero, and we’re passing those lessons on to you.

Modern device management: Seven tips

Here are seven important tips that we’ve compiled to help with your device management efforts.

Tip 1: Ask natural-language questions with Microsoft Security Copilot

We use the generative AI capabilities in Microsoft Security Copilot to query device and vulnerability data in plain language and get a unified answer that we can act on.

This allowed us to replace bespoke reports with targeted questions.

“Using natural language reduces the time it takes us to figure out what’s going on,” says Mohit Malhotra, a product manager in Microsoft Digital. “We are able to ask Security Copilot questions naturally, which allows us to hear the signals that need our immediate action faster.”

Security Copilot lets us ask about device posture, app versions, cybersecurity vulnerabilities (known as Common Vulnerabilities and Exposures, or CVEs), and exposure across Microsoft Defender and Intune, without stitching the data together by hand. We get the context we need and move faster from finding to fixing.

How we use it

  • Scope impact: “List Windows devices running <app/version> that are vulnerable, with owners and deployment rings.”
  • Prioritize work: “Group affected devices by business unit and model; show counts and severity.”
  • Verify reach: “Confirm which devices received <policy/package> in the last 48 hours; flag failures.”
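
The "prioritize work" step can be sketched in Python once an affected-device list has been exported. This is a minimal illustration, not a Security Copilot or Intune API: the record fields (`business_unit`, `model`, `severity`) and the sample devices are hypothetical.

```python
from collections import defaultdict

# Hypothetical severity scale; the labels and ordering are assumptions.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def group_affected_devices(devices):
    """Group devices by (business unit, model) with a count and the highest severity."""
    groups = defaultdict(lambda: {"count": 0, "max_severity": "low"})
    for device in devices:
        group = groups[(device["business_unit"], device["model"])]
        group["count"] += 1
        if SEVERITY_ORDER[device["severity"]] > SEVERITY_ORDER[group["max_severity"]]:
            group["max_severity"] = device["severity"]
    # Largest groups first, so the biggest clusters are reviewed first.
    return sorted(groups.items(), key=lambda item: item[1]["count"], reverse=True)


# Illustrative records only; field names are not a real export schema.
sample_devices = [
    {"name": "LT-001", "business_unit": "Finance", "model": "Model A", "severity": "high"},
    {"name": "LT-002", "business_unit": "Finance", "model": "Model A", "severity": "critical"},
    {"name": "WS-003", "business_unit": "Engineering", "model": "Model B", "severity": "medium"},
]

for (unit, model), summary in group_affected_devices(sample_devices):
    print(unit, model, summary["count"], summary["max_severity"])
```

Sorting by count is one possible ordering; sorting by highest severity first would suit a risk-led review equally well.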

Prompts we rely on

  • “Show devices affected by <CVE/app version> and summarize recommended remediation steps.”
  • “Break down exposure by ring and list top 5 models with highest risk.”
  • “Identify outliers that failed the last policy sync and provide reasons.”

Why it helps

  • Less toil: No custom pipelines to maintain.
  • Faster triage: Discovery and scoping happen in one interaction.
  • Clear next steps: Results align to our Intune targeting and scheduling paths.

Best practices

  • Start specific: Name the product, version, and time window, then broaden as needed.
  • Keep follow‑ups short: Quick pivots like “group by region” or “add owner emails” maintain momentum.
  • Act on the output: Use the device lists to target updates or policies in Intune, then validate results with a final check.

Note

  • We align usage with least‑privilege access and established approval paths so insights come from authoritative sources and actions land through the right channel.

Tip 2: Find knowledge fast with Microsoft 365 Copilot

We use Microsoft 365 Copilot to pull device context from email, chats, and documents, allowing us to troubleshoot issues faster and more easily using generative AI.

Incidents start with questions, not dashboards, e.g. “Who owns this package? When did we change that policy? Where did we discuss the driver rollback?”

The answers to those questions live in mail threads, Teams chats, and planning docs. Before Copilot, we were forced to sift through these materials manually, which cost us time. Now we ask one question and get a summary with sources, people, and links. That keeps the investigation moving and reduces handoffs.

This also helps us during the coordination phase. We can surface the approver for a change, the engineer who ran the last mitigation, and the runbook section that explains the rollback steps. We make better decisions because we see the history and the intent, not just the current state. Then we line up the action in Intune with the right stakeholders already looped in.

How we use it

  • Asking for recent context on a device model, configuration, or app to see decisions and outcomes in one place.
  • Retrieving owners, approvers, and on‑call contacts named in Outlook and Teams messages related to the issue.
  • Pulling change notes and runbook updates tied to a policy or package before we request an update in Intune.

Prompts we rely on

  • “Summarize recent emails and Teams messages about <device model/app version> and list owners mentioned.”
  • “Find the change note or runbook update for <policy/package> from the last 14 days.”
  • “Show known issues linked to <KB/app> and who resolved the last occurrence.”

Why it helps

  • Less hunting: We replace ad hoc inbox and wiki searches with a single query.
  • Faster coordination: We identify the right stakeholders and prior decisions immediately.
  • Better decisions: We confirm history and context before proposing changes in Intune.

Best practices

  • Keep prompts scoped. Include product, version, and a timeframe to focus your results.
  • Respect boundaries. Align usage with least‑privilege access and existing approval and auditing paths.
  • Capture outcomes. Link summaries, owners, and key docs back to the incident record so future searches return richer context.

Note

  • Copilot gets better as more decisions and runbooks live in Microsoft 365, since that’s where the signals come from.

Tip 3: Accelerate log triage with GitHub Copilot, Visual Studio Code, and Log Analytics

We use GitHub Copilot in Visual Studio Code with Azure Monitor Log Analytics to explain errors, draft KQL, and shorten device log investigations.

“Copilot helps scan noisy logs and points us to likely causes,” says Michael Griswold, a principal service engineering manager with the Microsoft Intune product group. “Our old process of opening logs, interpreting opaque error strings, and validating a hunch took too long. Getting faster answers matters when incidents stack up.”

Now we keep the entire loop in one workspace. AI in GitHub Copilot interprets the event, proposes likely causes, and generates KQL to confirm or rule out scenarios. We move from symptom to validated pattern without bouncing across tools.

How we use it

  • Connect VS Code to your Log Analytics workspace and load the tables you need (e.g., inventory and update events).
  • Paste a minimal log sample with timestamps and device identifiers, so Copilot has context.
  • Ask Copilot to summarize the error, suggest probable causes, and produce KQL to test each path.
  • Run the query, review clusters and outliers, and request an alternate query or grouping if noise is high.

Prompts we rely on

  • “Explain this error in a device‑management context and list three validation checks.”
  • “Write KQL to find matching failures in the last 24 hours and group by model and policy.”
  • “Join device inventory with update events for <device> and surface anomalies.”
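
To make the second prompt concrete, here is a minimal sketch of the kind of KQL it might produce, wrapped in a small Python helper so the time window and error code are easy to change. The table name `DeviceUpdateEvents_CL` and the columns `ErrorCode`, `Model`, and `PolicyName` are hypothetical; substitute whatever your Log Analytics workspace actually contains. `TimeGenerated` is the standard timestamp column in Log Analytics tables.

```python
def build_failure_query(table, error_code, hours=24):
    """Return a KQL query that counts matching failures by model and policy."""
    if not table.isidentifier():
        raise ValueError("table name must be a simple identifier")
    if hours <= 0:
        raise ValueError("hours must be positive")
    # Escape backslashes and double quotes so the value stays inside the KQL string literal.
    safe_code = error_code.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        table,
        f"| where TimeGenerated > ago({hours}h)",
        f'| where ErrorCode == "{safe_code}"',  # ErrorCode is an assumed column name
        "| summarize Failures = count() by Model, PolicyName",  # assumed column names
        "| order by Failures desc",
    ])


# Hypothetical custom table name and example error code.
query = build_failure_query("DeviceUpdateEvents_CL", "0x80070005")
print(query)
```

Treat any generated query as a draft: run it against a short time window first and check the result against a device you already know failed.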

Why it helps

  • Faster pattern recognition: Proposed queries get us to evidence quickly.
  • Less context switching: Analysis and validation happen inside VS Code.
  • Cleaner handoff: Results map to our Intune actions for targeted remediation.

Best practices

  • Keep inputs tight: Provide a small, representative log snippet, the affected device attributes, and a precise time window.
  • Iterate on queries: Ask for different filters, joins, or time ranges when results are noisy.
  • Close the loop: Use the device list to drive policy or update changes in Intune and confirm fixes with a final query.

Note

  • This workflow is broadly repeatable with GitHub Copilot, Visual Studio Code, and Azure Monitor Log Analytics.

Tip 4: Keep firmware and drivers current with Intune update management

We use Intune firmware and driver update management to identify, approve, and deploy our OEM updates at scale.

Firmware and driver releases don’t land on a predictable schedule. Different vendors ship on different timelines, and a single environment can span hundreds of models.

Tracking this manually slows responses and leaves risk on the table. Intune centralizes the view so we can see what’s applicable, choose the right targets, and roll out updates with the same discipline we use for OS patches.

“Staying current on firmware and drivers keeps devices stable and secure,” says Taqui Mohammad, a senior service engineer in Microsoft Digital. “With Intune, we stage updates, watch the rollout, and adjust before issues spread.”

How we use it

  • Review applicability: Open the firmware and driver updates view to see available updates grouped by make and model.
  • Select a pilot: Target a small ring first (model, business unit, or region) and set short deadlines.
  • Plan time windows and restarts: Align deployments with maintenance windows and communicate expected reboots.
  • Monitor, then expand: Track success and failure signals, remediate issues, and scale to broader rings.

Configuration tips

  • Standardize categories: Separate firmware from drivers in policies so reporting and rollbacks are clean.
  • Use device tags consistently: Model, region, and business unit tags make scoping and expansion straightforward.
  • Define rollback steps: Document how to revert a driver or hold firmware for a specific model when needed.

Success checks

  • Compliance trend: Increased percentage of devices on the latest approved firmware and driver versions after each wave.
  • Incident correlation: Fewer support tickets related to device stability and peripherals on updated models.
  • Deployment reliability: Decreased failure rates as pilots catch issues before broad rollout.
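
The compliance trend check reduces to a simple percentage. The sketch below assumes a hypothetical inventory export with `model` and `firmware_version` fields and a hand-maintained map of approved versions per model; none of these names come from Intune itself.

```python
def compliance_rate(devices, approved_versions):
    """Return the percentage of devices on the approved version for their model."""
    if not devices:
        return 0.0
    compliant = sum(
        1
        for device in devices
        if approved_versions.get(device["model"]) == device["firmware_version"]
    )
    return round(100 * compliant / len(devices), 1)


# Illustrative data only: approved versions and inventory snapshots are made up.
approved_versions = {"Model A": "1.4.2", "Model B": "2.0.1"}

before_wave = [
    {"name": "LT-001", "model": "Model A", "firmware_version": "1.3.0"},
    {"name": "LT-002", "model": "Model A", "firmware_version": "1.4.2"},
    {"name": "WS-003", "model": "Model B", "firmware_version": "1.9.0"},
    {"name": "WS-004", "model": "Model B", "firmware_version": "1.9.0"},
]
after_wave = [
    {"name": "LT-001", "model": "Model A", "firmware_version": "1.4.2"},
    {"name": "LT-002", "model": "Model A", "firmware_version": "1.4.2"},
    {"name": "WS-003", "model": "Model B", "firmware_version": "2.0.1"},
    {"name": "WS-004", "model": "Model B", "firmware_version": "1.9.0"},
]

print("before:", compliance_rate(before_wave, approved_versions))
print("after:", compliance_rate(after_wave, approved_versions))
```

Comparing the same calculation before and after each wave gives the trend, and any device still off the approved version after the wave is a candidate for follow-up or a documented exception.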

Best practices

  • Pair with risk signals: Prioritize models tied to active vulnerabilities or incident clusters before broad rollout.
  • Keep rings small and fast: Validate quickly, then scale; long pilots hide issues and delay benefits.
  • Document exceptions: If a model needs a temporary hold due to app or peripheral compatibility, record the reason and set a review date.
  • Verify outcomes: Confirm update levels on target devices and scan for regressions in support queues.

Notes

  • Expect uneven arrival patterns across vendors and models; a weekly review cadence helps catch new updates without creating noise.
  • Treat firmware and drivers as first‑class updates; include them in regular compliance reports and reviews so they get consistent attention.

Tip 5: Speed updates with Windows Autopatch, Hotpatch, and Auto Remediation Update Readiness

We use Windows Autopatch and Hotpatch to reduce disruptions and keep our devices current, and we pair them with automated readiness and remediation so our changes land safely and quickly.

Autopatch handles orchestration for quality updates and feature releases. We define rings that reflect business risk and user impact, then let the service pace deployments as health signals arrive.

“Autopatch Update Readiness catches and resolves common blockers before deployment begins,” says Dave Rodriguez, a principal product manager in Microsoft Digital. “What used to require manual checks and troubleshooting is now handled upfront, giving us smoother updates and a far more reliable experience for our employees.”

Where Hotpatch is available, we apply security updates without a reboot, which cuts downtime and helps us move faster on critical fixes. An automated readiness layer checks prerequisites, fixes common blockers, and confirms that devices are ready before rollout.

How we use it

  • Enroll eligible devices in Autopatch and map them to the right scope so ownership, reporting, and break‑glass procedures are clear.
  • Build rings that reflect business priority and user profiles (e.g., VIP laptops, frontline kiosks, engineering workstations, and lab devices).
  • Enable Hotpatch on supported SKUs and confirm policy alignment so security updates apply without restarts where possible.
  • Run readiness checks that verify update agent health, policy state, storage and battery requirements, VPN reachability, and available maintenance windows.
  • Auto‑remediate common blockers such as stale update caches, missing prerequisites, paused services, or conflicting policies before a device enters the next ring.
  • Start with small cohorts, monitor early signals like install rate and post‑update stability, validate rollback paths, then expand the scope deliberately.
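
A readiness check of the kind described above can be pictured as a list of named blockers per device. This is an illustrative sketch only: the `DeviceState` fields and the thresholds are assumptions, not documented Autopatch requirements or an Autopatch API.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen for illustration.
MIN_FREE_DISK_GB = 20
MIN_BATTERY_PERCENT = 40


@dataclass
class DeviceState:
    name: str
    free_disk_gb: float
    battery_percent: int
    on_ac_power: bool
    update_service_running: bool


def readiness_blockers(device):
    """Return the reasons a device is not ready; an empty list means ready."""
    blockers = []
    if device.free_disk_gb < MIN_FREE_DISK_GB:
        blockers.append("low disk space")
    # Battery level only matters when the device is not plugged in.
    if not device.on_ac_power and device.battery_percent < MIN_BATTERY_PERCENT:
        blockers.append("low battery")
    if not device.update_service_running:
        blockers.append("update service stopped")
    return blockers


# Made-up devices to exercise each branch.
fleet = [
    DeviceState("LT-001", 64.0, 80, False, True),
    DeviceState("LT-002", 8.5, 25, False, True),
    DeviceState("WS-003", 120.0, 10, True, False),
]

for device in fleet:
    print(device.name, readiness_blockers(device) or "ready")
```

Returning named blockers rather than a single pass/fail flag makes it easier to route each one to the matching remediation.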

Operational checks

  • Ring coverage ensures eligible devices are actually assigned to a ring and not stranded outside the managed flow.
  • App and driver smoke tests validate business‑critical apps, kernel drivers, and peripherals on pilot cohorts before broad rollout.
  • Safeguard holds and known‑issue tracking watch for vendor or service flags that can pause or throttle a ring until a fix is available.
  • Rollback readiness confirms who owns the decision, what steps they follow, and how telemetry proves the rollback succeeded on affected devices.

Why it helps

  • Continuous movement shortens exposure windows because healthy rings advance without waiting for a fixed date.
  • Fewer interruptions improve user experience, as Hotpatch removes the need for restarts on supported devices.
  • Higher success rates come from automated readiness and remediation, removing predictable failures before deployment.

Best practices

  • Use consistent device tags so rings map cleanly to models, regions, and business units, which keeps targeting and reporting trustworthy.
  • Keep pilots small and fast to find issues quickly, then scale once success criteria are met and rollback is validated.
  • Communicate maintenance expectations in plain language so users know timing, restart behavior, and how to report problems.
  • Pace by risk rather than calendar, advancing rings when health metrics and support signal quality are within thresholds.
  • Review deployment dashboards daily during rollout, adjust ring size or cadence when error rates rise, and capture lessons learned for the next wave.
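
Pacing by risk rather than calendar amounts to a gate that a ring must pass before the next one starts. The sketch below shows one way to express such a gate; the function name and both thresholds are hypothetical and would need tuning to your own success criteria.

```python
def should_advance_ring(installed, failed, targeted,
                        min_install_rate=0.95, max_failure_rate=0.02):
    """Decide whether the next ring can start, based on the current ring's results.

    The default thresholds are illustrative assumptions, not service defaults.
    """
    if targeted <= 0:
        return False  # nothing was targeted, so there is no evidence to advance on
    install_rate = installed / targeted
    failure_rate = failed / targeted
    return install_rate >= min_install_rate and failure_rate <= max_failure_rate


print(should_advance_ring(installed=97, failed=1, targeted=100))  # healthy ring
print(should_advance_ring(installed=90, failed=1, targeted=100))  # install rate too low
print(should_advance_ring(installed=96, failed=3, targeted=100))  # failure rate too high
```

A real gate would likely also look at post-update stability and support ticket volume, which this sketch leaves out.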

Note

  • Hotpatch availability depends on your Windows edition and configuration, so confirm support and prerequisites as part of your scoping work.

Tip 6: Keep third‑party apps current with Intune Enterprise App Management

We use Intune Enterprise App Management to keep third‑party apps current without constant packaging work.

Third‑party software drives real risk: versions drift, silent installers change, and manual packaging pipelines break at the worst time.

With Enterprise App Management, we select from a managed catalog, set assignment and update rules, and let the service handle new versions as they ship. We spend our time on exceptions, not routine updates.

“Third-party apps fall out of date fast, so we’re standardizing how they’re updated,” says Humberto Arias, a senior product manager in Microsoft Digital. “We do that with Enterprise App Management, which gives us reliable packages and keeps us moving at a steady cadence.”

This approach also improves the user experience. Updates arrive in predictable windows and dependencies are handled in a timely manner. We avoid surprise prompts and failed installs that generate tickets. When we do need to pause or pin a version, we scope it cleanly and document the reason.

How we use it

  • Build a standard catalog that covers the common apps our users need and assign clear ownership for each title.
  • Configure update behavior so catalog apps update automatically as new versions ship.
  • Use rollout rings so pilots validate the installation success rate and app behavior before expanding to broad audiences.
  • Scope assignments with device tags such as model, region, or business unit to simplify targeting and reporting.
  • Monitor install and update status, investigate failures, and retry with adjusted timing or requirements when needed.
  • Capture exceptions for apps that need holds or custom steps and set review dates to revisit the decision.
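
The interaction between auto-update and time-boxed exceptions can be sketched as a small decision function. This is a hypothetical illustration, not Enterprise App Management behavior: it assumes purely numeric dotted versions with the same number of components, and it models an exception as a pin with a review date.

```python
from datetime import date


def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def needs_update(installed, approved, pin_review_date=None, today=None):
    """Return True when a device should receive the approved version.

    pin_review_date models a documented exception (an assumed concept here):
    until that date, the device is held on its current version.
    """
    today = today or date.today()
    if pin_review_date is not None and today < pin_review_date:
        return False
    return parse_version(installed) < parse_version(approved)


check_day = date(2025, 6, 1)  # fixed date so the example is repeatable
print(needs_update("1.9.0", "1.10.0", today=check_day))
print(needs_update("1.9.0", "1.10.0", pin_review_date=date(2025, 7, 1), today=check_day))
print(needs_update("1.9.0", "1.10.0", pin_review_date=date(2025, 5, 1), today=check_day))
```

Once the review date passes, the pin stops holding the device and the update applies again. That is one way to keep a temporary hold from quietly becoming permanent.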

Scenarios we run

  • Rapid response when a high‑risk CVE drops by prioritizing affected apps and moving them to the front of the update queue.
  • Version cleanup by removing outdated or duplicate installers so devices converge on a single approved release.
  • Conditional deployment for specialized teams by offering an app as available instead of required while still tracking adoption.

Why it helps

  • Less packaging toil because the catalog supplies current installers and metadata.
  • Faster patching for common apps because updates flow as they publish.
  • Better compliance reporting because versions and assignments are consistent across rings and groups.

Best practices

  • Keep an authoritative list of approved apps with owners, support notes, and rollback steps.
  • Coordinate maintenance windows for high‑impact apps so users can save work before enforced updates.
  • Require pilots for any app with add‑ins or drivers and validate workflows with real users before scaling.
  • Use uninstall assignments to remove unapproved or vulnerable software and block reinstallation where needed.
  • Document app‑level exceptions, including the rationale and a date to re‑evaluate.

Notes

  • Some apps need pre-install checks or post-install steps, so include scripts or detection rules where required.
  • Track license terms and usage for commercial titles so updates do not outpace entitlements.

Tip 7: Close the loop with Defender Vulnerability Management and Intune security tasks

We use Microsoft Defender Vulnerability Management with Intune to turn exposure insights into targeted actions that close risk fast.

Incidents don’t end when we spot a CVE. They end when devices are fixed and verified.

Vulnerability Management gives us an AI-powered live inventory of devices, software, and configurations, then connects that inventory to known threats. It shows which versions run where, highlights misconfigurations, and explains why a device is at risk. We see the problem and the cause, not just a risk score.

“The Intune Vulnerability Agent gives us a clear list of issues by device and owner,” says Harshitha Digumarthi, a senior product manager in Microsoft Digital. “It shortens our path from finding a problem to fixing it.”

It also ranks what to fix first. Factors like severity level, exploit availability, active attacks, and business context all feed into the priority list, so that effort goes where it’s needed most. The service recommends specific actions such as updating, uninstalling, reconfiguring, or applying a policy as appropriate.
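
To illustrate how several factors can combine into one ranking, here is a toy scoring sketch. It is not how Defender Vulnerability Management calculates priority: the point values, field names, and sample findings are all invented for illustration.

```python
# Hypothetical weights; the real service uses its own model.
SEVERITY_POINTS = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def priority_score(finding):
    """Combine severity, exploitability, active attacks, and business context."""
    score = SEVERITY_POINTS[finding["severity"]]
    if finding["exploit_available"]:
        score += 2
    if finding["active_attacks"]:
        score += 3
    if finding["business_critical"]:
        score += 2
    return score


# Made-up findings with placeholder identifiers.
findings = [
    {"id": "finding-1", "severity": "medium", "exploit_available": True,
     "active_attacks": True, "business_critical": True},
    {"id": "finding-2", "severity": "critical", "exploit_available": False,
     "active_attacks": False, "business_critical": False},
    {"id": "finding-3", "severity": "high", "exploit_available": True,
     "active_attacks": False, "business_critical": True},
]

ranked = sorted(findings, key=priority_score, reverse=True)
for finding in ranked:
    print(finding["id"], priority_score(finding))
```

In this sample, a medium-severity issue under active attack on a business-critical system outranks a critical-severity issue with no known exploit, which is why severity alone is a poor sort key.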

From there, it pushes the work into our change tools. Tasks flow to Intune, Autopatch, and Enterprise App Management so the remediation is traceable. Exceptions are tracked, including data on owners, compensating controls, and review dates. Closure is verified by watching exposure decrease and confirming the fix landed on the intended devices.

How we use it

  • Review exposure by CVE, software, and device group to see where risk concentrates.
  • Prioritize based on business impact, internet exposure, and privilege level so high‑value targets move first.
  • Select the fix that fits the issue, including app updates through Enterprise App Management, OS and quality updates through Autopatch or Hotpatch (where supported), firmware and drivers through Intune update management, or policy changes for configuration weaknesses.
  • Target the right scope using tags for model, region, and business unit so remediation lands where it’s needed.
  • Set deadlines and user experience settings that balance urgency with productivity.
  • Validate closure by rechecking exposure, confirming install success, and watching support signals for regressions.

What we monitor

  • Exposure trends over time, to prove that remediation is reducing risk.
  • Top vulnerable apps and models, so effort tracks where it matters most.
  • Noncompliant devices and owners, so follow‑ups are direct and accountable.
  • Exceptions that need compensating controls, documented rationale, and a review date.

Why it helps

  • Fewer handoffs because the same team that sees risk can initiate remediation.
  • Measurable outcomes because exposure and deployment data live in connected systems.
  • Consistent execution because rings, tags, and approvals follow the same patterns as other updates.

Best practices

  • Keep device tags authoritative so targeting and reporting stay reliable.
  • Use pilots even for urgent fixes to catch compatibility issues before broad rollout.
  • Link vulnerability records to Intune assignments so audit and learning loops are clear.
  • Communicate clearly with affected users about timing, restarts, and how to report problems.
  • Document exceptions with owners and expiration dates so temporary holds don’t become permanent.

Notes

  • Not every fix is an update, and some issues require a configuration change or feature disablement with clear rollback steps.
  • Least‑privilege access and standard approvals keep remediation fast without expanding risk.

Key takeaways

Our approach to managing devices and updates has changed. We shifted device and update management from manual hunting and ad hoc remediation to a connected loop that starts with a question and ends with verified resolution, which reduces investigation time and speeds recovery.

A few lessons stand out:

  • Make natural language work by grounding it in trust. Natural language becomes a force multiplier when insights are drawn from authoritative data and access is tightly scoped.
  • Keep pilots small, fast, and intentional. Focused pilots surface issues early without slowing momentum or introducing unnecessary risk.
  • Standardize signals to build confidence. Consistent tagging and clear ownership make reports, deployment rings, and rollbacks easier to interpret and trust.
  • Control exceptions with discipline. Every exception requires a written rationale and a review date, ensuring temporary holds don’t become permanent policy.
  • Close the loop—every time. Verification matters as much as detection. We confirm outcomes and capture learnings to continuously improve the next cycle.

What we’re improving next:

  • Strengthen question‑to‑action flows. We’re deepening prompts and playbooks that connect Security Copilot and Intune so operators can move from investigation to scoped change in a single flow.
  • Expand Hotpatch adoption and measurement. As support broadens, we’re increasing usage and measuring the impact on downtime, reliability, and user experience.
  • Grow app coverage with clearer stability rules. We’re expanding Enterprise App Management while enforcing stronger version‑pinning guidance where predictability is critical.
  • Automate deployment decisions. Additional automation around ring placement, readiness checks, and rollback triggers will allow deployments to adapt to live health signals.
  • Accelerate investigations with reusable telemetry. We’re developing richer telemetry patterns and reusable KQL in Visual Studio Code to reduce noise and speed repeat investigations.

It’s a continuing evolution of our awareness and capabilities in device management, and we’ll keep improving on it, one loop at a time.
