Microsoft Security Blog

Mitigating Skeleton Key, a new type of generative AI jailbreak technique 

Microsoft recently discovered a new type of generative AI jailbreak method called Skeleton Key that could impact the implementations of some large and small language models. This new method has the potential to subvert either the built-in model safety or platform safety systems and produce any content. It works by learning and overriding the intent of the system message to change the expected behavior and achieve results outside of the intended use of the system.

AI jailbreaks: What they are and how they can be mitigated 

Microsoft security researchers, in partnership with other security experts, continue to proactively explore and discover new types of AI model and system vulnerabilities. In this post, we provide information about AI jailbreaks, a family of vulnerabilities that can occur when the defenses implemented to protect AI from producing harmful content fail. This article will be a useful reference for future announcements of new jailbreak techniques.

Exposed and vulnerable: Recent attacks highlight critical need to protect internet-exposed OT devices 

Since late 2023, Microsoft has observed an increase in reports of attacks focusing on internet-exposed, poorly secured operational technology (OT) devices. Internet-exposed OT equipment in water and wastewater systems (WWS) in the US was targeted in multiple attacks over the past months by different nation-backed actors, including attacks by IRGC-affiliated “CyberAv3ngers” in November 2023, as […]