Breaches of sensitive data are extremely costly for organizations when you tally data loss, stock price impact, and mandated fines from violations of General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), or other regulations. They also can diminish the trust of those who become the victims of identity theft, credit card fraud, or other malicious activities as a result of those breaches. In 2021, the number of data breaches climbed 68 percent to 1,862 (the highest in 17 years) with an average cost of USD4.24 million each.1 About 45 million people were impacted by healthcare data breaches alone—triple the number impacted just three years earlier.2
Sensitive data is confidential information collected by organizations from customers, prospects, partners, and employees. Common types of sensitive data include credit card numbers, personally identifiable information (PII) like a home address and date of birth, Social Security Numbers (SSNs), corporate intellectual property (IP) like product schematics, protected health information (PHI), and medical record information that could be used to identify an individual.
Every level of an organization—from IT operations and red and blue teams to the board of directors— could be affected by a data breach. How do organizations identify sensitive data at scale and prevent accidental exposure of that data? Let’s look at four of the biggest challenges of sensitive data and strategies for protecting it.
1. Discovering where sensitive data lives
The data discovery process can surprise organizations—sometimes in unpleasant ways. Sensitive data can live in unexpected places within your organization. For instance, an employee may have stored a customer’s SSN in an unprotected Microsoft 365 site or third-party cloud without your knowledge. Of an estimated 294 million people hacked in 2021, about 164 million were at risk because of data exposure events—when sensitive data is left vulnerable online.3
The only way to ensure that your sensitive data is stored properly is with a thorough data discovery process. Scans for data will pick up those surprise storage locations. However, it’s close to impossible to handle manually.
2. Classifying data to learn what’s most important
That leads right into data classification. Once the data is located, you must assign a value to it as a starting point for governance. The data classification process involves determining data’s sensitivity and business impact so you can knowledgeably assess the risks. This will make it easier to manage sensitive data in ways to protect it from theft or loss.
Microsoft uses the following classifications:
- Non-business: Data from your personal life that doesn’t belong to Microsoft.
- Public: Business data freely available and approved for public consumption.
- General: Business data not meant for a public audience.
- Confidential: Business data that can cause harm to Microsoft if overshared.
- Highly confidential: Business data that would cause extensive harm to Microsoft if overshared.
Identifying data at scale is a major challenge, as is enforcing a process so employees manually mark documents as sensitive. Leveraging security products that enable auto-labeling of sensitive data across an enterprise is one method, among several that help overcome these data challenges.
3. Protecting important data
After classifying data as confidential or highly confidential, you must protect it against exposure to nefarious actors. Ultimately, the responsibility of preventing accidental data exposure falls on the Chief Information Security Officer (CISO) and Chief Data Officer. They are accountable for protecting information and sharing data via processes and workflows that enable protection, while also not hindering workplace productivity.
Data leakage protection is a fast-emerging need in the industry. The Allianz Risk Barometer is an annual report that identifies the top risks for companies over the next 12 months. For the 2022 report, Allianz gathered insights from 2,650 risk management experts from 89 countries and territories. Cyber incidents topped the barometer for only the second time in the survey’s history. At 44 percent, cyber incidents ranked higher than business interruptions at 42 percent, natural catastrophes at 25 percent, and pandemic outbreaks at 22 percent.4
4. Governing data to reduce unnecessary data risks
Data governance ensures that your data is discoverable, accurate, trusted, and can be protected. Successfully managing the lifecycle of data requires that you keep data for the right amount of time. You don’t want to store data longer than necessary because that increases the amount of data that could be exposed in a breach. And you don’t want to delete data too quickly and put your organization at risk of regulatory violations. Sometimes, organizations collect personal data to provide better services or other business value. For instance, you may collect personal data from customers who want to learn more about your services. To abide by the data minimization principle, once the data is no longer serving its purpose, it must be deleted.
How to approach sensitive data
Considering the potentially costly consequences, how do you protect sensitive data? As mentioned earlier, data discovery requires locating all the places where your sensitive data is stored. This is much easier with support for sensitive data types that can identify data using built-in or custom regular expressions or functions. Since sensitive data is everywhere, we recommend looking for a multicloud, multi-platform solution that enables you to leverage automation.
For data classification, we advise enforcing a plan through technology rather than relying on users. After all, people are busy, can overlook things, or make errors. Also, organizations can have thousands of sensitive documents, making manual identification and classification of data untenable because the process would be too slow and inaccurate. Look for data classification technology solutions that allow auto-labeling, auto-classification, and enforcement of classification across an organization. Trainable classifiers identify sensitive data using data examples.
Some solution providers divorce productivity and compliance and try to merely bolt-on data protection. Instead, we recommend an approach that integrates data protection into your existing processes to protect sensitive data. When considering plan protections, ask: Who can access the data? Where should the data live and where shouldn’t it live? How can the data be used?
Microsoft solutions offer audit capability where data can be watched and monitored but doesn’t have to be blocked. It can be overridden too so it doesn’t get in the way of the business. Also, consider standing access (identity governance) versus protecting files. Data leakage protection tools can protect sensitive documents, which is important because laws and regulations make companies accountable.
Explore data protection strategies
Security breaches are very costly. Data discovery, data classification, and data protection strategies can help you find and better protect your company’s sensitive data. Learn more about how to protect sensitive data.
To learn more about Microsoft Security solutions, visit our website. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.
1Cost of a Data Breach Report 2021, Ponemon Institute, IBM. 2021.
2Cyberattacks Against Health Plans, Business Associates Increase, Jill McKeon, HealthITSecurity xtelligent Healthcare Media. January 31, 2022.
3Despite Decades of Hacking Attacks, Companies Leave Vast Amounts of Sensitive Data Unprotected, Cezary Podkul, ProPublica. January 25, 2022.
4Allianz Risk Barometer 2022: Cyber perils outrank Covid-19 and broken supply chains as top global business risk, Allianz Risk Barometer. January 18, 2022.
6Fines for breaches of EU privacy law spike sevenfold to $1.2 billion, as Big Tech bears the brunt, Ryan Browne, CNBC. January 17, 2022.