The explosion of unstructured data requires organizations to think carefully about how they protect their data. Organizations must take a holistic yet balanced approach that emphasizes robust data discovery and strong protection capabilities to mitigate threats to sensitive data. Many companies are moving away from relational database management systems (RDBMS), which hold structured data, to technologies like NoSQL (not only SQL) because of their flexibility and powerful search capabilities over data in its native form. The use of data lakes and elastic search plays a huge role in this. IDC analysts predict that data will grow to 163 zettabytes by 2025, of which 80% to 90% will be unstructured.
What Are Common Forms of Unstructured Data?
Unstructured data can come in many forms and can consist of any of the following:
Human-generated – includes data in the form of business productivity files and documents (Word/Excel), YouTube videos, social media posts, text messages, audio files, email messages, text files, and reports to name a few.
Machine-generated – includes data in the form of medical images, sensor data, scientific data, system logs and satellite images.
Unstructured data can be complex in nature, which makes it difficult to determine an appropriate security policy to enforce. The data can range from PII (Personally Identifiable Information), financial records, intellectual property, and trade secrets to sensitive human resources (HR) information. There is no consistency to unstructured data, and it can be dispersed across mixed cloud environments, cloud storage locations, and on-premises systems. Managing access to unstructured data can introduce various attack vectors that could expose sensitive data, so mitigating threats to sensitive data dispersed across disparate cloud environments requires a comprehensive approach.
DLP Technology Falls Short
DLP, also known as Data Loss Prevention, is a technology used to ensure sensitive data is not lost, misused, or accessed by unauthorized users. It monitors, detects, and blocks sensitive data from being exfiltrated and leaving the enterprise environment. However, DLP technology has not kept pace with the growth of unstructured data in its various forms, modern computing, and cloud acceleration. Market signals are becoming clearer: DLP technology is not dynamic enough to identify unstructured data and enforce policy that prevents its leakage and exfiltration.
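To make the monitor, detect, and block steps concrete, here is a minimal sketch of pattern-based content inspection. It is an illustration only, not how any particular DLP product works: the rule names, the regular expressions, and the `scan` and `allow_transfer` functions are all hypothetical.

```python
import re

# Hypothetical detection rules for illustration; not taken from any product.
RULES = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scan(text):
    """Return the sorted names of every rule that matches the text."""
    return sorted(name for name, pattern in RULES.items() if pattern.search(text))


def allow_transfer(text):
    """Return False (block) when any rule matches the outbound content."""
    return not scan(text)
```

In this sketch, a note containing `123-45-6789` is blocked, while the same number written as `123 45 6789` passes because it no longer matches the static pattern. Free-form, inconsistently formatted content is where fixed rules of this kind tend to struggle.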
The complexity associated with unstructured data due to user and machine generated content, the lack of consistency for accessing and managing unstructured data, the complexity associated with managing access controls, and the lack of granular security controls on files created by users and machines all contribute to the risk of exposure for unstructured data.
A recent customer study conducted by Forrester revealed that customers are moving away from DLP technologies because they have fallen short in addressing modern data security challenges. The study summarizes four findings:
- DLP solutions do not fully support evolving security needs and requirements.
- Current DLP solutions are underutilized, partially because data security pros find the capabilities difficult to manage.
- Companies are investing in improvements to help with threat intelligence and improve incident detection, investigation, and response.
- Companies are taking a mix of user- and data-centric approaches to innovative solutions.
In addition to these points from the Forrester research, today’s workforce and users are more dispersed, which adds to the complexity of DLP technology.
One of the biggest drawbacks the study points out is that DLP lacks visibility and responsiveness to threats. This tremendously reduces an organization’s ability to identify and detect insider threat activity targeting troves of sensitive data in cloud and on-premises environments. Blind spots in data security strategies introduce significant risk to organizations, and that risk is often not felt until threat actors have exfiltrated data or walked it out of the organization. The customer study further asks respondents (as shown in Figure 1), “What steps is your company taking to address current gaps in insider threat capabilities?”
The customer study concludes that protecting data against threats (internal and external) requires a new approach, given that current technology has limited data classification capabilities and complicated rule creation, which leaves it underutilized and ineffective. The main goal should be to identify, detect, and prioritize risks quickly to minimize the potential damage from exposed sensitive data.
Developing a DLP + P Strategy
Threat intelligence curated from known attacks often points to data exfiltration as one of the main goals threat actors look to achieve. Threat actors target and attack organizations looking for several types of sensitive data in various forms: nation-state actors look for intellectual property and trade secrets, others look for PII of known users and partners that can be used in further targeted attacks, and others look for competitive data that can be used to impact business and critical mission functions.
Threat actors use various MITRE ATT&CK (Adversarial Tactics, Techniques, and Common Knowledge) techniques and sub-techniques to gain access to and exfiltrate sensitive data, as shown in Figure 2. MITRE ATT&CK is curated threat intelligence from known attacks; it clarifies the attack methods threat actors use to achieve their goals (known as tactics).
To address the shortcomings in DLP technology and evolving threats to sensitive data, it is important to look for ways to build synergies through cross pillar (CxP) capabilities to help organizations reduce their risk around an expanded attack surface from the explosion of unstructured data.
I believe a core aspect of this entails formulating a holistic framework and strategy called Data Loss Prevention and Protection (DLPP). While DLP is a technology and set of tools, DLPP is a strategy that emphasizes the need to implement visibility, governance, and secure access capabilities for sensitive data.
It should be noted that one of the key findings in Gartner’s Market Guide for Data Loss Prevention suggests that DLP technology lacks strong data governance, which results in inconsistent use cases and requirements and makes it tough for DLP technology to succeed in today’s environments.
A holistic DLPP strategy incorporates the following:
- A robust Data Governance program
  - Data Governance manages information through its entire lifecycle: how it is published, stored, archived, and retired or disposed of. As part of a Data Governance strategy, it is important to define data classification, handling, and retention requirements.
  - Assign departmental data governance ownership tasks to Business Data Owners, who best understand the data and can define the types of tools needed to manage access to and governance of sensitive data.
  - There should also be language that defines allowed access and how threats to sensitive data will be monitored.
- Data Access Governance (DAG) should be part of a holistic Data Governance strategy from both a data and an identity perspective. DAG focuses on identifying and addressing threats that can come from inappropriate access to sensitive unstructured data. This is done by building and using identity constructs to govern access to mission-critical data.
  - DAG incorporates data classification and discovery to properly protect sensitive data.
  - Monitors and applies role-based policies to govern access to sensitive data.
  - Maps out how data, location, content, access structure, roles, and users are related.
  - Establishes visibility and baselines to determine who has access to sensitive data, and assesses the type of access and how it was obtained.
- Establish monitoring and alerting for detecting threats to sensitive data
  - Gain insight and visibility into unprotected sensitive data and file sharing across the enterprise
  - Leverage telemetry from User and Entity Behavior Analytics (UEBA) to identify potential threats and anomalous access to sensitive data
  - Gain insight and visibility into potential data exfiltration activity
  - Monitor access, permissions, and changes to sensitive data
  - Alert data owners of potential changes
- Remediate, mitigate, and take action
  - Reduce and eliminate duplicate and stale data to minimize the footprint of sensitive data
  - Reduce, eliminate, and correct issues relating to disposition, ownership, content, and rights
  - Conduct access reviews to ensure that only “authorized” users have the right level of access to sensitive data; assess risk against business needs
  - Remediate and correct issues with the location, security, and ownership of sensitive data
  - Update policies as needed to ensure security controls for sensitive data are sufficient to minimize exposure
- Data Protection: apply file-level encryption to sensitive data
  - Protect sensitive files by encrypting sensitive data on disk (at rest) and in transit
  - Apply lockdown policies to secure the locations of sensitive data
  - Apply fencing policies based on groups or object levels to ensure the right access to sensitive data; otherwise, deny access.
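The access review and fencing items above can be sketched in a few lines of code. This is a simplified illustration under stated assumptions, not a description of any product: the `Resource` type, the `is_allowed` and `access_review` functions, and the sample group and path names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    path: str
    classification: str        # assumed labels, e.g. "public" or "sensitive"
    allowed_groups: frozenset  # the "fence": groups permitted to access


def is_allowed(user_groups, resource):
    """Fencing check: grant only if the user is in a fenced group, otherwise deny."""
    return bool(set(user_groups) & resource.allowed_groups)


def access_review(grants, memberships, resources):
    """Return (user, path) grants on sensitive resources that fall outside the fence."""
    findings = []
    for user, path in grants:
        resource = resources[path]
        if resource.classification != "sensitive":
            continue
        # Users with no known group membership are treated as outside the fence.
        if not is_allowed(memberships.get(user, ()), resource):
            findings.append((user, path))
    return findings


resources = {
    "/hr/payroll.xlsx": Resource("/hr/payroll.xlsx", "sensitive", frozenset({"hr"})),
    "/pub/menu.txt": Resource("/pub/menu.txt", "public", frozenset()),
}
memberships = {"alice": {"hr"}, "bob": {"eng"}}
grants = [
    ("alice", "/hr/payroll.xlsx"),
    ("bob", "/hr/payroll.xlsx"),
    ("bob", "/pub/menu.txt"),
    ("carol", "/hr/payroll.xlsx"),
]
findings = access_review(grants, memberships, resources)
```

Here `findings` lists the grants for `bob` and `carol` on the payroll file, which a data owner could then review against business need. The deny-by-default check means an unknown user or an empty fence never yields access.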
Implementing a DLPP Strategy
DLP will eventually fade into the sunset as more organizations require a more holistic approach to protecting sensitive data. DLPP does not emphasize technology; instead, it incorporates visibility, protection, automation, and governance to manage the risks associated with unstructured data. A core component of a DLPP strategy is DAG, which aligns well with Voltage File Analysis Suite (FAS) and the NetIQ suite to secure sensitive data and control who has access to it. Leveraging CxP capabilities to formalize a DLPP strategy is an excellent building block for any zero-trust architecture.
DLP was designed to focus on protecting sensitive data and did a poor job of incorporating the identity of the data’s users. Securing access to sensitive data is one of the goals outlined in NIST (National Institute of Standards and Technology) SP (Special Publication) 800-207, the zero-trust guidance. Without strong identity and access controls, the likelihood of insider threats and account abuse aimed at exfiltrating sensitive data increases. A sound strategy for prevention and protection in many ways mitigates the loss of sensitive data.
DLPP should be an essential component of an organization’s zero-trust journey. Implementing a DLPP strategy gives organizations the visibility, protection, automation, and governance needed to bolster their cyber resiliency against data breaches.