
How to Persistently Protect Healthcare Data

by   in Cybersecurity

In the healthcare industry, it’s all about patient data. Doctors and hospitals collect patient data for diagnosis and treatment. Clinicians, pharmacies, and insurance companies then use that data for various predictive analytics use cases. Here are a few of the most common ones in healthcare:

  1. Predictive analytics to determine the likelihood of certain medical conditions based on current health parameters, e.g., diabetes, stroke, and severe heart-related conditions.
  2. Disease progression in early stages
  3. Determining the best possible line of treatment
  4. Detecting anomalies/frauds in insurance claims

Predictive analytics is performed on patients’ private information, namely Protected Health Information (PHI) and Personally Identifiable Information (PII). Major healthcare providers have now started to use modern cloud platforms for data storage and processing. As you can imagine, this introduces risk to the sensitive data and also leaves healthcare organizations striving to comply not only with regional regulations such as GDPR, CCPA, and LGPD, but also with sector-specific regulations such as HIPAA and HITECH.

Healthcare data is very complex: it exists in diverse formats and is scattered across locations. Here are a few challenges that make it difficult to share across enterprise boundaries and beyond.

  1. Most healthcare data is stored in multiple places: it is collected through different devices and gadgets, and at times recorded manually by doctors for medical diagnoses and prescriptions.
  2. Healthcare data exists in both structured and unstructured formats, such as electronic health records (EHR), electronic medical records (EMR), text, images, audio, and video files, which makes data fusion and predictive analytics even more difficult.
  3. The growing number of reported healthcare data breaches requires these organizations to protect sensitive customer data while still leveraging secure cloud and web analytics to support data-driven decision making.
  4. Organizations need to reduce cost and risk while safely enabling data monetization, yet many enterprises lack the tools to tackle this complex problem while complying with strict security and industry regulations.

Let’s use a business scenario to understand the nuances of healthcare data and the solution we offer through our portfolio product, Voltage SecureData. The example shows how Voltage SecureData addresses the challenges above while allowing organizations to meet compliance and regulatory requirements. Here is the scenario:

A global healthcare and health solutions company, which is also a provider of health insurance pharmacy benefits, holds very sensitive data about its customers. Its IT organization is an early adopter of advanced technologies to support business initiatives, while keeping customer data security and privacy at the forefront of the requirements for all its projects. The company identified the following problems and requirements:

  1. Hosting sensitive personal and healthcare data in a data lake posed a major security challenge. Anyone who needed to carry out predictive or regular analytics had access to all the data in the data lake. In addition, the limited number of data scientists couldn’t keep up with the analytics needs of the business.
  2. They needed a solution that performs lexical analysis of the data to recognize PII/PHI and automatically anonymizes it using customizable rules provided by the organization. The solution also had to identify sensitive information in different types of data, such as electronic health records (EHR), electronic medical records (EMR), text, images, audio, and video files. For audio, video, and image files, the data is extracted and analyzed for sensitive information. The extracted data is then de-identified in a way that preserves its usability in encrypted/tokenized form.
  3. They needed this de-identified data to be persistently encrypted/tokenized while remaining accessible to developers, marketers, and other functions, to accelerate insights and get value from the technology investment, whether on-premises, in a single cloud, or in a multi-cloud environment.
  4. They needed a solution that can easily be used with virtually any system, ranging from decades-old custom applications to the latest enterprise programs.
  5. They needed a solution that offers various techniques to mask information in the data while preserving its integrity, so that meaningful analytics can be performed and the data can be shared across enterprise boundaries and beyond while complying with regulations.
  6. The solution should also provide a way for users to validate the data for accuracy before creating shareable datasets.
  7. The solution should meet the HIPAA Safe Harbor provision, which is part of the HIPAA Privacy Rule that limits the possible uses and disclosures of protected health information. The Safe Harbor method is a method of de-identifying protected health information that provides prescriptive guidance on how certain data elements need to be de-identified. Per that guidance, the specified identifier fields are anonymized before the data is shared across organizations or entities.
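
To make requirement 2 more concrete, here is a minimal Python sketch of rule-based lexical detection and anonymization. It is not the Voltage implementation or API: the `RULES` table, the `find_sensitive` and `anonymize` helpers, and the three regular expressions are all assumptions chosen for illustration, and a real product would rely on much richer, context-aware detection.

```python
import re

# Hypothetical, customizable rule set: label -> pattern.
# These patterns are illustrative only and far from exhaustive.
RULES = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def find_sensitive(text, rules=RULES):
    """Return (label, matched_text) pairs for every rule hit, in text order."""
    hits = []
    for label, pattern in rules.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(hits)]


def anonymize(text, rules=RULES):
    """Replace every rule hit with a bracketed label such as [SSN]."""
    for label, pattern in rules.items():
        text = pattern.sub(f"[{label}]", text)
    return text


note = "Patient reachable at 555-867-5309 or jane.doe@example.org, SSN 123-45-6789."
print(find_sensitive(note))
print(anonymize(note))
```

Because the rules live in a plain dictionary, an organization could add or tighten patterns without touching the detection logic, which is the kind of customization the requirement describes.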

The CyberRes Data Privacy and Protection portfolio offers a set of solutions that addresses all of the above requirements for this company:

1. Voltage File Analysis Suite for Unstructured Sensitive Healthcare Data - A cloud solution with the ability to quickly find sensitive healthcare data, classify high-risk data, and secure data to minimize privacy risk. It has contextually aware entity detection for 39+ countries and the latest privacy regulations. It allows quick estimation of the amount of sensitive data, prioritizes critical data, and protects data access.

Voltage File Analysis Reference Architecture

Here is a quick snapshot of File Analysis Suite (FAS) capabilities 

  • Assess sensitive healthcare data risk: Scan your healthcare data to discover what sensitive healthcare data is held, where it is located, and the amount of risk it carries.
  • Tag and classify sensitive healthcare data: After discovering your sensitive healthcare data, classify it with customizable tags for better organization.

File Analysis Suite

  • Optical character recognition: Ensure images, scanned documents, and other media with sensitive healthcare information are classified and protected.

2. Voltage Structured Data Manager - SDM manages structured healthcare data over its entire lifecycle, providing data discovery, insight, protection, and management while reducing the TCO of application infrastructure.

Voltage SDM Reference Architecture

Here is a quick snapshot of SDM capabilities: 

  • Privacy protection: Discover, analyze, and protect sensitive healthcare data and continuously monitor and manage the data lifecycle.
  • Data discovery: Scan for personal and sensitive healthcare data in databases, classify your data and generate remediation processes.
  • Test data management: Automate privacy and protection of sensitive production data for use in pipelines for testing, training, and QA.

3. Voltage SecureData Enterprise - Secure data continuously with our leading format-preserving enterprise data protection techniques to address privacy compliance.

Options for Enabling Secure Data Analytics with Voltage SecureData

Here is a quick snapshot of Voltage SecureData Enterprise capabilities: 

  • Continuous data protection: A cyber-resilient enterprise data protection platform that protects data over its entire lifecycle.
  • Data-centric, proven at a global scale: Protection where the access policy travels with the data itself, without changes to format or integrity. Voltage encryption, tokenization, hashing, and masking protect structured and unstructured data, with reversible or one-way protection.
  • Cloud-native data protection: Data-centric security ideal for the safe deployment of applications, data, and analytics in the cloud. Tokenize and encrypt any data:
    • PCI, PII, PHI data
    • Intellectual property
    • IoT, geolocation
    • US, Latin, Unicode

Let’s look at a quick scenario-based illustration of how our format-preserving encryption preserves referential integrity. A typical healthcare database has multiple tables storing sensitive healthcare data, with a specific identifier common across them. In the image below, a national identity number is the unique identifier:

It's all about the data

When data masking is performed, it obfuscates all the sensitive data elements with crosses, which destroys the value of the protected healthcare data and makes it unusable for predictive analytics:

Data Masking

With the format-preserving encryption offered by Voltage SecureData, referential integrity is preserved, so predictive data analytics can be executed as the business requires while meeting regional and sector-specific regulatory compliance requirements:

Referential Integrity
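
The contrast between masking and format-preserving protection can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Voltage SecureData code: `pseudonymize_digits` is a hypothetical stand-in that uses a keyed HMAC to map a digit string to another digit string of the same length. Unlike true format-preserving encryption it is not reversible, but it shares the two properties that matter here: the format is kept, and the same input always yields the same output. The key, table contents, and helper names are invented for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-do-not-use-in-production"  # hypothetical key


def pseudonymize_digits(value, key=SECRET_KEY):
    """Deterministically map a digit string to a digit string of equal length.

    Stand-in for FPE: keyed and repeatable, but NOT reversible.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    number = int.from_bytes(digest, "big")
    return str(number % (10 ** len(value))).zfill(len(value))


def mask(value):
    """Classic masking: every character becomes an X."""
    return "X" * len(value)


def protect(rows, column, fn):
    """Return copies of the rows with one column transformed by fn."""
    return [{**row, column: fn(row[column])} for row in rows]


def join_count(left, right, column):
    """Count the row pairs that share the same value in the join column."""
    return sum(1 for l in left for r in right if l[column] == r[column])


patients = [
    {"national_id": "123456789", "condition": "diabetes"},
    {"national_id": "987654321", "condition": "stroke"},
]
claims = [
    {"national_id": "123456789", "amount": 250},
    {"national_id": "123456789", "amount": 90},
    {"national_id": "987654321", "amount": 1200},
]

masked_patients = protect(patients, "national_id", mask)
masked_claims = protect(claims, "national_id", mask)
protected_patients = protect(patients, "national_id", pseudonymize_digits)
protected_claims = protect(claims, "national_id", pseudonymize_digits)

print("original join pairs: ", join_count(patients, claims, "national_id"))
print("masked join pairs:   ", join_count(masked_patients, masked_claims, "national_id"))
print("protected join pairs:", join_count(protected_patients, protected_claims, "national_id"))
```

After masking, every identifier is identical, so each claim appears to belong to each patient and the join is meaningless. With the deterministic, format-preserving stand-in, each claim still lines up with its own patient even though the real identifier is no longer visible.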

Additionally, under HIPAA, personally identifiable patient information such as dates of birth and postal codes needs special handling. Such requirements can be met through the various protection techniques offered by Voltage SecureData, namely partial FPE, FPE, and anonymization or generalization.
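
As a rough illustration of generalization, the Python sketch below follows the HIPAA Safe Harbor treatment of these two fields: dates are reduced to the year, ages over 89 are folded into a single "90 or older" category, and ZIP codes are cut to their first three digits, or to 000 when that three-digit area covers 20,000 or fewer people. The function names and the contents of `LOW_POPULATION_ZIP3` are assumptions for the example, and this is not how Voltage SecureData is configured.

```python
from datetime import date

# Illustrative placeholder only: the authoritative list of sparsely populated
# three-digit ZIP prefixes must come from current Census Bureau data.
LOW_POPULATION_ZIP3 = {"036", "059"}


def generalize_zip(zip_code, low_population=LOW_POPULATION_ZIP3):
    """Keep the first three digits, or return 000 for low-population areas."""
    prefix = zip_code[:3]
    return "000" if prefix in low_population else prefix


def generalize_birth_date(dob, today):
    """Keep only the birth year; aggregate ages over 89 into '90+'."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    age = today.year - dob.year - (0 if had_birthday else 1)
    return "90+" if age > 89 else str(dob.year)


reference_day = date(2024, 1, 1)
print(generalize_zip("94107"))
print(generalize_zip("03601"))
print(generalize_birth_date(date(1980, 5, 17), reference_day))
print(generalize_birth_date(date(1930, 1, 1), reference_day))
```

Generalized values lose precision on purpose, so they suit aggregate analytics rather than record-level lookups, where a reversible technique such as FPE is the better fit.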


The healthcare sector’s specific needs around predictive analytics and data sharing expose it to data security risks, because the data contains personally identifiable information (PII) and protected health information (PHI). If an organization cannot persistently protect sensitive healthcare data and perform analytics on it in a protected format, it might not be able to use this data for further research, treatment, and prediction of health conditions.

Voltage Data Discovery and Protection solutions ensure that sensitive healthcare data is identified and tagged, and the exchange of sensitive data occurs in a protected format without any risk of PII/PHI being exposed. With Voltage solutions, healthcare organizations can take full advantage of the benefits of cloud analytics platforms, including SaaS-based web analytics, prevent exposure of sensitive customer data, and realize substantial cost savings in the process.

Connect With Us

Join our Voltage Data Privacy and Protection Community. Keep up with the latest Tips & Info about Data Privacy and Protection. We’d love to hear your thoughts on this blog.


  • Hey Vishwameet,

    As someone knee-deep in the world of data security, I can totally relate to the challenges healthcare organizations face with diverse data formats and scattered locations. Been there, done that.

    Your mention of Voltage SecureData caught my eye. I've actually used it in a similar scenario. The lexical analysis feature, recognizing PII/PHI, is a game-changer. I had a client facing the same issue of data lake vulnerability. Voltage came to the rescue by automating anonymization based on custom rules, ensuring only relevant access. Plus, the persistently encrypted/tokenized deidentified data was accessible to various teams, from developers to marketers.

    The File Analysis Suite and Structured Data Manager seem spot-on for sensitive healthcare data risk assessment and management. So, I gotta give props to Voltage for keepin' it real with their format-preserving encryption. That stuff is crucial for predictive analytics, ya feel me? It's all about maintainin' that referential integrity, and Voltage's got it on lock. Mad respect for that game-changing move.

    So, peep this dope article I stumbled upon, fam! It's called "EHR/EMR Systems: Top Benefits Of Medical Devices Integration for Your Healthcare Businesses" It breaks down the mad benefits of hooking up electronic health/medical records systems for healthcare biz. Considering our chat about data security and healthcare solutions, I figured you might vibe with it. Check it out )))

    My hot tip? Dive into Vishwameet and check out Voltage's dope offerings for a mind-blowing solution. It's been an absolute unique experience for me, a total game-changer.