Encrypted file formats keep sensitive information hidden from unwanted eyes. Encrypting files that hold sensitive information is arguably a feature: many governments, educational institutions, and public and private companies encourage the practice (at least for some users). Unfortunately, encryption usually also hides the data from inspection by security and information governance solutions.
Storing sensitive data in encrypted files works only to a degree: the encryption algorithm and the key (e.g. password) must be of sufficient strength, and the decryption process must not itself compromise the data. It's fair to say that `password` is used too often to encrypt files - not because it's a good encryption key, but because humans are lazy. In addition, to decrypt files, either the password (or key) needs to be shared or a rights management system needs to be employed, which creates some practical barriers. Finally, password cracking tools can often be used to decrypt files containing sensitive information.
Arguably, the bigger risk arises when compression and encryption are combined, as they often are in zip files and other variants. Because a compressed archive can pack a large amount of data into a single file, combining compression with encryption greatly increases the volume of sensitive information at risk. And you can be sure that illicit actors are aware of this when they steal sensitive information.
Even with the above challenges, encrypted files are still generally effective and used by many organizations. Regardless of any potential or practical limitations, there is a great need for better file format handling of encrypted files. Better file format handling includes smart and complete detection and decryption.
What are Encrypted Files?
Before we discuss the solution, let's learn more about encrypted files.
There are a few flavors of encrypted files:
1. application-specific formats - that allow for optional encryption, usually with a password or other form of secret key
2. container type formats - like a `Zip` file (or other variants) that allow for optional encryption of files stored within the container
3. specially designed formats - that rewrite a file into an encryption-specific format
4. hybrid of #1 and #3
Many application-specific formats for word processing, spreadsheets, presentations, and other categories allow for optional file encryption. Users of these applications typically need to turn on encryption (usually by enabling password protection) as they create or save the file in the native application. With no standard way to enable encryption, each application tends to work differently, which sometimes makes the feature difficult to use. Sometimes the encryption excludes the document properties, leaving metadata at risk. Microsoft Office, Apple iWork, OpenOffice, Adobe PDF and many others support optional encryption within their productivity applications. Despite some difficulty with real-world use, many applications do at least allow for encrypted files, and such files are found in the wild.
Besides the ubiquitous `Zip` container (aka archive) file format, there are many others in this category like 7-Zip, RAR, StuffIt (popular on macOS), B1 and many others. Some are old, some are new, and some are common with a particular operating system, while others are available everywhere. Many, but not all, container file formats offer optional encryption alongside their standard compression. And to make matters more complicated, some of these tools offer multiple encryption algorithms and other tuning settings. This results in the need to support all variations. An internet query for `password protecting files with PII` will give you a sense of how many organizations openly encourage the use of encrypted container files.
Several vendors offer specially encrypted file formats like Seclore, Voltage SmartCipher, Microsoft RMS PFILES, OpenPGP, AxCrypt and others. These applications encrypt the original file into a new encrypted file format. To use the original file, the end user needs a key (e.g. password) plus a decryption application or rights management service. Multiple encryption algorithms and other tuning settings are also typically available with these tools. Historically, this approach has been used for specific use cases where security is paramount, but it is now seeing more widespread use.
Seclore and Microsoft also both apply encryption within some applications’ native files like Microsoft Word, Excel and PowerPoint. On the surface, these files look like regular DOCX, XLSX and PPTX files, but are encrypted and rights-managed. A rights management service is used to decrypt. These are particularly tricky since extra inspection is needed to know whether they are encrypted and by which technology.
The sophistication and complexity of encrypted files are surprising.
What is the Solution for Handling Encrypted Files?
The simple answer is to leverage a file format processing technology that can identify as many of these encrypted file formats as possible and extract/decrypt as much of their content as possible. Micro Focus IDOL KeyView is one of those technologies. Other commercially available and open-source alternatives cover, at best, a small subset of what IDOL KeyView supports. This theme is also true for other file format categories like CAD, Business Intelligence (including analytics, databases, and big data), scientific, container (aka archive) and even Office productivity suites.
IDOL KeyView is a collection of embeddable SDKs that enable file format detection, content decryption, metadata and text extraction, subfile processing, non-native rendering, and structured export.
As of the 12.13 release, IDOL KeyView understands over 1700 file formats, more than 50 of which have explicit or optional encryption that KeyView can identify. These 50+ include the most widely used file formats, such as the Microsoft Office, Apple iWork, and OpenOffice suites, as well as PDF, Zip, 7-Zip, RAR, Seclore, SmartCipher and many others. As more applications support optional encryption, KeyView will continue to expand its support.
Let's explore the KeyView Filter SDK capabilities related to encrypted files in more detail:
- detection - smart and complete
- sub-file processing - for XrML data from RMS-protected files used for decryption
- inline decryption
File format detection must be smart and complete. Smart means not being fooled by bogus or non-existent file extensions. Complete means reporting the format versions and encryption status while covering as many types of file formats as possible.
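To make the "smart" part concrete, here is a minimal Python sketch of content-based detection. It is not how KeyView works internally, and the names `SIGNATURES`, `sniff` and `check_file` are made up for this illustration. It relies on two well-known facts: many formats start with a fixed byte signature, and password- or RMS-protected Office Open XML files are stored in an OLE2 compound file rather than in a plain Zip container.

```python
from pathlib import Path

# Well-known leading byte signatures for a few formats named in this article.
# This short table is illustrative only; a real detector covers far more.
SIGNATURES = [
    (b"PK\x03\x04", "zip"),                          # Zip, incl. plain DOCX/XLSX/PPTX
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),   # OLE2 compound file
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"Rar!\x1a\x07", "rar"),
    (b"%PDF-", "pdf"),
]

OOXML_EXTENSIONS = {".docx", ".xlsx", ".pptx"}


def sniff(header: bytes) -> str:
    """Return a coarse format name based on the leading bytes only."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def check_file(path) -> dict:
    """Compare what the extension claims with what the content says."""
    path = Path(path)
    with path.open("rb") as handle:
        detected = sniff(handle.read(16))
    extension = path.suffix.lower()
    # An Office Open XML extension on an OLE2 container is a hint (not proof)
    # that the document is encrypted or rights-managed.
    hint = extension in OOXML_EXTENSIONS and detected == "ole2"
    return {
        "extension": extension,
        "detected": detected,
        "possibly_encrypted_ooxml": hint,
    }
```

A check like this only looks at the first few bytes, so it cannot report format versions or the encryption technology in use. That deeper inspection is what "complete" detection adds.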
Here's the output for a Zip file encrypted with Secure Zip SmartCrypt (scenario #2 above):
KeyView reports a unique format code for this file. The Attributes: 2048 means it is encrypted with SmartCrypt. The File Class: 8 is for the encapsulation category.
Here's the example output for a SmartCipher encrypted text file (scenario #3 above):
The unique KeyView format code for a SmartCipher file is 1255. The Attributes: 1 means it is encrypted. The File Class: 8 is for the encapsulation category.
Here's the output for a Microsoft Word (DOCX) file protected with Microsoft RMS via Azure Information Protection (AIP) (scenario #4 above):
The unique KeyView format code for an encrypted Office file is 370. Due to encryption, a more precise detection based on the contents is not possible. This behavior is typical and occurs with Apple iWork and others. The Attributes: 33 means it is encrypted with Microsoft RMS. The File Class: 12 is for the miscellaneous category.
Sometimes just knowing that a file is encrypted is enough. But it is more valuable if you can decrypt it and get out the interesting bits for further inspection and processing.
Certain types of encrypted files contain embedded XrML data. XrML stands for eXtensible Rights Markup Language, now maintained as an ISO standard. Microsoft RMS-protected files use XrML. Here's an example of KeyView extracting the embedded XrML file:
The data within the XrML file is essential to being able to decrypt RMS-protected files. KeyView Filter SDK (and Panopticon SDK) can pass on the encrypted file including the XrML data and the necessary credentials to an AIP server for decryption.
Here’s an example of KeyView decrypting and extracting the text from an RMS encrypted file:
For encrypted files protected by other encryption means, 3rd party password cracking tools employ various techniques to generate the decryption password. The KeyView Filter SDK API can accept credentials (e.g. password) when processing a file.
Here's the output when trying to extract sub-files from a password-protected zip file:
Notice that KeyView Filter SDK reports a return code: 8 meaning Password Protected. In case you didn't know, Zip files can contain both encrypted and unencrypted files.
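The point about mixed archives can be checked independently of KeyView. The sketch below uses Python's standard `zipfile` module, not the KeyView API, and the helper names `entry_encryption` and `summarize` are invented for this illustration. It reads bit 0 of each entry's general-purpose flag, which the Zip format uses to mark an entry as encrypted.

```python
import zipfile

ENCRYPTED_FLAG = 0x1  # bit 0 of the Zip general-purpose bit flag


def entry_encryption(source) -> dict:
    """Map each entry name to True if the archive flags it as encrypted.

    `source` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return {
            info.filename: bool(info.flag_bits & ENCRYPTED_FLAG)
            for info in archive.infolist()
        }


def summarize(source) -> str:
    """Describe the archive as a whole, based on its per-entry flags."""
    flags = entry_encryption(source)
    if not flags:
        return "empty"
    if all(flags.values()):
        return "fully encrypted"
    if any(flags.values()):
        return "partially encrypted"
    return "not encrypted"
```

Listing the entries does not require the password because only the entry contents are encrypted, while the names and flags stay readable. This sketch assumes the archive's central directory itself is not encrypted.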
Here's the output of the same sub-file extraction, but when the password is provided:
The sub-file is extracted and now can be further processed.
For security and information governance applications to provide smart policies that permit and/or block the transmission of potentially sensitive information, or identify it for legal and compliance purposes, these applications need to know:
- if an attachment to an email is an encrypted file (or not),
- if what's being transmitted to the cloud or simply being stored is encrypted (or not),
- the specific flavor of the encrypted file, to determine if it's permitted (or not),
- and, when possible, how to decrypt the file for deeper inspection of the content.
These are things that IDOL KeyView excels at.
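As a rough illustration of how the checks listed above might feed a policy decision, here is a small Python sketch. Everything in it is hypothetical: the `Detection` record, the `PERMITTED_SCHEMES` allow-list and the `decide` function are invented for this example. None of them is a KeyView data structure or API, and a real policy would be driven by the detection results of whichever SDK is in use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection result; not a KeyView data structure."""
    format_name: str
    encrypted: bool
    scheme: Optional[str] = None          # e.g. "rms"; None when unknown
    credentials_available: bool = False   # can we decrypt for inspection?


# Illustrative allow-list of encryption flavors an organization permits.
PERMITTED_SCHEMES = {"rms", "smartcipher"}


def decide(detection: Detection) -> str:
    """Return a policy action for one attachment, upload or stored file."""
    if not detection.encrypted:
        return "inspect"                  # content is readable as-is
    if detection.scheme not in PERMITTED_SCHEMES:
        return "block"                    # unknown or disallowed flavor
    if detection.credentials_available:
        return "decrypt-and-inspect"      # deeper inspection is possible
    return "allow-uninspected"            # permitted flavor, opaque content
```

For example, `decide(Detection("zip", True, "zip-password"))` returns `"block"` under this allow-list, while an RMS-protected document with credentials available is routed to `"decrypt-and-inspect"`.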
You can learn more at https://www.microfocus.com/keyview.
The Micro Focus IM&G team
Know your data | empower your people | drive your future Join our community | @microfocusimg | www.microfocus.com