This post was originally published on the ParagonIE blog and reposted here with their permission.
We [ParagonIE] get asked the same question a lot (or some remix of it).
This question shows up from time to time in open source encryption libraries bug trackers. This was one of the weird problems covered in my talk at B-Sides Orlando (titled Building Defensible Solutions to Weird Problems), and weve previously dedicated a small section to it in one of our white papers.
You know how to search database fields, but the question is, How do we securely encrypt database fields but still use these fields in search queries?
Our secure solution is rather straightforward, but the path between most teams asking that question and discovering our straightforward solution is fraught with peril: bad designs, academic research projects, misleading marketing, and poor threat modeling.
If youre in a hurry, feel free to skip ahead to the solution.
Lets start with a simple scenario (which might be particularly relevant for a lot of local government or health care applications):
Lets first explore the flaws with the obvious answers to this problem.
The most obvious answer to most teams (particularly teams that dont have security or cryptography experts) would be to do something like this:
In the above snippet, the same plaintext always produces the same ciphertext when encrypted with the same key. But more concerning with ECB mode is that every 16-byte chunk is encrypted separately, which can have some extremely unfortunate consequences.
Formally, these constructions are not semantically secure: If you encrypt a large message, you will see blocks repeat in the ciphertext.
In order to be secure, encryption must be indistinguishable from random noise to anyone that does not hold the decryption key. Insecure modes include ECB mode and CBC mode with a static (or empty) IV.
You want non-deterministic encryption, which means each message uses a unique nonce or initialization vector that never repeats for a given key.
There is a lot of academic research going into such topics as homomorphic, order-revealing, and order-preserving encryption techniques.
As interesting as this work is, the current designs are nowhere near secure enough to use in a production environment.
For example, order-revealing encryption leaks enough data to infer the plaintext.
Homomorphic encryption schemes are often repackaging vulnerabilities (practical chosen-ciphertext attacks) as features.
As weve covered in a previous blog post, when it comes to real-world cryptography, confidentiality without integrity is the same as no confidentiality. What happens if an attacker gains access to the database, alters ciphertexts, and studies the behavior of the application upon decryption?
Theres potential for ongoing cryptography research to one day produce an innovative encryption design that doesnt undo decades of research into safe cryptography primitives and cryptographic protocol designs. However, were not there yet, and you dont need to invest into a needlessly complicated research prototype to solve the problem.
I dont expect most engineers to arrive at this solution without a trace of sarcasm. The bad idea here is, because you need secure encryption (see below), your only recourse is to query every ciphertext in the database and then iterate through them, decrypting them one-by-one and performing your search operation in the application code.
If you go down this route, you will open your application to denial of service attacks. It will be slow for your legitimate users. This is a cynics answer, and you can do much better than that, as well demonstrate below.
Lets start by avoiding all the problems outlined in the insecure/ill-advised section in one fell swoop: All ciphertexts will be the result of an authenticated encryption scheme, preferably with large nonces (generated from a secure random number generator).
With an authenticated encryption scheme, ciphertexts are non-deterministic (same message and key, but different nonce, yields a different ciphertext) and protected by an authentication tag. Some suitable options include: XSalsa20-Poly1305, XChacha20-Poly1305, and (assuming its not broken before CAESAR concludes) NORX64-4-1. If youre using NaCl or libsodium, you can just use crypto_secretbox here.
Consequently, our ciphertexts are indistinguishable from random noise, and protected against chosen-ciphertext attacks. Thats how secure, boring encryption ought to be.
However, this presents an immediate challenge: We cant just encrypt arbitrary messages and query the database for matching ciphertexts. Fortunately, there is a clever workaround.
Before you begin, make sure that encryption is actually making your data safer. It is important to emphasize that encrypted storage isnt the solution to securing a CRUD app thats vulnerable to SQL injection. Solving the actual problem (i.e. preventing the SQL injection) is the only way to go.
If encryption is a suitable security control to implement, this implies that the cryptographic keys used to encrypt/decrypt data are not accessible to the database software. In most cases, it makes sense to keep the application server and database server on separate hardware.
Possible use-case: Storing social security numbers, but still being able to query them.
In order to store encrypted information and still use the plaintext in SELECT queries, were going to follow a strategy we call blind indexing. The general idea is to store a keyed hash (e.g. HMAC) of the plaintext in a separate column. It is important that the blind index key be distinct from the encryption key and unknown to the database server.
For very sensitive information, instead of a simple HMAC, you will want to use a key-stretching algorithm (PBKDF2-SHA256, scrypt, Argon2) with the key acting as a static salt, to slow down attempts at enumeration. We arent worried about offline brute-force attacks in either case, unless an attacker can obtain the key (which must not stored in the database).
So if your table schema looks like this (in PostgreSQL flavor):
You would store the encrypted value in humans.ssn. A blind index of the plaintext SSN would go into humans.ssn_bidx. A naive implementation might look like this:
A more comprehensive proof-of-concept is included in the supplemental material for my B-Sides Orlando 2017 talk. Its released under the Creative Commons CC0 license, which for most people means the same thing as public domain.
Depending on your exact threat model, this solution leaves two questions that must be answered before it can be adopted:
Given our example above, assuming your encryption key and your blind index key are separate, both keys are stored in the webserver, and the database server doesnt have any way of obtaining these keys, then any attacker that only compromises the database server (but not the web server) will only be able to learn if several rows share a social security number, but not what the shared SSN is. This duplicate entry leak is necessary in order for indexing to be possible, which in turn allows fast SELECT queries from a user-provided value.
Furthermore, if an attacker is capable of both observing/changing plaintexts as a normal user of the application while observing the blind indices stored in the database, they can leverage this into a chosen-plaintext attack, where they iterate every possible value as a user and then correlate with the resultant blind index value. This is more practical in the HMAC scenario than in the e.g. Argon2 scenario. For high-entropy or low-sensitivity values (not SSNs), the physics of brute force can be on our side.
A much more practical attack for such a criminal would be to substitute values from one row to another then access the application normally, which will reveal the plaintext unless a distinct per-row key was employed (e.g. hash_hmac('sha256', $rowID, $masterKey, true) could even be an effective mitigation here, although others would be preferable). The best defense here is to use an AEAD mode (passing the primary key as additional associated data) so that the ciphertexts are tied to a particular database row. (This will not prevent attackers from deleting data, which is a much bigger challenge.)
Compared to the amount of information leaked by other solutions, most applications threat models should find this to be an acceptable trade-off. As long as youre using authenticated encryption for encryption, and either HMAC (for blind indexing non-sensitive data) or a password hashing algorithm (for blind indexing sensitive data), its easy to reason about the security of your application.
However, it does have one very serious limitation: It only works for exact matches. If two strings differ in a meaningless way but will always produce a different cryptographic hash, then searching for one will never yield the other. If you need to do more advanced queries, but still want to keep your decryption keys and plaintext values out of the hands of the database server, were going to have to get creative.
It is also worth noting that, while HMAC/Argon2 can prevent attackers that do not possess the key from learning the plaintext values of what is stored in the database, it might reveal metadata (e.g. two seemingly-unrelated people share a street address) about the real world.
Possible use-case: Encrypting peoples legal names, and being able to search with only partial matches.
Lets build on the previous section, where we built a blind index that allows you to query the database for exact matches.
This time, instead of adding columns to the existing table, were going to store extra index values into a join table.
The reason for this change is to normalize our data structures. You can get by with just adding columns to the existing table, but its likely to get messy.
The next change is that were going to store a separate, distinct blind index per column for every different kind of query we need (each with its own key). For example:
Every index needs to have a distinct key, and great pains should be taken to prevent blind indices of subsets of the plaintext from leaking real plaintext values to a criminal with a knack for crossword puzzles. Only create indexes for serious business needs, and log access to these parts of your application aggressively.
Thus far, all of the design propositions have been in favor of allowing developers to write carefully considered SELECT queries, while minimizing the number of times the decryption subroutine is invoked. Generally, that is where the train stops and most peoples goals have been met.
However, there are situations where a mild performance hit in search queries is acceptable if it means saving a lot of disk space.
The trick here is simple: Truncate your blind indexes to e.g. 16, 32, or 64 bits, and treat them as a Bloom filter:
It may also be worth converting these values from a string to an integer, if your database server will end up storing it more efficiently.
I hope Ive adequately demonstrated that it is not only possible to build a system that uses secure encryption while allowing fast queries (with minimal information leakage against very privileged attackers), but that its possible to build such a system simply, out of the components provided by modern cryptography libraries with very little glue.
If youre interested in implementing encrypted database storage into your software, wed love to provide you and your company with our consulting services. Contact ParagonIE if youre interested.
Go here to see the original:
How to Search on Securely Encrypted Database Fields - SitePoint
- WhatsApp overhauling status tab with encrypted Snapchat Stories-like feature - 9 to 5 Mac [Last Updated On: February 21st, 2017] [Originally Added On: February 21st, 2017]
- GOP demands inquiry into EPA use of encrypted messaging apps - CNET [Last Updated On: February 21st, 2017] [Originally Added On: February 21st, 2017]
- Encryption Apps Help White House Staffers Leakand Maybe Break the Law - WIRED [Last Updated On: February 21st, 2017] [Originally Added On: February 21st, 2017]
- World Wide Web Creator Calls for Internet Decentralization & Encryption - The Data Center Journal [Last Updated On: February 21st, 2017] [Originally Added On: February 21st, 2017]
- What It Means to Have an 'Adult' Conversation on Encryption - Pacific Standard [Last Updated On: February 21st, 2017] [Originally Added On: February 21st, 2017]
- Confide in me! Encryption app leaks sensitive info from Washington DC - SC Magazine UK [Last Updated On: February 21st, 2017] [Originally Added On: February 21st, 2017]
- Gmail v7.2 Prepares to Add Support for S/MIME Enhanced Encryption - XDA Developers (blog) [Last Updated On: February 26th, 2017] [Originally Added On: February 26th, 2017]
- Top 6 Data Encryption Solutions - The Merkle [Last Updated On: February 26th, 2017] [Originally Added On: February 26th, 2017]
- Your Guide to the Encryption Debate - Consumer Reports - ConsumerReports.org [Last Updated On: February 26th, 2017] [Originally Added On: February 26th, 2017]
- Google helps put aging SHA-1 encryption out to pasture - Engadget [Last Updated On: February 26th, 2017] [Originally Added On: February 26th, 2017]
- Decipher your Encryption Challenges - Infosecurity Magazine [Last Updated On: February 28th, 2017] [Originally Added On: February 28th, 2017]
- How the Politics of Encryption Affects Government Adoption - Freedom to Tinker [Last Updated On: February 28th, 2017] [Originally Added On: February 28th, 2017]
- How Encryption Makes Your Sensitive Cloud-Based Data an Asset, Not a Liability - Security Intelligence (blog) [Last Updated On: February 28th, 2017] [Originally Added On: February 28th, 2017]
- Set up VMware VM Encryption for hypervisor-level security - TechTarget [Last Updated On: February 28th, 2017] [Originally Added On: February 28th, 2017]
- How The Media Are Using Encryption Tools To Collect Anonymous Tips - NPR [Last Updated On: February 28th, 2017] [Originally Added On: February 28th, 2017]
- Encryption patent that roiled Newegg is dead on appeal | Ars Technica - Ars Technica [Last Updated On: February 28th, 2017] [Originally Added On: February 28th, 2017]
- Research proposes 'full-journey' email encryption - The Stack [Last Updated On: March 1st, 2017] [Originally Added On: March 1st, 2017]
- Database-as-a-service platform introduces encryption-at-rest - BetaNews [Last Updated On: March 1st, 2017] [Originally Added On: March 1st, 2017]
- Encrypted Messaging Service 'Signal' Adds Video Call Option - Top Tech News [Last Updated On: March 2nd, 2017] [Originally Added On: March 2nd, 2017]
- Germany, France lobby hard for terror-busting encryption backdoors ... - The Register [Last Updated On: March 2nd, 2017] [Originally Added On: March 2nd, 2017]
- How to Send Encrypted Nudes, a Guide for the Discerning Lover - Inverse [Last Updated On: March 2nd, 2017] [Originally Added On: March 2nd, 2017]
- Ironclad Encryption Corporation Announces New Ticker Symbol OTCQB: IRNC - Yahoo Finance [Last Updated On: March 2nd, 2017] [Originally Added On: March 2nd, 2017]
- The Best Email Encryption Software of 2017 | Top Ten Reviews [Last Updated On: March 2nd, 2017] [Originally Added On: March 2nd, 2017]
- No, you shouldn't delete Signal or other encrypted apps - TechCrunch [Last Updated On: March 11th, 2017] [Originally Added On: March 11th, 2017]
- Best encryption software: Top 5 - Computer Business Review [Last Updated On: March 11th, 2017] [Originally Added On: March 11th, 2017]
- Encryption Backdoors, Vault 7, and the Jurassic Park Rule of Internet Security - Just Security [Last Updated On: March 11th, 2017] [Originally Added On: March 11th, 2017]
- That Encrypted Chat App the White House Liked? Full of Holes - WIRED [Last Updated On: March 11th, 2017] [Originally Added On: March 11th, 2017]
- What the CIA WikiLeaks Dump Tells Us: Encryption Works - New York Times [Last Updated On: March 11th, 2017] [Originally Added On: March 11th, 2017]
- Snake-Oil Alert Encryption Does Not Prevent Mass-Snooping - Center for Research on Globalization [Last Updated On: March 11th, 2017] [Originally Added On: March 11th, 2017]
- Customer Letter - Apple [Last Updated On: March 11th, 2017] [Originally Added On: March 11th, 2017]
- Don't Let WikiLeaks Scare You Off of Signal and Other Encrypted Chat Apps - WIRED [Last Updated On: March 12th, 2017] [Originally Added On: March 12th, 2017]
- BT to offer customers encryption service for data - Capacity Media (registration) [Last Updated On: March 12th, 2017] [Originally Added On: March 12th, 2017]
- Encryption - technet.microsoft.com [Last Updated On: March 12th, 2017] [Originally Added On: March 12th, 2017]
- Use FileVault to encrypt the startup disk on ... - Apple Support [Last Updated On: March 12th, 2017] [Originally Added On: March 12th, 2017]
- Viber launches secret chats to go beyond encryption - SlashGear [Last Updated On: March 13th, 2017] [Originally Added On: March 13th, 2017]
- Zix wins 5-vendor email encryption shootout - Network World [Last Updated On: March 13th, 2017] [Originally Added On: March 13th, 2017]
- A lesson from the CIA WikiLeaks dump: Encryption works - The Seattle Times [Last Updated On: March 13th, 2017] [Originally Added On: March 13th, 2017]
- What the CIA WikiLeaks Dump Tells Us: Encryption Works - NewsFactor Network [Last Updated On: March 18th, 2017] [Originally Added On: March 18th, 2017]
- Panicked Secret Service Says It Lost Encrypted Laptop But It's Fine, Everything's Fine - Gizmodo [Last Updated On: March 18th, 2017] [Originally Added On: March 18th, 2017]
- Google Cloud adds new customer-supplied encryption key partners ... - ZDNet [Last Updated On: March 18th, 2017] [Originally Added On: March 18th, 2017]
- Preseeding Full Disk Encryption - Linux Journal [Last Updated On: March 18th, 2017] [Originally Added On: March 18th, 2017]
- Bypassing encryption: 'Lawful hacking' is the next frontier of law enforcement technology - Boston Business Journal [Last Updated On: March 18th, 2017] [Originally Added On: March 18th, 2017]
- SecurityBrief NZ - Gemalto introduces on-prem encryption key solution for 'highly regulated' organisations - SecurityBrief NZ [Last Updated On: March 21st, 2017] [Originally Added On: March 21st, 2017]
- 'Always Be Concerned': US Court Slaps Down Fifth Amendment Defense of Encryption - Sputnik International [Last Updated On: March 21st, 2017] [Originally Added On: March 21st, 2017]
- Quantum Key System Uses Unbreakable Light-Based Encryption to Secure Data - Photonics.com [Last Updated On: March 21st, 2017] [Originally Added On: March 21st, 2017]
- Wikileaks Only Told You Half The Story -- Why Encryption Matters More Than Ever - Forbes [Last Updated On: March 21st, 2017] [Originally Added On: March 21st, 2017]
- EPA Sued For Withholding Info On Encrypted Text Messages | The ... - Daily Caller [Last Updated On: March 22nd, 2017] [Originally Added On: March 22nd, 2017]
- Opinion Data encryption efforts ramp up in face of growing security threats - Information Management [Last Updated On: March 22nd, 2017] [Originally Added On: March 22nd, 2017]
- Bypassing encryption: Lawful hacking is the next frontier of law enforcement technology - Salon [Last Updated On: March 22nd, 2017] [Originally Added On: March 22nd, 2017]
- NeuVector Announces Container Visualization, Encryption, and Security Solution for NGINX Plus - DABCC.com [Last Updated On: March 23rd, 2017] [Originally Added On: March 23rd, 2017]
- Is encryption one of the required HIPAA implementation specifications? - TechTarget [Last Updated On: March 23rd, 2017] [Originally Added On: March 23rd, 2017]
- Paper Spells Out Tech, Legal Options for Encryption Workarounds - Threatpost [Last Updated On: March 23rd, 2017] [Originally Added On: March 23rd, 2017]
- Encryption debate needs to be nuanced, says FBI's Comey - TechTarget [Last Updated On: March 25th, 2017] [Originally Added On: March 25th, 2017]
- Comey Renews Debate Over Encryption - 550 KTSA [Last Updated On: March 25th, 2017] [Originally Added On: March 25th, 2017]
- UK minister says encryption on messaging services is unacceptable - Reuters [Last Updated On: March 28th, 2017] [Originally Added On: March 28th, 2017]
- The why and how of encrypting files on your Android smartphone - Phoenix Sun [Last Updated On: March 28th, 2017] [Originally Added On: March 28th, 2017]
- UK targets WhatsApp encryption after London attack - Yahoo News [Last Updated On: March 28th, 2017] [Originally Added On: March 28th, 2017]
- Critical flaw alert! Stop using JSON encryption | InfoWorld - InfoWorld [Last Updated On: March 28th, 2017] [Originally Added On: March 28th, 2017]
- SecureMyEmail is email encryption for everyone - TechRepublic - TechRepublic [Last Updated On: March 28th, 2017] [Originally Added On: March 28th, 2017]
- Apple iOS 10.3 will introduce encryption which makes it MORE difficult for cops and spooks to crack into ISIS nuts ... - The Sun [Last Updated On: March 29th, 2017] [Originally Added On: March 29th, 2017]
- How to Analyze An Encryption Access Proposal - Freedom to Tinker [Last Updated On: March 29th, 2017] [Originally Added On: March 29th, 2017]
- Questions for the FBI on Encryption Mandates - Freedom to Tinker [Last Updated On: March 29th, 2017] [Originally Added On: March 29th, 2017]
- Justice Department anti-terror chief keeps pressing on encryption - Politico (blog) [Last Updated On: March 29th, 2017] [Originally Added On: March 29th, 2017]
- UK government can force encryption removal, but fears losing, experts say - The Guardian [Last Updated On: March 29th, 2017] [Originally Added On: March 29th, 2017]
- Encryption FAQs [Last Updated On: March 29th, 2017] [Originally Added On: March 29th, 2017]
- Why isn't US military email protected by standard encryption tech? - Naked Security [Last Updated On: April 9th, 2017] [Originally Added On: April 9th, 2017]
- How have ARM TrustZone flaws affected Android encryption? - TechTarget [Last Updated On: April 9th, 2017] [Originally Added On: April 9th, 2017]
- Keeping the enterprise secure in the age of mass encryption - Information Age [Last Updated On: April 9th, 2017] [Originally Added On: April 9th, 2017]
- Lack of encryption led to Dallas siren hack - WFAA [Last Updated On: April 12th, 2017] [Originally Added On: April 12th, 2017]
- Internet Society tells G20 nations: The web must be fully encrypted - The Register [Last Updated On: April 12th, 2017] [Originally Added On: April 12th, 2017]
- Make Encryption Ubiquitous, Says Internet Society - Infosecurity ... - Infosecurity Magazine [Last Updated On: April 12th, 2017] [Originally Added On: April 12th, 2017]
- Can we encrypt the web while giving governments a backdoor to snoop? - SC Magazine UK [Last Updated On: April 12th, 2017] [Originally Added On: April 12th, 2017]
- Why we need to encrypt everything - InfoWorld [Last Updated On: April 12th, 2017] [Originally Added On: April 12th, 2017]
- Hacked Dallas sirens get extra encryption to fend off future attacks - Computerworld [Last Updated On: April 12th, 2017] [Originally Added On: April 12th, 2017]
- SHA-1 Encryption Has Been Broken: Now What? - Forbes [Last Updated On: April 14th, 2017] [Originally Added On: April 14th, 2017]
- Hewlett Packard Enterprise touts encryption tool for federal clients - The Hill [Last Updated On: April 14th, 2017] [Originally Added On: April 14th, 2017]
- Encryption on the Rise in Age of Cloud - Infosecurity Magazine - Infosecurity Magazine [Last Updated On: April 14th, 2017] [Originally Added On: April 14th, 2017]
- Lawmaker Pushes Bill That Requires Encryption by Pennsylvania State Employees - Government Technology [Last Updated On: April 14th, 2017] [Originally Added On: April 14th, 2017]
- Disk encryption - Wikipedia [Last Updated On: April 14th, 2017] [Originally Added On: April 14th, 2017]
- The apps to use if you want to keep your messages private - Recode [Last Updated On: April 15th, 2017] [Originally Added On: April 15th, 2017]