I am looking for a way to safely store personal information with low entropy
Question
I am looking for a way to safely store low-entropy personal information.
I have the following requirements for the data:
- It must be possible to search the data (i.e. to look up an existing value) without being able to view it
- Other systems must be able to recover the real value
- The system must perform reasonably well (operations should take seconds, not hours)
I think encrypting the data with a public key is my best option. I can keep the private key offline so that individual values cannot be recovered directly. However, because the data has low entropy, I think an attacker could use the encryption process as an oracle: encrypt every possible value and compare the results against the stored ciphertexts.
Any ideas on how to improve the security of this system? Not collecting this data is not an option. There will be additional layers around this data (access control, logging, physical security, etc.), so I am focusing only on this part of the system.
Explanation / Answer
What you're looking for is deterministic encryption: encryption in which the same value, encrypted twice under the same key, gives the same output. With deterministic encryption under a secret key K, an attacker would need the key to work out which plaintext (an SSN, for example) maps to which encrypted value. You can still search the deterministically encrypted data, but only with equality comparisons (==, !=).
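To make the idea concrete, here is a minimal sketch of the property described above: the same value under the same key always gives the same token, the token can be matched with ==, and only a key holder can turn it back into the value. The cipher is a toy 4-round Feistel network built on HMAC-SHA256 from the standard library. That construction, the 16-byte block width, and the names det_encrypt and det_decrypt are all assumptions made for illustration, not a recommendation; a real system should use a vetted deterministic scheme from an established library.

```python
import hashlib
import hmac

BLOCK = 16   # fixed block width in bytes (an assumption of this sketch)
ROUNDS = 4   # number of Feistel rounds (illustrative only)


def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Keyed round function: HMAC-SHA256 truncated to the half-block size."""
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:len(half)]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def det_encrypt(key: bytes, value: str) -> bytes:
    """Deterministically encrypt a short string into a 16-byte token."""
    data = value.encode("utf-8")
    if len(data) > BLOCK or b"\0" in data:
        raise ValueError("value must be at most 16 bytes and contain no NUL")
    data = data.ljust(BLOCK, b"\0")          # zero-pad to the block width
    left, right = data[:BLOCK // 2], data[BLOCK // 2:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def det_decrypt(key: bytes, token: bytes) -> str:
    """Invert det_encrypt; requires the same key."""
    left, right = token[:BLOCK // 2], token[BLOCK // 2:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, i, left)), left
    return (left + right).rstrip(b"\0").decode("utf-8")


key = b"k" * 32                               # hypothetical secret key
stored = {det_encrypt(key, "123-45-6789")}    # what the database would hold

# Equality search: encrypt the query and compare tokens.
found = det_encrypt(key, "123-45-6789") in stored
missing = det_encrypt(key, "987-65-4321") in stored
recovered = det_decrypt(key, next(iter(stored)))
```

Here found is true, missing is false, and recovered is the original string. Without the key, the stored token cannot be matched against guesses, which is the difference from the public-key approach in the question.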
Examples of deterministic crypto that would work:
What won't work:
Note that, in all cases, you are giving up ciphertext indistinguishability, but that's a core requirement of being able to search on the ciphertexts.
You do need a mechanism to share the key with the other systems that need access to the plaintext. However, an attacker who obtains a database backup, exploits a SQL injection flaw, or mounts any other attack that gives access only to the database won't be able to discern the plaintexts.
Public-key encryption is not useful here, as you point out. If the public-key cryptosystem is deterministic (plain, unpadded RSA, for example), anyone holding the public key can encrypt every candidate value and compare the results with the stored ciphertexts, which recovers the plaintexts. If it is non-deterministic (padded RSA), you cannot search on the ciphertexts.
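The enumeration attack on a deterministic public-key scheme can be sketched in a few lines. This is only an illustration under stated assumptions: the modulus is a toy product of two Mersenne primes (far too small for real use), the candidate space is shrunk to four-digit values so the loop finishes instantly, and the function names are made up for this example. The point is that the attacker never touches the private key.

```python
# Toy textbook-RSA public key. N is the product of two known Mersenne primes;
# it is far too small for real use and is chosen only for illustration.
N = (2**31 - 1) * (2**61 - 1)
E = 65537


def textbook_rsa_encrypt(m: int) -> int:
    """Unpadded (deterministic) RSA encryption with the public key only."""
    return pow(m, E, N)


def enumerate_plaintext(ciphertext: int, candidates):
    """Encrypt every candidate and compare; no private key is needed."""
    for guess in candidates:
        if textbook_rsa_encrypt(guess) == ciphertext:
            return guess
    return None


# The database stores only the ciphertext of a low-entropy value.
stored = textbook_rsa_encrypt(4821)

# An attacker with the public key walks the whole candidate space.
recovered = enumerate_plaintext(stored, range(10_000))
```

After this runs, recovered equals 4821. A nine-digit identifier has roughly 10^9 candidates, which is more work but still well within reach of the same loop.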