Using MySQL's Built-in Encryption: A Terrible Idea
6 Reasons To Avoid MySQL’s Native Encryption Functions
This past week we had a conversation with an organization who was considering doing application-layer encryption with MySQL using MySQL’s native AES_ENCRYPT
and AES_DECRYPT
functionality and they asked for our opinion.
The first question to look at is whether this is really application layer encryption (ALE). The idea of ALE is to encrypt before storing with the idea that someone gaining access to the data store can only see garbage data. With infrastructure-layer encryption, the low-level storage is protected if a hard drive is stolen, but anyone able to query the data store gets back unencrypted data. This approach with MySQL involves sending unencrypted data and keys to the database where the database can encrypt or decrypt the data, but doesn’t store the keys. So if someone who doesn’t have access to the keys gets access to MySQL, they have an ALE-like experience where the data is garbage.
So is this ALE? Not really, but it does largely accomplish the same thing and it may be sufficient for the security use cases that drive adoption of ALE. I’d call it an often-acceptable approximation of ALE.
In this specific case, whether it’s ALE or not is irrelevant because MySQL’s encryption capabilities are garbage and using them is unwise for most or likely all use cases. Here’s why:
1. Default modes and key sizes are insecure
By default, MySQL uses AES in ECB mode with a 128-bit key.
1.a. ECB mode is literally the worst
ECB mode is basically the worst possible choice for AES for any possible use case. No cryptographer would advise using it in 2024 and 100% of cryptographers would warn against it because it’s susceptible to so many attacks.
ECB is block level and has no IV or other random element. The same key used to encrypt the same block produces the same result every time.
This is per-block deterministic encryption implemented in an insecure way. If you want to do deterministic encryption, there are better ways, such as AES-SIV, which uses synthetic IVs and defends against numerous attacks.
1.b. Other available modes are also bad
“Ah!” You might say. “We can just change some settings and then MySQL encryption will be good!”
Not so fast. First, you should be suspicious of any tech stack that defaults to bad security. And if you look at MySQL’s other options, they aren’t a lot better. They offer CFB1/CFB8/CFB128 which is so outdated and unacceptable as to be laughable. They offer OFB, which is patented in the U.S. and thoroughly entangled. And they offer CBC, which is probably the best choice of what they offer, but is unauthenticated and still crap compared to AES-CTR or AES-GCM, which is the authenticated version of AES-CTR that prevents attackers from tampering with the ciphertext.
But MySQL doesn’t explain the details of how it encrypts. It doesn’t say what padding it uses, for example, which makes the security of the data questionable and the portability of the data (your ability to decrypt it outside of MySQL) difficult. We’ve seen Internet threads where people tried to reverse engineer ways to get at the encrypted data and lose days to the effort, sometimes failing completely.
For this reason alone, we would never use MySQL encryption.
1.c. 128-bit key
Quantum computers are here, though they aren’t yet powerful enough to break modern cryptography. It’s likely that in another decade – maybe more or less – someone will use a quantum computer to break current public key cryptography (RSA and Elliptic Curve).
Quantum computers, to the best of our knowledge, won’t break symmetric cryptography, but they do weaken it. Grover’s algorithm changes the complexity of the attacks against symmetric ciphers from O(n)
to O(sqrt(n))
, which means the size of key needed to be theoretically safe in the face of future quantum computers is 256-bits.
Here’s a talk we gave for Gartner on this topic, which puts these problems into perspective explaining expected time for quantum computers to become sufficiently powerful and how powerful they need to be to break modern cryptography. It was very well received: Post-Quantum Cryptography Explained
The good news here is that although the default key size is a poor choice, you can fix the problem in the MySQL configuration.
2. Inability to evolve and change algorithms and key sizes once in use
But what happens if you start using MySQL’s encryption and then you only later realize you need to use a different mode or key size? Or if next year the recommended minimums change?
In MySQL, there’s no mechanism for changing key sizes or algorithms after you’ve already started encrypting. You can’t call the AES_ENCRYPT
function and specify the algorithm to use because that choice is a system environment option, which means you can’t decrypt and then re-encrypt in a single statement.
If you can’t decrypt outside of MySQL (see above note on why this is non-trivial), then you have to decrypt the data in the database, store it decrypted, change the global configuration, then encrypt the data again. And that’s super bad. Perhaps you could hack in a TEMP table, or set up a parallel instance of the database and read from one and write to the other? Yuck. It’s all duct tape and bubble gum from here, folks.
The fact is, recommendations on what algorithms to use and what key sizes to use in cryptography evolve over time. In the next decade, because of the threat of quantum computers, they’ll likely evolve and change much more rapidly than in the previous two decades. It’s critically important that any cryptography being used is set up to be crypto-agile so the system can change and evolve as standards and norms do. See our blog post on the importance of crypto-agility to get more of a grasp on what this means.
3. Secrets get logged to disk in plain text
MySQL query logging is a thing. It’s common to automatically log slow performing queries, for example. But logging in MySQL when using MySQL’s encryption capabilities means you’ll leak sensitive data and keys to disk.
The MySQL encryption documentation page advises users to “encrypt sensitive values on the client side before sending them to the server” specifically to avoid the logging of plain text values and keys.
Betting that your operations team and developers will never turn on query logging is doomed to fail. At some point, logs will be needed to debug a problem and then all data security will be undermined. And this is a common problem even at big companies with careful procedures. For example, plain passwords submitted on login have repeatedly made it into production logs at companies like Twitter, GitHub, and Facebook.
4. Hurts database performance and scalability
MySQL is not a horizontally scaling database. You can shard, but that can’t always solve your problem. Over time you have to throw more and more hardware (especially memory, but also CPU) at the database to keep your queries from tipping it over.
Adding MySQL-side encryption to the mix will make this problem much, much worse.
First, you’re using your database processing time for doing the encryption. MySQL doesn’t support encryption that’s hardware accelerated (Intel chips have native acceleration for AES-GCM built-in, for example, but there’s no acceleration for any of the modes offered by MySQL) so it will be all CPU.
Second, queries will be slower with encrypt/decrypt calls than without them, which can have a knock-on effect causing stack-ups of connection requests and timeouts. Once you’ve gone down the path of encrypting inside MySQL and start to have problems, backing out will be an enormous pain.
Third, a single ill-conceived query can hose everything. The main attraction to doing encryption inside MySQL is the idea that you can still use the power of SQL over the encrypted data (sort of). It looks attractive to be able to do a LIKE
query over encrypted data – that’s ALE and it’s still usable, right? The query might look like this:
SELECT CAST(AES_DECRYPT(mycolumn, ‘key’) AS CHAR(50)) mycolumn_decrypted FROM mytable HAVING mycolumn_decrypted LIKE ‘%needle%‘;
For this query to work, you need column level encryption (one key for all the data in the column) and shared IVs (if you’re in a non-ECB mode and even have IVs). Reusing IVs, which you’re now forced to do, is a very bad idea so the security of your encryption is not great now.
But on top of that, you have to consider the performance here. You won’t be able to use an index (see next section), so it will be doing a full table scan every time, and each encrypted field in the selected rows must be separately decrypted sequentially in a single thread. If your table has many rows, this is going to take a while.
And if you have many rows, this can kill your database performance.
So engineers might like that in theory they can still do this kind of query, but allowing this sort of query is actually bad. It’s probably better not to even be tempted by the possibility.
As much as doing things the way you always have seems desirable because it’s familiar and convenient, that doesn’t make it wise. There are other ways to accomplish the same things that are far more performant over ALE data. With the IronCore Labs platform, for example, you can parallelize decryption and it will use hardware acceleration so that you can quickly rip through lots of data without tying up database resources.
5. Indexing removes security and leaves security theater
MySQL does not index encrypted fields, but that doesn’t mean enterprising developers can’t find a way. Technically, you can use a function key to index the unencrypted data, which could potentially unlock a lot of queries and speed them up. But doing this would completely undermine the encryption making the entire exercise pointless.
Here’s what’s happening if you use a function key to index the plaintext: the unencrypted values are getting stored. If an attacker can get direct access to the index, they get access to the values. But even if the attacker only has access to queries, they still can extract the values. Consider the following queries where mycolumn
is encrypted:
SELECT mycolumn FROM mytable WHERE mycolumn LIKE 'a%';
SELECT mycolumn FROM mytable WHERE mycolumn LIKE 'b%';
-- ...
-- it doesn't matter that the returned value is encrypted; the
-- attacker now knows which columns start with which letters
SELECT mycolumn FROM mytable WHERE mycolumn LIKE 'aa%';
SELECT mycolumn FROM mytable WHERE mycolumn LIKE 'ab%';
-- ...
-- and now the attacker has extracted the first two letters
-- no keys required to reverse entire values and entire columns
Without a key or any call to decrypt, the attacker can extract values out of the encrypted column of the database.
6. Unsafe for statement-based replication
There are different ways to replicate MySQL databases, but if you use statement-based replication, you cannot safely use the MySQL AES_ENCRYPT
and AES_DECRYPT
modes as the sensitive data gets insecurely propagated.
MySQL says as much in a throwaway line buried in their documentation: “Statements that use AES_ENCRYPT() or AES_DECRYPT() are unsafe for statement-based replication”.
Why we’re publishing this and the bottom line
If you want to use application-layer encryption with MySQL, the idea of being able to do arbitrary queries such as LIKE
queries over the encrypted data may seem alluring, but in this case, you are left with bad and worse trade-offs.
I hesitate to ever tear down other encryption products even if they could be better or when there are much better options. That’s because we in security, and especially those of us in cryptography, have a habit of highlighting imperfections to the point where outsiders can’t tell the difference between “imperfect” and “insecure”.
In general, I believe it’s better if people encrypt their data than if they don’t. The idea that someone would leave their data completely unprotected in preference to keeping it somewhat protected is irksome to me. To get from the state of the industry today, where very little data is encrypted above the infrastructure / disk layers, to a place where sensitive data is commonly encrypted at higher levels, we’re going to have to take incremental steps.
Yet I haven’t gone gentle on MySQL’s offering. MySQL is a wonderful product and it feels bad to shoot arrows at it. But no one should use this encryption functionality in MySQL. It’s not just a flawed implementation with myriad pitfalls, it’s also a generally bad idea across the board.
Using these functions sets your application up for failure as you gain success and your database grows and as time and needs evolve. It’s a tar trap that will not be easy to escape and it will lead to overspending on database hardware while under-delivering severely on the promise of data security.
A good platform for ALE like IronCore’s SaaS Shield handles all of these issues, keeps you crypto-agile, and gives ways to use the data while it’s encrypted. It also uses per-row encryption, handles the keys, and makes sure IVs are never reused. Platforms like ours handle the performance, security, scalability, usability, and other concerns of real data protection inside cloud applications. MySQL’s native encryption is still a little bit better than nothing, but unless you really don’t have any other choice, it is best avoided. And you do have choices.