Misadventures in TDE: How to Restore an Encrypted SQL Server Database Completely Wrong
Doing things by the book isn’t necessarily the only way to learn a new skill. Sometimes, experiencing everything that can possibly go wrong when you don’t follow the book can leave a more lasting impression.
I had that misfortune in December, when I tried to port some of my SQL Server 2008 R2 databases over to SQL Server 2012 Release Candidate 0. Long before that, I had encrypted one of them using Transparent Database Encryption (TDE), which is normally effective and easy to use – if you pay attention to the directions in Books Online. For brevity’s sake, I will leave out most of the details on how to backup and restore a TDE-encrypted database, because there are plenty of tutorials in Books Online and on the Web on that subject. Check those sources if you want the usual info on how restore Service Master keys and all that. What I will present here is a short list of things to avoid when backing up and restoring a TDE database, which I can verify from my own experience. On the day all of these mistakes occurred, I may have invented some new swear words, and hope that my cautionary tale will help other DBAs avoid inventing their own. I might as well coin another phrase though to describe this type of article: it’s a mistutorial, for lack of a better word. Have you ever heard the expression, “Don’t try this at home?” Well, I did, and it’s a good thing that I didn’t try this on a production system in an important institution, otherwise I would be in an institution of a completely different kind now. That brings me to my first lesson:
Mistake #1: Once you’ve encrypted any databases, don’t get out of practice with TDE. None of the following would have happened if I had not memorized everything pertaining to TDE when it first came out, only to forget half of it in the following years. When I suddenly needed to recall it all in a hurry, I couldn’t, because my brief fling as a TDE guru ended back when R2 first became available and I encrypted this practice database for the first time. In fact, I may have actually encrypted it back when I got my developer edition of the first release of SQL Server 2008. At the time, I practiced backing up and restoring TDE certificates until I was blue in the face, but my lack of recall of exactly when that was shows just how lax I had become. The earlier practice and book knowledge that went along with it did come in handy, but I shouldn’t have let my skills become rusty in this particular area. Instant recall of the arcane inner workings of MERGE clauses, input parameters of undocumented DBCCs and the significance of DDL_SERVER_LEVEL_EVENTS is all very useful, like a lot of other SQL Server knowledge, but TDE amnesia can lead to sudden disaster. It is a thing that must be mastered if you’re going to use it. It’s like Mr. Miyagi when he explained to Daniel-San: “Karate you do yes, or karate you do no – your karate so-so, and like bug in middle of road, you get run over.” In this case, it took me several hours to restore a practice database to a Release Candidate server on my home computer. If the same mistake had occurred in a production environment with critical data, I would have been squished like Miyagi’s bug.
Imagine, for example, that you have to immediately restore your most important database, the one your organization depends on the most. Or perhaps you have to transfer it to another server in a hurry. You have easy access to multiple backups of your TDE certificates and assume that the transfer process will be equivalent to the time it takes you to type out the passwords for your .pvk file. Then you get the following mysterious error:
Mistake #2: When trying to attach a TDE-encrypted database, you get an error like this: Cannot find server certificate with thumbprint ’0xBlahBlahAHexadecimalCodeAbout40DigitsLong′.
This error stumped me for hours. I checked sys.certificates repeatedly and found just one TDE certificate associated with the database, which was operating just fine on my old R2 server. If the right certificate wasn’t available, how could I access the database at all? Naturally, I assumed that I had accidentally restored the wrong backup, possibly of a different certificate I had practiced on many moons before – since it had been so long since the original encryption, I couldn’t recall if I had applied and removed TDE before for the sake of rehearsal or something. None of my backups worked though, so I simply exported the TDE certificate from the R2 server again, thanking the stars above that I hadn’t had a major crash during the many months when I didn’t have a proper certificate backup.
To my astonishment, restoring the new backup of a certificate that was obviously working on the R2 server gave me the same thumbprint error on the new RC 0 server. I double-checked to make sure the thumbprint for the new backup matched the thumbprint column in sys.certificates and it was identical. Apparently, the same certificate worked on one server but not the other. Since the second server was merely a release candidate I thought maybe there was some unresolved bug in 2012, but instinct told me something was rotten in Denmark. I had made a mistake somewhere.
It turned out to be a simple but time-consuming error on my part, which would have never happened without the laxness in Mistake #1. I had been checking sys.certificates in the encrypted database, not in Master, where the operative certificate resided; all of this time, the database had been running smoothly on the R2 server because it was using a different certificate than the one I was exporting. So if you receive a thumbprint error like this, first make sure that you’re not adding the certificate from the wrong backup. If you export a new backup and the thumbprint error persists, make sure you’re exporting the one listed in sys.certificates when Master is the active database and not when the TDE-encrypted database is active. Of course, neither of these would be a problem if you avoid the following:
Mistake #3: Make sure you export the right certificate in the first place. Also clean any unnecessary certificates and their backups from your database and your backup folders.
The certificate listed in sys.certificates when the TDE-encrypted database was in use no longer had a purpose. It must have served a purpose at some point more than a year and a half prior, but I hadn’t discarded it, which led to future confusion. To make matters worse, I had multiple backups of that certificate in my backup folders, which led me to waste more time trying to restore the wrong ones to the new server. Furthermore, the presence of these extra backups of the wrong certificate gave me a sense of false security; I assumed my database was protected in the event of a crash, but it wasn’t. Even if the right certificate was available, I wouldn’t have been able to find it in a hurry anyways, thanks to this old clutter.
Mistake #4: You receive an error like this when backing up a certificate: “No decryption password should be provided because the private key of this certificate is encrypted by a master key”.
This error added a few extra minutes to the fiasco. I have repressed most of my memories of the event, but I think this occurred when I had found the right certificate but could not export it using the same code as I had used for backing up the wrong certificate. The solution was to check the pvt_key_encryption_type_desc column in sys.certificates. If the value for the correct certificate is ENCRYPTED_BY_MASTER_KEY, then you have to export it with a private key. The wrong certificate had a value of NO_PRIVATE_KEY, so I was able to successfully back that one up without a private key repeatedly that night. It was the wrong method to use on the right key though.
Mistake #5: When you use issue a CREATE CERTIFICATE command on the second server, you get the following message: “An error occurred during the generation of the certificate.”
I also experienced this error a couple of times. If I remember correctly, I received it for two distinctly different reasons: first I typed the file names in the command incorrectly and secondly, I used the same file for both the certificate and the private key. The latter part I am unsure about since it happened almost two months ago and I am trying to forget the episode, so as to avoid a fit of post traumatic stress or something – but if fixing the file names doesn’t work for you, check to see if the certificate and private key files are supposed to be different.
Mistake #6: “The certificate, asymmetric key, or private key file is not valid or does not exist; or you do not have permissions for it”
This error likewise has an easy fix, once you know to look for it. I lost more time tracking down the cause, however, which I learned from numerous posts on the Internet from other SQL Server users who had experienced the same problem. Here’s a quick fix for the issue: temporarily change the file permissions so that all Users can access the certificate backup files. You may want to change them back after the restore or attach operation is complete, however, to prevent unauthorized access by other members of the Users group. Unfortunately, I had so many useless backups of the wrong certificates that I lost more time applying these permissions to the wrong files.
I may have experienced more errors than this; these are the only ones I can recall from my notes though. Now that I have succeeded in forgetting the rest of the episode, the orderlies here are finally letting me remove my strait jacket (which comes in handy when typing blog posts like this). The men in white coats tell me I will recover – unlike the guy in the padded cell next to me, who had the same thing happen to a production database. They’ve prescribed a few minutes of practice with TDE on a regular basis, in the same way I would brush my teeth or take the garbage out regularly. If you’re going to use TDE, you need to be able to do it in your sleep – otherwise, some day it’s guaranteed to keep you awake, once disaster strikes and you discover that you can’t recover your encrypted databases as fast as you assumed.
Stay tuned for next month’s batch of mistutorials, beginning with Thank God I Chose SQL Server part I: The Tribulations of a DB2 Trial, followed by Thank God I Chose SQL Server part II: How to Improperly Install Oracle 11g.