How we securely generate sensitive secrets

Secrets are everywhere. Whether it’s the private key that lets you authenticate with an SSH server, the credential that grants you powers in AWS, or the password for your Minecraft account, you need some way to securely generate and manage it.

We’ve talked before about how we store secrets and how we use them to delegate trust and confer sensitive privileges. In this post we’ll cover how we’re making use of AWS Nitro Enclaves to securely and verifiably perform sensitive operations (like generating secrets).

It’s not paranoia if they’re really out to get you…

When we create a secret we need to have trust that no one was able to compromise or manipulate it. In other words, we want:

  • Confidentiality — no one was able to peek at the secret.

  • Integrity — the secret was not changed after being created.

  • Authenticity — the secret was generated using a specific program running in a specific environment with specific privileges.

We should be able to verify that we have these properties despite a lone adversary who has complete knowledge of the entire process and can read and modify all artefacts that are inputs and outputs of the process.

What are some existing approaches?

Suppose we use a password manager to generate and store a password that’s long enough to be completely unguessable. So far so good… but then what? Ordinarily you’d copy and paste it into the service’s website — but this potentially leaks it to a program running on your computer. Also, what if the password manager includes a vulnerable dependency?

An alternative is to use a hardware credential that implements a passkey for authentication in addition to a password. That would mean the secret is generated inside an isolated environment and is used to sign authentication proofs instead of exposing the secret to your computer and sending it over the internet. This also means we need to place less trust in a single security control, which is always a win!

However this approach only works locally, and only with certain kinds of secrets. Suppose we have a cloud service that connects to a third party to exchange sensitive user information. We need a very high degree of trust that the service has exclusive knowledge of its secrets, as otherwise we wouldn't have confidence in the safety of our customers!

How can we achieve this when we can’t even physically access the service to plug something into it? Well, if we had some kind of secure environment which we trust but can also verify, we might be able to use it to generate secrets without exposing them anywhere else. 🤔

A trustworthy environment

AWS Nitro Enclaves are isolated execution environments that are hardened and highly-constrained virtual machines. They have no persistent storage or interactive access, and nothing outside can access anything inside unless it’s explicitly sent out by the enclave. We’ve previously written about how we make use of enclaves to confer sensitive platform privileges.

Crucially though, Enclaves support an attestation feature that can produce a signed fingerprint (cryptographic hash) of the exact environment and code that is running inside it. That means we can verify that the code we're running is precisely the code that we intended to run, and in particular any tampering would show up as a mismatched fingerprint.

However, this all depends on knowing what the right expected fingerprints are for the code that we want to run, and that requires a reproducible build process. The reasoning is simple: if I compile a program then I need to be able to prove that the input code was legitimate and that I didn’t modify anything along the way. To do this, I calculate a fingerprint of the compiled binary and distribute it to others. If someone wants to verify my claims, provided that the build process is deterministic, they can compile the same code and check that their fingerprint is the same as mine.
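As a toy illustration of that check (not our actual tooling; the "enclave.eif" path and PUBLISHED_FINGERPRINT variable are made up), it boils down to hashing the build output and comparing digests:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"
)

// fingerprint returns the hex-encoded SHA-256 digest of a build artefact.
func fingerprint(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	// Hypothetical inputs: the image I built locally vs. the digest someone published.
	mine, err := fingerprint("enclave.eif")
	if err != nil {
		log.Fatal(err)
	}
	published := os.Getenv("PUBLISHED_FINGERPRINT")
	if mine == published {
		fmt.Println("fingerprints match: this artefact is the one that was published")
	} else {
		fmt.Println("fingerprint mismatch: do not trust this artefact")
	}
}
```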

That’s why as part of this work we’ve created an independent, Nix-based, from-source, deterministic build process for Nitro Enclave images, and we’ve open sourced it for everyone to benefit from! You can read more about that work here.

Doing the thing

To actually generate a secret, we start with a Go program that performs the necessary steps; some simple examples would be reading randomness from a cryptographic source, or generating an SSH key pair.
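To give a flavour of the second example (a minimal sketch, not our production program), generating an Ed25519 SSH key pair in Go needs only the standard library plus golang.org/x/crypto/ssh:

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"log"

	"golang.org/x/crypto/ssh"
)

func main() {
	// Generate the key pair from a cryptographic randomness source (crypto/rand).
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		log.Fatal(err)
	}

	// Public half in OpenSSH authorized_keys format; safe to export as-is.
	sshPub, err := ssh.NewPublicKey(pub)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s", ssh.MarshalAuthorizedKey(sshPub))

	// Private half as PKCS#8 PEM. This is the sensitive output that must be
	// protected (e.g. encrypted to a KMS or Vault key) before leaving the enclave.
	der, err := x509.MarshalPKCS8PrivateKey(priv)
	if err != nil {
		log.Fatal(err)
	}
	pemKey := pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: der})
	_ = pemKey // hand off to the export/encryption step
}
```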

Once the program’s code is reviewed by a security engineer and merged into the main branch of our repository, our CI kicks in and deterministically builds an enclave image file containing the compiled code. Another human also builds the image locally and compares the fingerprint of their output with what the CI reported — that’s how we ensure the CI has not maliciously modified the program in any way. 🕵️‍♂️

Both Bob (our human reviewer) and Concourse CI build the same code, so they should agree on the fingerprints of the enclave image. Once the enclave runs and produces some outputs, Bob verifies that his fingerprints match the ones included in the outputs.

We then launch the Nitro Enclave on a dedicated AWS EC2 instance with instructions to run the image. It begins by requesting a signed attestation document from AWS which contains PCR hashes representing the exact environment and code the enclave is running.
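Inside the enclave that request goes to the Nitro Security Module (NSM) device. Here's a minimal sketch assuming the community github.com/hf/nsm Go library, which is one common way of talking to the NSM and not necessarily what we use:

```go
package enclave

import (
	"fmt"

	"github.com/hf/nsm"
	"github.com/hf/nsm/request"
)

// requestAttestation asks the Nitro Security Module for a signed attestation
// document, optionally binding a nonce, user data and a public key into it.
func requestAttestation(nonce, userData, publicKey []byte) ([]byte, error) {
	sess, err := nsm.OpenDefaultSession()
	if err != nil {
		return nil, err
	}
	defer sess.Close()

	res, err := sess.Send(&request.Attestation{
		Nonce:     nonce,
		UserData:  userData,
		PublicKey: publicKey,
	})
	if err != nil {
		return nil, err
	}
	if res.Attestation == nil || res.Attestation.Document == nil {
		return nil, fmt.Errorf("NSM returned no attestation document")
	}

	// The document is CBOR-encoded and signed by the Nitro hypervisor, and
	// embeds the PCR measurements described below.
	return res.Attestation.Document, nil
}
```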

More specifically, we care about PCR0, PCR1, and PCR2 (fingerprints of the enclave image, its kernel and boot process, and the application code respectively), and also PCR3, which is a measurement of the IAM role attached to the EC2 instance that's running the enclave. When we later validate the attestation document, we check that these PCRs match what we expect, and if they're not the same we know something fishy is going on 🧐. You can read more about how PCRs are calculated in this blog post by Trail of Bits.
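To make that check concrete, here's a rough sketch of comparing PCRs once the attestation document has been parsed; the expected values and the shape of the parsed input are placeholders, not real measurements:

```go
package verify

import (
	"crypto/subtle"
	"fmt"
)

// expectedPCRs holds the hex-encoded measurements we computed from our own
// reproducible build (PCR0-2) and from the IAM role we launched with (PCR3).
// The values here are placeholders, not real measurements.
var expectedPCRs = map[int]string{
	0: "aaaa...", // enclave image
	1: "bbbb...", // kernel / boot process
	2: "cccc...", // application code
	3: "dddd...", // IAM role of the parent EC2 instance
}

// checkPCRs compares the PCRs extracted from a verified attestation document
// (hex-encoded, however you choose to parse them) against our expectations.
func checkPCRs(attested map[int]string) error {
	for idx, want := range expectedPCRs {
		got, ok := attested[idx]
		if !ok {
			return fmt.Errorf("attestation is missing PCR%d", idx)
		}
		if subtle.ConstantTimeCompare([]byte(got), []byte(want)) != 1 {
			return fmt.Errorf("PCR%d mismatch: something fishy is going on", idx)
		}
	}
	return nil
}
```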

The enclave also has exclusive ownership of an AWS KMS key pair which it uses to decrypt any sensitive inputs that it might need while running, for example credentials or long-term identity keys. When we launch the enclave we configure this key to only allow decryption if the request contains a signed attestation document with specific PCR values. This ensures that only the program we wrote is allowed to decrypt sensitive secrets, and acts as a defence-in-depth control against unauthorised enclaves: any workload that wants access to the key would first need to update the key policy to grant itself permissions. Updating a key policy is a loud action that would trigger alerts and invite scrutiny. 🚨
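For illustration, KMS exposes this through its kms:RecipientAttestation condition keys; a key policy statement along these lines (placeholder account, role, and PCR value) only permits decryption when the attached attestation document carries the expected measurement:

```json
{
  "Sid": "AllowDecryptOnlyFromOurEnclave",
  "Effect": "Allow",
  "Principal": { "AWS": "arn:aws:iam::123456789012:role/enclave-parent-role" },
  "Action": "kms:Decrypt",
  "Resource": "*",
  "Condition": {
    "StringEqualsIgnoreCase": {
      "kms:RecipientAttestation:PCR0": "<expected PCR0 from the reproducible build>"
    }
  }
}
```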

Finally the enclave runs through the steps in the program and then prepares any outputs for export. Sensitive outputs like private keys need to be protected before being exported — this is accomplished by encrypting them using a public key (for example one belonging to KMS or Vault) or writing them directly to the target service over TLS with a custom trust store.
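As a sketch of the first option (sealing to a recipient's RSA public key; none of this is specific to our setup), the sensitive output could be encrypted with RSA-OAEP before it ever leaves the enclave:

```go
package enclave

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"crypto/x509"
	"encoding/pem"
	"fmt"
)

// sealOutput encrypts a sensitive output to a recipient public key (e.g. a
// KMS or Vault wrapping key exported as PEM) so only the recipient can read it.
func sealOutput(recipientPEM []byte, secret []byte) ([]byte, error) {
	block, _ := pem.Decode(recipientPEM)
	if block == nil {
		return nil, fmt.Errorf("no PEM block found in recipient key")
	}
	pub, err := x509.ParsePKIXPublicKey(block.Bytes)
	if err != nil {
		return nil, err
	}
	rsaPub, ok := pub.(*rsa.PublicKey)
	if !ok {
		return nil, fmt.Errorf("recipient key is not RSA")
	}

	// OAEP with SHA-256; for outputs larger than the RSA modulus you would
	// wrap a symmetric data key instead of the payload itself.
	return rsa.EncryptOAEP(sha256.New(), rand.Reader, rsaPub, secret, nil)
}
```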

Verifying the outputs

Once the enclave has its final set of outputs, it needs to sign them so that we’re able to validate that they were in fact created how we intended. To do this, we generate an ephemeral key pair and request a Nitro attestation document containing our public key from AWS, and then use the private key to create a signature. Since this key pair is lost forever once the enclave terminates, there’s no way for anyone else to re-use it to sign something else.
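Sketched in Go, and reusing the hypothetical requestAttestation helper from the earlier snippet, that flow might look roughly like this:

```go
package enclave

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/x509"
)

// signOutputs generates a one-shot Ed25519 key pair, binds its public key into
// a fresh attestation document (via the hypothetical requestAttestation helper
// sketched earlier), and signs the serialised outputs with the private key.
// The private key only ever exists in enclave memory and dies with the enclave.
func signOutputs(outputs []byte) (attestationDoc, signature []byte, err error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}

	// DER-encode the ephemeral public key so verifiers can tie the signature
	// back to this exact enclave run via the attestation document.
	pubDER, err := x509.MarshalPKIXPublicKey(pub)
	if err != nil {
		return nil, nil, err
	}
	attestationDoc, err = requestAttestation(nil, nil, pubDER)
	if err != nil {
		return nil, nil, err
	}

	signature = ed25519.Sign(priv, outputs)
	return attestationDoc, signature, nil
}
```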

Chain of trust that needs to be followed to validate the authenticity of outputs.

To validate the outputs, we traverse this chain of trust and verify that each arrow is a valid signature, until we reach the AWS root certificate authority which we can verify using a local copy of the certificate. If any of the outputs were changed after they were generated, or if they were generated by a different enclave image, or even if they were created using the same image but using a different AWS IAM role, this verification would fail because either a signature wouldn’t be valid or the PCR values in the attestation would be different to the ones we expect.
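For the certificate part of that chain, a sketch using Go's crypto/x509 could look like the following; the DER inputs are assumed to come from the parsed attestation document and from a pinned local copy of the AWS root certificate, and verifying the signature over the document itself is a separate step we omit here:

```go
package verify

import (
	"crypto/x509"
	"fmt"
)

// verifyCertChain checks that the leaf certificate from the attestation
// document chains up to our pinned copy of the AWS root certificate authority.
func verifyCertChain(leafDER []byte, intermediatesDER [][]byte, awsRootDER []byte) error {
	leaf, err := x509.ParseCertificate(leafDER)
	if err != nil {
		return fmt.Errorf("parsing leaf: %w", err)
	}

	roots := x509.NewCertPool()
	root, err := x509.ParseCertificate(awsRootDER)
	if err != nil {
		return fmt.Errorf("parsing pinned root: %w", err)
	}
	roots.AddCert(root)

	intermediates := x509.NewCertPool()
	for _, der := range intermediatesDER {
		cert, err := x509.ParseCertificate(der)
		if err != nil {
			return fmt.Errorf("parsing intermediate: %w", err)
		}
		intermediates.AddCert(cert)
	}

	// Allow any key usage: we are validating an attestation chain, not a TLS server.
	_, err = leaf.Verify(x509.VerifyOptions{
		Roots:         roots,
		Intermediates: intermediates,
		KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err
}
```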

A CLI being run that fetches and validates the PCR outputs and the entire chain of trust

Retrieving the generated artefacts is as simple as running a single command which will automatically verify the entire chain of trust and validate all the PCR values. At the moment engineers have to manually enter some of these PCRs, but eventually this’ll be replaced with another local build. 🔧

Putting it all together

We’re now using this system in production to generate high security secrets that confer sensitive privileges to various components within our platform.

This has turned what was a very manual process, involving at least 3 person-hours of meticulously working through many tedious steps, into something that takes a single person about 5 minutes, and in the process we've replaced process-driven assurances with cryptographic ones.

We intend to expand the kinds of sensitive operations we perform with this setup, so stay tuned for future updates! 👀


If this interests you, we are looking for Backend Engineers and a Director of Engineering for our Security Collective.