Data Security with AWS Key Management Service – Part I

When using cloud technologies, security and cost are two of the most important factors for every organization. To help users secure their data, cloud providers offer several services, and those that manage keys for encryption, digital signing and verification are an important part of them, e.g. AWS Key Management Service, GCP Cloud Key Management, and Azure Key Vault.

AWS KMS, released in 2014, was an early entrant among such services. It has a few interesting patterns and makes a good case study for understanding data protection in the cloud. In this article, the word “security” refers to protecting data and keys through symmetric key encryption.

Before we proceed, let us have an understanding of two very important phrases from a data security perspective.

  • Securing Data at Rest – Ensuring that data saved on disks or any other storage device is encrypted, so that no plaintext copy of the data is persisted on the storage medium. The stored ciphertext can only be read using a valid key.
  • Securing Data in Transit – Ensuring that the network channel between source and destination is secured and the payload is encrypted. It can only be read at the destination with a valid key or certificate.

Key Types in AWS KMS

1. Customer Master Key (CMK) – The building block and most important component; AWS defines it as a logical representation of a master key. The CMK contains metadata such as the key ID, creation date, description and key state. The underlying key material represented by a CMK is never shared or revealed, and it never leaves KMS unencrypted. Only when a user imports their own key material into a CMK do they know the key; after the import it remains in AWS and is not returned in plaintext to anyone (not even the user who imported it). A CMK can encrypt only small payloads directly: up to 4 KB (4096 bytes).

A CMK is created either by a user or by AWS (on behalf of AWS services). A customer managed CMK is created, owned and managed by the user. An AWS managed CMK, on the other hand, is created and managed by an AWS service on your behalf, e.g. S3 uses the CMK with the alias “aws/s3” to encrypt data if you enable default encryption with SSE-KMS (server-side encryption with KMS) without specifying a key of your own. SSE-S3, by contrast, uses keys managed entirely by S3. A CMK has two primary uses: generating data keys and encrypting/decrypting data keys.

2. Data Key – The key used to encrypt the data itself (of any size). After creation, this key is returned to the user but isn’t stored by AWS. Most importantly, KMS doesn’t perform any cryptographic operations with the data key. It is the sole responsibility of developers to encrypt the plaintext and manage the data key used for encryption.

If the data key is lost, the encrypted text can’t be recovered. If the data key is compromised, it doesn’t mean the CMK is also compromised. The CMK remains secure irrespective of a data key breach, because access to it is controlled by IAM policies and its use is audited through CloudTrail. AWS doesn’t track or control access to the data key.
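The 4 KB limit on direct CMK encryption can be sketched as follows. This is a minimal illustration, assuming Python with a boto3 KMS client created elsewhere (for example `boto3.client("kms")`) and passed in as `kms`. The helper names and the `alias/my-app-key` alias are hypothetical; `encrypt` and `decrypt` are the KMS API operations.

```python
# Sketch only: the helper names below are illustrative, not part of any AWS library.
KMS_DIRECT_LIMIT = 4096  # bytes that a CMK can encrypt directly


def encrypt_small_secret(kms, key_id: str, plaintext: bytes) -> bytes:
    """Encrypt a small payload directly under the CMK via the KMS Encrypt API."""
    if len(plaintext) > KMS_DIRECT_LIMIT:
        # Larger payloads need a data key and local encryption instead.
        raise ValueError("payload exceeds 4 KB; use a data key instead")
    response = kms.encrypt(KeyId=key_id, Plaintext=plaintext)
    return response["CiphertextBlob"]


def decrypt_small_secret(kms, ciphertext: bytes) -> bytes:
    """Ask KMS to decrypt; for a symmetric CMK the key is identified from the ciphertext."""
    return kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
```

With real credentials, a call might look like `encrypt_small_secret(boto3.client("kms"), "alias/my-app-key", b"db-password")`. The key material never appears in this code; only the plaintext and the ciphertext cross the API boundary.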

Each of the above keys is further classified as symmetric or asymmetric. The following table compares a few differences between the two. Please visit this AWS doc for a detailed comparison.

| Key type | Symmetric (single 256-bit key) | Asymmetric (public/private key pair) |
|---|---|---|
| Master Key | Only for encryption and decryption (max 4 KB) | Either encryption/decryption OR signing/verification, but not both |
| Master Key | Master key never leaves KMS | Public key can be downloaded, but private key never leaves KMS |
| Master Key | GenerateDataKey, GenerateDataKeyPair | Not supported |
| Master Key | Must use KMS API for cryptographic operations | No such restriction (use KMS API or outside AWS KMS) |
| Data Key | Encrypt/decrypt outside of AWS KMS (no limit on plaintext) | Use outside of AWS KMS to encrypt/decrypt or sign/verify |
| Data Key | AES-256 key | RSA or elliptic curve key pairs |

AWS KMS – Symmetric vs Asymmetric Key
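To check where an existing key falls in this comparison, its metadata can be inspected. The following is a small sketch, assuming a boto3 KMS client passed in as `kms`; `summarize_key` is a hypothetical helper, while `describe_key` and the `KeyMetadata` fields it reads are part of the KMS API.

```python
def summarize_key(kms, key_id: str) -> dict:
    """Return the DescribeKey fields relevant to the symmetric/asymmetric comparison."""
    meta = kms.describe_key(KeyId=key_id)["KeyMetadata"]
    # Newer responses carry KeySpec; older ones only CustomerMasterKeySpec.
    spec = meta.get("KeySpec") or meta.get("CustomerMasterKeySpec")
    return {
        "key_id": meta["KeyId"],
        "state": meta.get("KeyState"),
        "manager": meta.get("KeyManager"),  # "AWS" or "CUSTOMER"
        "usage": meta.get("KeyUsage"),      # e.g. "ENCRYPT_DECRYPT" or "SIGN_VERIFY"
        "symmetric": spec == "SYMMETRIC_DEFAULT",
    }
```

A key whose usage is `SIGN_VERIFY` cannot be used for encryption, which matches the "but not both" row of the table.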

Many AWS services are integrated with KMS and use CMKs to protect data both at rest and in transit. These operations are built in, which lets users offload the work and have it done transparently. As of now, these services use symmetric CMKs for encryption. If your use case doesn’t require many CMKs (each costs $1/month) or client-side encryption, prefer letting AWS services encrypt on your behalf over doing it yourself.

Symmetric encryption process in KMS

Encryption in KMS using Symmetric data key

Encryption occurs outside KMS, using either the AWS Encryption SDK or any compatible library. The encrypted data key needn’t be stored together with the encrypted text; it can be stored externally, e.g. in S3 or DynamoDB. However, developers must track which encrypted data key belongs to which ciphertext, because a lost or forgotten key leaves no way to retrieve the plaintext. This process is also called envelope encryption: encrypting plaintext data with a data key, and then encrypting the data key under another key.

Symmetric decryption process in KMS

Decryption in KMS using Symmetric data key

Similar to encryption, decryption is a 2-step process: we first retrieve the plaintext version of the data key from KMS by passing it the encrypted data key, and then decrypt the data locally with it. Every data key generated by KMS is unique, so passing a different encrypted data key will not decrypt the data.
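The encryption and decryption flows described above can be sketched as a round trip. This is a minimal illustration under stated assumptions: `kms` is a boto3 KMS client passed in by the caller, the function names are hypothetical, and the local cipher is a toy HMAC-SHA256 keystream used only so that the sketch runs with the Python standard library. It is not authenticated encryption; in practice the AWS Encryption SDK or a vetted AES-GCM implementation would take its place.

```python
import hashlib
import hmac
import os


def _toy_stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 keystream. Stand-in for AES-GCM, NOT for production."""
    stream = bytearray()
    for counter in range((len(data) + 31) // 32):
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        stream.extend(block.digest())
    return bytes(a ^ b for a, b in zip(data, stream))


def envelope_encrypt(kms, key_id: str, plaintext: bytes) -> dict:
    # Step 1: KMS returns the data key twice: in plaintext and encrypted under the CMK.
    data_key = kms.generate_data_key(KeyId=key_id, KeySpec="AES_256")
    nonce = os.urandom(16)
    # Step 2: encrypt locally; only the encrypted data key is kept with the result.
    return {
        "encrypted_data_key": data_key["CiphertextBlob"],
        "nonce": nonce,
        "ciphertext": _toy_stream_xor(data_key["Plaintext"], nonce, plaintext),
    }


def envelope_decrypt(kms, envelope: dict) -> bytes:
    # Step 1: ask KMS to decrypt the data key. Step 2: decrypt the data locally.
    plaintext_key = kms.decrypt(CiphertextBlob=envelope["encrypted_data_key"])["Plaintext"]
    return _toy_stream_xor(plaintext_key, envelope["nonce"], envelope["ciphertext"])
```

The returned dictionary is only one possible layout. As noted earlier, the encrypted data key could equally be stored elsewhere, as long as it can be matched to its ciphertext later.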

Let us take an example of how S3 encrypts and decrypts data on behalf of a user. Though there are multiple ways to protect data in S3, we will focus on SSE-S3 and SSE-KMS.

  1. S3 requests a new data key from KMS using the GenerateDataKey API. In return, it gets two copies of the data key: one in plaintext and one encrypted under the CMK (the default “aws/s3” or a customer managed CMK).

  2. It encrypts the S3 object using the plaintext data key and stores only the encrypted object.

  3. Finally, S3 removes the plaintext data key from memory as soon as possible and stores the encrypted data key in the object’s metadata. The data key itself is not shared with users. With SSE-S3, the keys are managed entirely by S3 and no key identifier is exposed. With SSE-KMS, the ID of the CMK that protects the data key can be read from the “SSEKMSKeyId” value in the object’s metadata when calling a HEAD or GET operation.
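From the caller's side, the steps above are triggered simply by asking S3 for server-side encryption on upload. The sketch below assumes a boto3 S3 client passed in as `s3`; the helper names are hypothetical, while `ServerSideEncryption` and `SSEKMSKeyId` are parameters and response fields of the S3 `put_object` and `head_object` operations.

```python
def put_encrypted_object(s3, bucket, key, body, use_kms=True, kms_key_id=None):
    """Upload an object with SSE-KMS ("aws:kms") or SSE-S3 ("AES256")."""
    params = {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms" if use_kms else "AES256",
    }
    if use_kms and kms_key_id:
        # Without this, S3 falls back to the AWS managed key for S3.
        params["SSEKMSKeyId"] = kms_key_id
    return s3.put_object(**params)


def encryption_info(s3, bucket, key):
    """Read back the encryption mode and, for SSE-KMS, the key identifier."""
    head = s3.head_object(Bucket=bucket, Key=key)
    return head.get("ServerSideEncryption"), head.get("SSEKMSKeyId")
```

For an SSE-S3 object the second value is expected to be `None`, since no KMS key identifier is reported in that mode.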

Not every service requests both copies of the data key. Amazon Elastic Block Store (EBS) uses a different approach. EBS requests only an encrypted version of the data key using GenerateDataKeyWithoutPlaintext and stores it in the volume metadata. It defers obtaining the plaintext data key until the volume is attached to an EC2 instance. The plaintext data key then resides in hypervisor memory, where it is used to encrypt disk I/O to the volume, and it persists there as long as the volume is attached to the instance.
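The deferred pattern can be illustrated with a short sketch. It is loosely modelled on the description above and is not how EBS is implemented internally. It assumes a boto3 KMS client passed in as `kms`; `create_volume_key` and `AttachedVolume` are hypothetical names, while `generate_data_key_without_plaintext` and `decrypt` are KMS API operations.

```python
def create_volume_key(kms, key_id: str) -> bytes:
    """At creation time, request only the encrypted data key and keep it as metadata."""
    response = kms.generate_data_key_without_plaintext(KeyId=key_id, KeySpec="AES_256")
    return response["CiphertextBlob"]


class AttachedVolume:
    """Holds the plaintext data key in memory only while the volume is 'attached'."""

    def __init__(self, kms, encrypted_key: bytes):
        # The plaintext key is obtained only now, at attach time.
        self._key = kms.decrypt(CiphertextBlob=encrypted_key)["Plaintext"]

    @property
    def attached(self) -> bool:
        return self._key is not None

    def detach(self) -> None:
        # Drop the reference so the plaintext key no longer lives in this object.
        self._key = None
```

The benefit of this design is that the component creating the volume never handles the plaintext key at all; only the component that performs I/O does.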

In the second part of this blog, I will discuss asymmetric encryption and signing/verification.

Thanks for reading; I welcome your feedback.
