Docker containers are treated as ephemeral, especially when they are managed in a Kubernetes cluster. Starting and restarting containers in the cluster happens automatically and works like a breeze. Things get more complicated as soon as you decide to keep state in your cluster. This is typically done by attaching volumes to the cluster (writing to a host-mounted volume is an absolute no-no, except for experimental purposes).
On AWS you would typically use dynamically provisioned PersistentVolumes backed by EBS or EFS. Once you attach a volume, its AWS volume ID is tracked by Kubernetes.
In the output below you can see that the volume ID is aws://eu-central-1c/vol-a559b271b5fca7dfa:
$ kubectl get pv pvc-a0f6a3ea-ab5a-a1ea-a2ba-a2809894a24a -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: aws-ebs-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
  creationTimestamp: 2018-02-06T15:17:17Z
  labels:
    failure-domain.beta.kubernetes.io/region: eu-central-1
    failure-domain.beta.kubernetes.io/zone: eu-central-1c
  name: pvc-a0f6a3ea-ab5a-a1ea-a2ba-a2809894a24a
  resourceVersion: "125511"
  selfLink: /api/v1/persistentvolumes/pvc-a0f6a3ea-ab5a-a1ea-a2ba-a2809894a24a
  uid: a0f6a3ea-ab5a-a1ea-a2ba-a2809894a24a
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: ext4
    volumeID: aws://eu-central-1c/vol-a559b271b5fca7dfa
  capacity:
    storage: 20Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: mysql-pv-claim
    namespace: default
    resourceVersion: "22108"
    uid: a0f6a3ea-ab5a-a1ea-a2ba-a2809894a24a
  persistentVolumeReclaimPolicy: Delete
  storageClassName: gp2
status:
  phase: Bound
Once that volume has been created, you lose the ability to control its encryption status or to change its KMS key.
Why would you want to do such a thing in the first place? Say, for example, you created a volume unencrypted, wrote a lot of application state to it, and now want to encrypt the volume. Or you have encrypted your volumes but now decide to use a customer-managed KMS key (CMK) and therefore need to change the KMS key of that volume. A simple way would be to create a second volume in Kubernetes and then copy the files over. However, when you do not have access to the applications that are using these volumes, this might not be that easy, especially if you have hundreds of volumes.
In a pure AWS world, a process exists for this (sketched with the AWS CLI after the list):
- detach the volume (by stopping the instance),
- create a snapshot,
- copy snapshot using the correct encryption method,
- create a volume from the snapshot with the correct parameters (specifically the AZ) and
- attach it to the instance.
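A minimal sketch of those steps with the AWS CLI; the snapshot, KMS key, and instance IDs below are placeholders, while the volume ID is the one from the PV above:

# snapshot the (now detached) source volume
aws ec2 create-snapshot --volume-id vol-a559b271b5fca7dfa --description "re-encryption source"
aws ec2 wait snapshot-completed --snapshot-ids snap-0123456789abcdef0

# copy the snapshot within the region, encrypting it with the desired KMS key
aws ec2 copy-snapshot --source-region eu-central-1 --source-snapshot-id snap-0123456789abcdef0 \
    --encrypted --kms-key-id alias/my-ebs-key

# create the new volume in the SAME AZ as the original, then attach it
aws ec2 create-volume --snapshot-id snap-0fedcba9876543210 --availability-zone eu-central-1c --volume-type gp2
aws ec2 attach-volume --volume-id vol-0123new456789abcd --instance-id i-0123456789abcdef0 --device /dev/xvdf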
For volumes that are attached to nodes (not masters), you can stop the instances. But before you do that, suspend the Launch process on the corresponding ASG (Auto Scaling Group). Otherwise the ASG will start a new instance, and the EBS volume will probably end up mounted there.
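With the AWS CLI this looks roughly as follows (the ASG name is a placeholder):

aws autoscaling suspend-processes --auto-scaling-group-name nodes.mycluster.example.com --scaling-processes Launch
# ...stop/terminate the instance and exchange the volume...
aws autoscaling resume-processes --auto-scaling-group-name nodes.mycluster.example.com --scaling-processes Launch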
If you terminate the instances (as you probably should in an immutable world), you need to make sure that the ASG starts the new ones in the same AZ. I had a case with 2 nodes across 3 AZs, and sure enough the round-robin ASG started the new instance in the previously unused AZ. The problem is that EBS volumes are AZ-bound, so the volume could not be attached to the new instance since the two were in different AZs.
However, this procedure changes the volume ID that Kubernetes tracks. Here our beloved kubectl patch comes in handy to point the PV at the new volume's ID:
kubectl patch pv pvc-a0f6a3ea-ab5a-a1ea-a2ba-a2809894a24a -p '{"spec":{"awsElasticBlockStore":{"volumeID":"aws://eu-central-1c/vol-a559b271b5fca7dfa"}}}'
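You can verify that the patch took effect with:

kubectl get pv pvc-a0f6a3ea-ab5a-a1ea-a2ba-a2809894a24a -o jsonpath='{.spec.awsElasticBlockStore.volumeID}'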
You will also need to copy all the tags to the new volume and modify the tags on the old volume.
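One way to do that with the AWS CLI (the new volume ID is a placeholder):

# dump the old volume's tags in the format create-tags expects
aws ec2 describe-volumes --volume-ids vol-a559b271b5fca7dfa --query 'Volumes[0].Tags' > tags.json
# apply them to the new volume
aws ec2 create-tags --resources vol-0123new456789abcd --tags file://tags.json
# then adjust the old volume's tags, e.g. rename it so nothing picks it up again
aws ec2 create-tags --resources vol-a559b271b5fca7dfa --tags Key=Name,Value=replaced-unencrypted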
In summary, on Kubernetes I have successfully tried the following procedure:
- gracefully shut down the persistent application,
- pick a volume to exchange and its corresponding node (pv/volumeID and no/volumesAttached; see the lookup sketch after this list),
- suspend Launch on node ASG,
- terminate the instance (volume becomes available),
- check application pods are pending,
- create a snapshot,
- copy snapshot using the correct encryption method,
- create a volume from the snapshot with the correct parameters (specifically AZ and size),
- copy all tags from original volume (and modify tags on original volume),
- kubectl patch the PV's volumeID,
- resume Launch on node ASG,
- wait for a new instance (check it is in the same AZ as the old one),
- check pods are running and application is working fine.
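For the second step, something along these lines works (jq is assumed to be available):

# volume ID tracked by the PV
kubectl get pv pvc-a0f6a3ea-ab5a-a1ea-a2ba-a2809894a24a -o jsonpath='{.spec.awsElasticBlockStore.volumeID}'
# node that currently has the volume attached
kubectl get nodes -o json | \
  jq -r '.items[] | select(.status.volumesAttached[]?.name | contains("vol-a559b271b5fca7dfa")) | .metadata.name'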
Only use procedures like this in your experimental environments. I take no responsibility for any damage!