When it comes to managing workloads in a cluster, Kubernetes is often the tool of choice, thanks to its open-source nature and ever-expanding user base. As a container orchestrator, it removes the need to micromanage the numerous ephemeral containers that host the various parts of an application, grouped together in Pods. Each of these containers has its own independent storage and life cycle. Because this differs from running traditional virtual machines (VMs), these applications face new challenges. One such challenge is file storage: anything a container writes to its own filesystem is lost when the container goes away. To resolve this issue, Kubernetes has the concept of volumes, which allow pods to have persistent storage space. In this blog, we will take a quick look at leveraging AWS EBS for Kubernetes Persistent Volumes.
Kubernetes Persistent Volumes
As of this blog, there are two categories of volumes in Kubernetes: normal volumes and persistent volumes. Persistent volumes come with the added luxury of being independent of the life cycle of any pod they are attached to. Not only that, but they are more flexible than standard volumes, for example by letting users specify size and performance needs. Kubernetes volumes also come in a multitude of types to fit a user's needs. One such type of persistent volume is AWSElasticBlockStore, which is the type this blog will focus on.
Why go to the cloud?
Great question! It may seem like a bold move to suddenly trust a third party to store your cluster's data, especially if it contains confidential data. However, this decision has a lot of merit, despite any initial hesitation. By relying on another service, the cluster's infrastructure is greatly simplified. As we will see shortly, connecting a cloud provider's volume to your cluster is fairly straightforward. It also cuts the cost of maintaining an in-house server to host the same solution. What's more, a cloud provider has built-in reliability, security, and high availability that it takes care of in the background. All the end user needs to worry about is using the service in their applications. This separation of operations will prove worth its weight in gold in the long run.
Now that we have addressed the Why?, let's take a quick dive into the How?
To properly utilize a cloud provider’s storage for persistent volumes, one must have the following:
- A working Kubernetes cluster that is hosted on AWS. This can either be done on EC2 instances (which is what this blog post was written with in mind) or using the AWS EKS service. The cluster also needs to have the flag --cloud-provider=aws enabled on the kubelet, api-server, and controller-manager during the cluster's creation. One way to incorporate this flag is by using kubeadm init --config config.yaml when creating a new cluster. An example of what goes in a config.yaml:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  extraArgs:
    cloud-provider: aws
controllerManager:
  extraArgs:
    cloud-provider: aws
    address: 0.0.0.0
networking:
  podSubnet: <the-value-you-put-for-the-pod-address-cidr-flag>
scheduler:
  extraArgs:
    address: 0.0.0.0
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    cloud-provider: aws
As a best practice, it is recommended to host your cluster in the same environment that your volumes will reside in. Otherwise, you will run into issues concerning data transfer/upload speeds.
- Each instance in the cluster must have its hostname set to its private DNS entry. The quickest way to do this is to run the following commands on your EC2 instances:
sudo sed -i "s/$(hostname)/$(curl http://169.254.169.254/latest/meta-data/hostname)/g" /etc/hosts
sudo sed -i "s/$(hostname)/$(curl http://169.254.169.254/latest/meta-data/hostname)/g" /etc/hostname
sudo reboot
- Create the AWS Elastic Block Store (EBS) volume in the same region as your cluster, and in the same availability zone as the instance it will be attached to. If you have the aws cli installed and configured, this command will create one for you:
aws ec2 create-volume --availability-zone=eu-west-1a --size=10 --volume-type=gp2
- Attach this new volume to the master node in your cluster. If you have the aws cli installed and configured, this command will do it for you:
aws ec2 attach-volume --device /dev/xvdf --instance-id <MASTER NODE ID> --volume-id <YOUR VOLUME ID>
- On the master node, check that the device is attached to your instance by running lsblk. If the last step worked, you should see your volume at the bottom of the list. In this case, the volume I made earlier is called nvme1n1:
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
loop0         7:0    0 17.9M  1 loop /snap/amazon-ssm-agent/1068
loop1         7:1    0 89.3M  1 loop /snap/core/6673
nvme0n1     259:0    0   25G  0 disk
└─nvme0n1p1 259:1    0   25G  0 part /
nvme1n1     259:2    0   10G  0 disk
- With the name of the volume, create the filesystem on the volume. This only needs to be done once on the volume.
sudo mkfs -t xfs /dev/<NAME OF VOLUME FROM PREV STEP>
- Create a Persistent Volume that associates the EBS volume you made with the cluster. An example of such a volume looks like this:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: aws-pv
  labels:
    type: aws-pv
spec:
  capacity:
    storage: 3Gi
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    volumeID: <YOUR EBS VOLUME ID HERE>
    fsType: xfs
- Create the Persistent Volume Claim that will bind to the Persistent Volume we just made. An example of such a claim looks like this:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: aws-pvc
  labels:
    type: aws-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  selector:
    matchLabels:
      type: <THE type LABEL OF THE PV YOU MADE EARLIER>
- Create a Pod that takes the Persistent Volume Claim we just made and mounts it into the Pod. An example of such a pod looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: redis-cloud
spec:
  volumes:
    - name: cloud-storage
      persistentVolumeClaim:
        claimName: <NAME OF CLAIM YOU MADE EARLIER>
  containers:
    - name: redis
      image: redis
      volumeMounts:
        - name: cloud-storage
          mountPath: /cloud/data
- Run the following kubectl commands on your cluster:
kubectl create -f pv.yaml
kubectl create -f pvc.yaml
To verify that your volume and claim are associated, run kubectl get pvc and look for the name of the PVC that you created:
NAME      STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
aws-pvc   Bound    aws-pv   3Gi        RWO                           3s
If its status says Bound, everything is working!
With your PVC bound to the PV, now run:
kubectl create -f redis-cloud.yaml
Once it is up, verify that the volume has been properly mounted onto the pod by running kubectl describe pod redis-cloud. If the Events section looks like the following, the volume mounted successfully!
Events:
  Type    Reason                  Age   From                                                  Message
  ----    ------                  ----  ----                                                  -------
  Normal  Scheduled               17s   default-scheduler                                     Successfully assigned default/redis-cloud-2 to ip-172-31-23-218.us-west-1.compute.internal
  Normal  SuccessfulAttachVolume  15s   attachdetach-controller                               AttachVolume.Attach succeeded for volume "aws-pv"
  Normal  Pulling                 7s    kubelet, ip-172-31-23-218.us-west-1.compute.internal  Pulling image "redis"
  Normal  Pulled                  2s    kubelet, ip-172-31-23-218.us-west-1.compute.internal  Successfully pulled image "redis"
  Normal  Created                 2s    kubelet, ip-172-31-23-218.us-west-1.compute.internal  Created container redis
  Normal  Started                 2s    kubelet, ip-172-31-23-218.us-west-1.compute.internal  Started container redis
Exec into the pod using kubectl exec -it nameOfPod -- /bin/bash and verify that the volume is at the mount point that we specified (in this case, it should be at /cloud/data).
You’re done! Feel free to add files to that directory. Even if the pod is deleted, once a pod is spun up again, whether from the exact same yaml that we provided or as a brand new pod that mounts the same claim, those files should still be there.
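One way to try this with a brand new pod is sketched below. It is only an illustration: the pod name pv-check, the busybox image, and the /data mount path are assumptions that are not part of the walkthrough above. It also assumes the claim is named aws-pvc and that the redis-cloud pod has already been deleted, so the volume is free to attach.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pv-check              # hypothetical name, chosen for this sketch
spec:
  restartPolicy: Never        # run the listing once, then stop
  volumes:
    - name: cloud-storage
      persistentVolumeClaim:
        claimName: aws-pvc    # assumes the claim created earlier
  containers:
    - name: lister
      image: busybox          # assumed image; any image with ls would do
      command: ["ls", "-l", "/data"]
      volumeMounts:
        - name: cloud-storage
          mountPath: /data    # a different path than redis-cloud used, on purpose
```

After creating it with kubectl create -f, running kubectl logs pv-check should list whatever was written under /cloud/data by the earlier pod.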
NOTE: As of this blog post, the EBS volume integration with Kubernetes PVs only works on one node at a time. This means that two nodes cannot mount the same EBS volume at once. Thus, when making deployments using PVs that are backed by EBS, be sure that all pods using the volume are scheduled onto the instance that has the volume attached to it.
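A simple way to keep a pod on a particular instance is a nodeSelector on the well-known kubernetes.io/hostname label. The manifest below is a minimal sketch under that assumption, not the only way to do it. The pod name redis-cloud-pinned is hypothetical, the claim is assumed to be aws-pvc, and the label value has to be copied from your own node (kubectl get nodes --show-labels will show it).

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: redis-cloud-pinned    # hypothetical name, chosen for this sketch
spec:
  # Only schedule onto the node whose hostname label matches this value.
  nodeSelector:
    kubernetes.io/hostname: <HOSTNAME LABEL OF THE NODE THAT SHOULD HOLD THE VOLUME>
  volumes:
    - name: cloud-storage
      persistentVolumeClaim:
        claimName: aws-pvc    # assumes the claim created earlier
  containers:
    - name: redis
      image: redis
      volumeMounts:
        - name: cloud-storage
          mountPath: /cloud/data
```

The same nodeSelector block can go into the pod template of a deployment, so that every replica sharing the claim ends up on the same instance.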
But what about Dynamic Storage Provisioning?
Another good question! One of the downsides of the method above is that an operator needs to create the storage resource itself on a cloud provider and then link it to a Persistent Volume. Once that is done, the developer can then create a Persistent Volume Claim to use said deployed PV. However, there is a way to provision storage resources on the fly through the use of Storage Classes. With this approach, the Storage Class provisions the needed storage resource in the cloud, using the specified provisioner. For this to work, the cluster must be granted the proper permissions to deploy these resources.
Storage Class objects are declared like the following (note that this is the format for a Storage Class utilizing EBS; for more details on other cloud providers, refer to the Kubernetes documentation on Storage Classes):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-storage-class
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: '10'
  fsType: xfs
Once sc.yaml has been deployed into the cluster, all an operator needs to do when it comes to provisioning volumes for their developers is create a Persistent Volume that has one additional parameter:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: aws-pv-sc
  labels:
    type: sc
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ebs-storage-class # NEW PARAMETER
Then, a developer who needs to utilize a Persistent Volume creates and deploys the following Persistent Volume Claim for their own use:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: aws-pvc-sc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  storageClassName: ebs-storage-class # NEW PARAMETER
  selector:
    matchLabels:
      type: sc
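For comparison, a claim can also lean on the Storage Class alone, with no hand-made Persistent Volume behind it. The Kubernetes documentation notes that a claim with a non-empty selector cannot have a volume dynamically provisioned for it, so the sketch below leaves the selector out. The claim name aws-pvc-dynamic and the 10Gi size are assumptions chosen for illustration. The sketch also assumes that the ebs-storage-class above exists and that the cluster is allowed to create EBS volumes.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: aws-pvc-dynamic       # hypothetical name, chosen for this sketch
spec:
  accessModes:
    - ReadWriteOnce
  # The class named here decides which provisioner creates the volume.
  storageClassName: ebs-storage-class
  resources:
    requests:
      storage: 10Gi           # assumed size; adjust to what the volume type allows
  # No selector: a claim with a selector is not dynamically provisioned.
```

If provisioning works, kubectl get pv should show a volume created on the claim's behalf, and a pod can then reference the claim through claimName exactly as before.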
Volumes are what make applications running in Kubernetes pods much more reliable. No longer does the operations team need to be concerned with keeping data safe from deletion or loss when a pod goes away. By leveraging cloud providers like AWS to back your Kubernetes persistent volumes, the cluster will stay reliable in both performance and operation.