HOW TO BACK UP AND RESTORE AN ENTIRE K8S CLUSTER

If you have deployed applications on a k8s cluster using various objects like deployments, pods, and services, all the information about the cluster is stored in the ETCD cluster. If your application uses persistent storage, that is a separate backup scenario to consider.

We create these resources either imperatively or declaratively. The declarative way is preferred, because you can save the configuration files for future use. A good practice is to store the definition files in a source code repository such as GitHub, so they can be maintained by the team. If your cluster is ever destroyed, you can redeploy everything from the pre-created definition files in GitHub.
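For example, a minimal sketch of redeploying a destroyed cluster from such a repository might look like this (the repository URL and directory name here are hypothetical):

# Hypothetical repository that holds the cluster's definition files
git clone https://github.com/example-org/k8s-manifests.git
cd k8s-manifests

# Recreate all objects from the saved definition files
kubectl apply -R -f .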

A better approach to backing up the resource configuration is to query the kube-apiserver, either with kubectl or by accessing the kube-apiserver directly, and save a copy of the resource configuration for all objects created on the cluster. For example, you can use the following command to take a backup:

kubectl get all --all-namespaces -o yaml > backup.yml

This covers only a few resource groups; there are many other resources that must be considered. For that there is a tool called Velero (by Heptio) that can do this for you. It helps take backups of the k8s cluster through the kube-apiserver.
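As a rough sketch, once Velero is installed and configured with a backup storage location (setup steps not shown here), taking and restoring a cluster-wide backup looks roughly like this; the backup name is just an example:

# Back up all namespaces to the configured storage location
velero backup create full-cluster-backup

# Later, restore the cluster from that backup
velero restore create --from-backup full-cluster-backup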

Now let's look at ETCD.

ETCD stores information about the state of the cluster, from nodes to pods. Instead of backing up the resources as before, you may choose to back up ETCD itself. The ETCD cluster is configured on the master nodes. While configuring ETCD you specify the location where its data is stored, i.e. the "data-dir", and that directory can be backed up with a backup tool.
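On a kubeadm-provisioned cluster (an assumption; the path can differ in other setups), you can see the configured data directory in the etcd static pod manifest:

# The --data-dir flag shows where etcd keeps its data, typically /var/lib/etcd
grep data-dir /etc/kubernetes/manifests/etcd.yaml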

ETCD also comes with a built-in snapshot solution. You can take a snapshot of the ETCD database using etcdctl.

Execute the following command in a terminal to take the backup. You can find the required options (cacert, cert, key, endpoints) by running ETCDCTL_API=3 etcdctl snapshot save -h, and then pass them to the snapshot save command:

ETCDCTL_API=3 etcdctl snapshot save --cacert=/etc/kubernetes/pki/etcd/ca.crt  --cert=/etc/kubernetes/pki/etcd/server.crt --endpoints=127.0.0.1:2379 --key=/etc/kubernetes/pki/etcd/server.key /tmp/snapshot-pre-boot.db
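etcdctl also has a snapshot status subcommand, which you can use to verify the snapshot you just saved, for example:

# Prints the snapshot's hash, revision, total keys and size
ETCDCTL_API=3 etcdctl snapshot status /tmp/snapshot-pre-boot.db --write-out=table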

Optionally, execute the following command to view the member list of the ETCD cluster:

 
ETCDCTL_API=3 etcdctl member list --cacert=/etc/kubernetes/pki/etcd/ca.crt  --cert=/etc/kubernetes/pki/etcd/server.crt --endpoints=127.0.0.1:2379 --key=/etc/kubernetes/pki/etcd/server.key

If you want to restore from the backup, execute the following commands

service kube-apiserver stop

ETCDCTL_API=3 etcdctl snapshot restore -h

Copy and fill in the options from its output, for example:

--data-dir="/var/lib/etcd-from-backup"
--initial-advertise-peer-urls="https://127.0.0.1:2380"
--initial-cluster="master=https://127.0.0.1:2380"
--initial-cluster-token="etcd-cluster-1"
--name="master"

You also have to include the cacert, cert, key, and endpoints options, so the final command looks like this:

ETCDCTL_API=3 etcdctl snapshot restore \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --endpoints=127.0.0.1:2379 \
  --key=/etc/kubernetes/pki/etcd/server.key \
  --data-dir="/var/lib/etcd-from-backup" \
  --initial-advertise-peer-urls="https://127.0.0.1:2380" \
  --initial-cluster="master=https://127.0.0.1:2380" \
  --initial-cluster-token="etcd-cluster-1" \
  --name="master" \
  /tmp/snapshot-pre-boot.db

We are not done yet; we have to make changes to the ETCD pod definition file.

Include the following changes:

Under the command section of the pod definition you will find --data-dir; change its value to the one we used in the restore command, so the option becomes:

--data-dir=/var/lib/etcd-from-backup

Also add the cluster token option from the restore command, --initial-cluster-token=etcd-cluster-1, under the same command section of the pod definition.

In the same file, under the volumeMounts and volumes sections, replace the mountPath and hostPath with /var/lib/etcd-from-backup.
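As a quick sanity check (assuming a kubeadm setup where the manifest is at /etc/kubernetes/manifests/etcd.yaml), you can confirm that the data-dir option, the volume hostPath, and the volumeMount mountPath all point to the restored directory:

# All occurrences should reference the restored data directory
grep -n etcd-from-backup /etc/kubernetes/manifests/etcd.yaml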

Then check whether the ETCD container is up, reload the unit files, and restart the services:

docker ps -a | grep etcd
systemctl daemon-reload
service etcd restart
service kube-apiserver start
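Once the kube-apiserver is back up, a simple way to verify the restore (a rough check, not an exhaustive one) is to list the cluster's objects again:

# The nodes and workloads that existed at snapshot time should be listed again
kubectl get nodes
kubectl get all --all-namespaces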

When ETCD restores from a backup, it initializes a new cluster configuration and configures its members as members of a new cluster. This prevents a restored member from accidentally joining an existing cluster. That is why, during a restore, you specify a new cluster token along with the same initial cluster configuration options specified in the original configuration.

We have seen both types of backup: using the resource configuration and using etcdctl. Both have pros and cons. If you are using a managed k8s service like AKS, you may not even have access to the ETCD cluster, in which case backing up by querying the kube-apiserver is probably the better way.
