Kubernetes Cluster Setup for Akash Providers

Overview

Akash leases are deployed as Kubernetes pods on provider clusters. This guide details the build of the provider's Kubernetes control plane and worker nodes.
Setting up the Kubernetes cluster is the provider's responsibility. This guide offers best practices and recommendations, but it is not a comprehensive Kubernetes tutorial and assumes pre-existing Kubernetes knowledge.

Quickstart Guides

  • Create a Kubernetes cluster and start your first provider
  • Already have a Kubernetes cluster? Start here!

STEP 1 - Clone the Kubespray Project

Cluster Creation Recommendations

We recommend using the Kubespray project to deploy a cluster. Kubespray uses Ansible to make the deployment of a Kubernetes cluster easy.
The recommended minimum number of hosts for a production provider Kubernetes cluster is four. This allows:
  • Three hosts serving as redundant control plane (master) instances
  • One host serving as a Kubernetes worker node that hosts provider leases
  • NOTE - the number of control plane nodes in the cluster should always be odd so that the cluster can reach consensus.
While a single Kubernetes host may be used for testing and development, it is not recommended for production.

Kubernetes Cluster Hardware Requirements and Recommendations

Kubernetes Master Node Requirements
  • Minimum Specs
    • 2 CPU
    • 4 GB RAM
    • 30 GB disk
  • Recommended Specs
    • 4 CPU
    • 8 GB RAM
    • 40 GB disk
Kubernetes Worker Node Requirements
  • Minimum Specs
    • 4 CPU
    • 8 GB RAM
    • 100 GB disk
  • Recommendations
    • The more resources the better, depending on the maximum number of concurrent deployments you want to host.
    • CPU is especially important and is usually the first bottleneck on a worker node. For example, a node with 8 CPU, 100 GB RAM, and 2 TB disk would be CPU-bound: deployments typically request at least 1 CPU each, so the node could host at most 8 deployments, and realistically about 6, since roughly 2 CPU are reserved for Kubernetes system components.

Kubespray Clone

Install Kubespray on a machine that has connectivity to the hosts that will serve as the Kubernetes cluster.
Obtain Kubespray and navigate into the created local directory:
git clone https://github.com/kubernetes-sigs/kubespray.git ; cd kubespray

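Optionally, you can pin Kubespray to a released tag rather than the tip of master so the playbooks line up with the dependency versions pinned in STEP 2. A minimal sketch (the tag name is yours to choose; list what is available first):
git tag --list               # show available Kubespray release tags
git checkout <release-tag>   # check out the release you intend to deploy
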
STEP 2 - Install Ansible

When you launch Kubespray it will use an Ansible playbook to deploy a Kubernetes cluster. In this step we will install Ansible.
Depending on your operating system it may be necessary to install OS patches, pip3, and virtualenv. Example steps for an Ubuntu OS are detailed below.
apt-get update ; apt-get install -y python3-pip virtualenv
Within the kubespray directory use the following commands for the purpose of:
  • Opening a Python virtual environment for the Ansible install
  • Installing Ansible and other necessary packages specified in the requirements.txt file
virtualenv --python=python3 venv

source venv/bin/activate

rm requirements.txt

cat > requirements.txt << 'EOF'
ansible==4.8.0
ansible-core==2.11.6
cryptography==2.8
jinja2==2.11.3
netaddr==0.7.19
pbr==5.4.4
jmespath==0.9.5
ruamel.yaml==0.16.10
ruamel.yaml.clib==0.2.6
MarkupSafe==1.1.1
EOF

pip3 install -r requirements.txt

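After the install completes, it can be worth confirming that the pinned Ansible version is the one in use (run inside the virtualenv):
ansible --version
The output should report ansible core 2.11.x, matching the ansible-core pin in requirements.txt.
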
STEP 3 - Ansible Access to Kubernetes Cluster

Ansible will configure the Kubernetes hosts via SSH. The user Ansible connects with must be root or have the capability of escalating privileges to root.
Commands in this step provide an example set up of SSH access to Kubernetes hosts and testing those connections.
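If you plan to connect as a non-root user rather than root, that user must be able to escalate to root without a password prompt. A minimal sketch, assuming a hypothetical user named ansible (run as root on each Kubernetes host):
# allow the "ansible" user (hypothetical) to escalate to root without a password
echo 'ansible ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/ansible
chmod 0440 /etc/sudoers.d/ansible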

Create SSH Keys on Ansible Host

ssh-keygen -t rsa -C $(hostname) -f "$HOME/.ssh/id_rsa" -P "" ; cat ~/.ssh/id_rsa.pub

Confirm SSH Keys

  • The keys will be stored in the user’s home directory.
  • Use these commands to verify keys.
cd ~/.ssh ; ls
  • Files created:
authorized_keys id_rsa id_rsa.pub

Copy Public Key to all of the Kubernetes Hosts

  • Template:
ssh-copy-id -i ~/.ssh/id_rsa.pub <username>@<ip-address>
  • Example:
ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.0.10.136
ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.0.10.239
ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.0.10.253
ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.0.10.9

Confirm SSH to the Kubernetes Hosts

  • Ansible should be able to access Kubernetes hosts with no password
  • Template:
ssh -i ~/.ssh/id_rsa <username>@<ip-address>
  • Example:
ssh -i ~/.ssh/id_rsa root@10.0.10.136
ssh -i ~/.ssh/id_rsa root@10.0.10.239
ssh -i ~/.ssh/id_rsa root@10.0.10.253
ssh -i ~/.ssh/id_rsa root@10.0.10.9

STEP 4 - Ansible Inventory

Ansible will use an inventory file to determine the hosts Kubernetes should be installed on.

Inventory File

  • Use the following commands on the Ansible host and in the “kubespray” directory
  • Replace the IP addresses in the declare command with the addresses of your Kubernetes hosts
  • Running these commands will create a hosts.yaml file within the kubespray/inventory/akash directory
cp -rfp inventory/sample inventory/akash

declare -a IPS=(10.0.10.136 10.0.10.239 10.0.10.253 10.0.10.9)

CONFIG_FILE=inventory/akash/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
  • Expected result:
DEBUG: Adding group all
DEBUG: Adding group kube_control_plane
DEBUG: Adding group kube_node
DEBUG: Adding group etcd
DEBUG: Adding group k8s_cluster
DEBUG: Adding group calico_rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node4 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube_control_plane
DEBUG: adding host node2 to group kube_control_plane
DEBUG: adding host node3 to group kube_control_plane
DEBUG: adding host node1 to group kube_node
DEBUG: adding host node2 to group kube_node
DEBUG: adding host node3 to group kube_node
DEBUG: adding host node4 to group kube_node
  • Example of the generated hosts.yaml file
  • Update the kube_control_plane category if needed with the full list of hosts that should serve as control plane (master) nodes
all:
  hosts:
    node1:
      ansible_host: 10.0.10.136
      ip: 10.0.10.136
      access_ip: 10.0.10.136
    node2:
      ansible_host: 10.0.10.239
      ip: 10.0.10.239
      access_ip: 10.0.10.239
    node3:
      ansible_host: 10.0.10.253
      ip: 10.0.10.253
      access_ip: 10.0.10.253
    node4:
      ansible_host: 10.0.10.9
      ip: 10.0.10.9
      access_ip: 10.0.10.9
  children:
    kube_control_plane:
      hosts:
        node1:
        node2:
        node3:
    kube_node:
      hosts:
        node1:
        node2:
        node3:
        node4:
    etcd:
      hosts:
        node1:
        node2:
        node3:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
    calico_rr:
      hosts: {}

Manual Edits/Insertions of the hosts.yaml Inventory File

  • Open the hosts.yaml file in vi or nano to make the following updates
  • Within the YAML file's "all" stanza and prior to the "hosts" sub-stanza, insert the following vars stanza
vars:
  cluster_id: "1.0.0.1"
  ansible_user: root
  gvisor_enabled: true
  • The hosts.yaml file should look like this once finished
all:
  vars:
    cluster_id: "1.0.0.1"
    ansible_user: root
    gvisor_enabled: true
  hosts:
    node1:
      ansible_host: 10.0.10.136
      ip: 10.0.10.136
      access_ip: 10.0.10.136
    node2:
      ansible_host: 10.0.10.239
      ip: 10.0.10.239
      access_ip: 10.0.10.239
    node3:
      ansible_host: 10.0.10.253
      ip: 10.0.10.253
      access_ip: 10.0.10.253
    node4:
      ansible_host: 10.0.10.9
      ip: 10.0.10.9
      access_ip: 10.0.10.9
  children:
    kube_control_plane:
      hosts:
        node1:
        node2:
        node3:
    kube_node:
      hosts:
        node1:
        node2:
        node3:
        node4:
    etcd:
      hosts:
        node1:
        node2:
        node3:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
    calico_rr:
      hosts: {}

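With the inventory finalized, an optional sanity check is to confirm Ansible can reach and escalate privileges on every host before building the cluster (run from the "kubespray" directory inside the virtualenv):
ansible -i inventory/akash/hosts.yaml all -m ping -b --private-key=~/.ssh/id_rsa
Each node should return "ping": "pong".
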
STEP 5 - Enable gVisor

In this section we will enable gVisor, which provides a sandboxed runtime for additional container isolation and security.
  • From the “kubespray” directory:
cd inventory/akash/group_vars/k8s_cluster
  • Using VI or nano edit the k8s-cluster.yml file:
vi k8s-cluster.yml
  • Update the container_manager key if necessary to containerd
container_manager: containerd
  • From the “kubespray” directory:
cd inventory/akash/group_vars
  • Using VI or nano edit the etcd.yml file:
vi etcd.yml
  • Update the etcd_deployment_type key if necessary to host
etcd_deployment_type: host

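To double-check both edits before deploying, the values can be confirmed from the "kubespray" directory (optional):
grep 'container_manager' inventory/akash/group_vars/k8s_cluster/k8s-cluster.yml
grep 'etcd_deployment_type' inventory/akash/group_vars/etcd.yml
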
gVisor Issue - No system-cgroup v2 Support

If you are using a newer systemd version, your containers will get stuck in the ContainerCreating state on your provider with gVisor enabled. Please refer to this document for details on the issue and the recommended workaround.

STEP 6 - Create Kubernetes Cluster

With inventory in place we are ready to build the Kubernetes cluster via Ansible.
  • Note - the cluster creation may take several minutes to complete
  • From the “kubespray” directory:
ansible-playbook -i inventory/akash/hosts.yaml -b -v --private-key=~/.ssh/id_rsa cluster.yml

STEP 7 - Confirm Kubernetes Cluster

A couple of quick Kubernetes cluster checks are in order before moving on to the next steps.
  • SSH into Kubernetes node1 (a Kubernetes control plane/master node)

Confirm Kubernetes Nodes

kubectl get nodes
  • Example output from a healthy Kubernetes cluster:
NAME    STATUS   ROLES                  AGE     VERSION
node1   Ready    control-plane,master   8m47s   v1.22.4
node2   Ready    control-plane,master   8m17s   v1.22.4
node3   Ready    control-plane,master   8m17s   v1.22.4
node4   Ready    <none>                 7m11s   v1.22.4

Confirm Kubernetes Pods

kubectl get pods -n kube-system
  • Example output of the kube-system pods that serve as the brains of the cluster:
NAME                                      READY   STATUS    RESTARTS        AGE
calico-kube-controllers-5788f6558-mzm64   1/1     Running   1 (4m53s ago)   4m54s
calico-node-2g4pr                         1/1     Running   0               5m29s
calico-node-6hrj4                         1/1     Running   0               5m29s
calico-node-9dqc4                         1/1     Running   0               5m29s
calico-node-zt8ls                         1/1     Running   0               5m29s
coredns-8474476ff8-9sgm5                  1/1     Running   0               4m32s
coredns-8474476ff8-x67xd                  1/1     Running   0               4m27s
dns-autoscaler-5ffdc7f89d-lnpmm           1/1     Running   0               4m28s
kube-apiserver-node1                      1/1     Running   1               7m30s
kube-apiserver-node2                      1/1     Running   1               7m13s
kube-apiserver-node3                      1/1     Running   1               7m3s
kube-controller-manager-node1             1/1     Running   1               7m30s
kube-controller-manager-node2             1/1     Running   1               7m13s
kube-controller-manager-node3             1/1     Running   1               7m3s
kube-proxy-75s7d                          1/1     Running   0               5m56s
kube-proxy-kpxtm                          1/1     Running   0               5m56s
kube-proxy-stgwd                          1/1     Running   0               5m56s
kube-proxy-vndvs                          1/1     Running   0               5m56s
kube-scheduler-node1                      1/1     Running   1               7m37s
kube-scheduler-node2                      1/1     Running   1               7m13s
kube-scheduler-node3                      1/1     Running   1               7m3s
nginx-proxy-node4                         1/1     Running   0               5m58s
nodelocaldns-7znkj                        1/1     Running   0               4m28s
nodelocaldns-g8dqm                        1/1     Running   0               4m27s
nodelocaldns-gf58m                        1/1     Running   0               4m28s
nodelocaldns-n88fj                        1/1     Running   0               4m28s

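If gVisor was enabled in STEP 5, you can optionally confirm the runtime landed on the worker node(s). This is a hedged check that assumes the Kubespray default containerd config path:
# run on a worker node
runsc --version                          # gVisor runtime binary installed during cluster creation
grep runsc /etc/containerd/config.toml   # containerd config should reference the runsc runtime
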
STEP 8 - Custom Resource Definition

Akash uses two Kubernetes Custom Resource Definitions (CRDs) to store each deployment.
  • On the Kubernetes master node, download and install the Akash CRD file.

Download and Install the CRD File

wget https://raw.githubusercontent.com/ovrclk/akash/master/pkg/apis/akash.network/crd.yaml

kubectl apply -f ./crd.yaml

Confirm the CRD Installation

kubectl get crd | grep akash

Expected CRD Output

# kubectl get crd | grep akash

manifests.akash.network       2022-04-27T14:39:32Z
providerhosts.akash.network   2022-04-27T14:39:47Z

STEP 9 - Network Policy And Akash Namespace

Network Policy Configuration

  • On the Kubernetes master node, download the network policy YAML file
wget https://raw.githubusercontent.com/ovrclk/akash/mainnet/main/_docs/kustomize/networking/network-policy-default-ns-deny.yaml
  • Install the YAML File
kubectl apply -f ./network-policy-default-ns-deny.yaml

Namespace Addition

  • Apply the Akash namespace YAML file and other networking customizations
git clone --depth 1 -b mainnet/main https://github.com/ovrclk/akash.git
cd akash
kubectl apply -f _docs/kustomize/networking/namespace.yaml
kubectl kustomize _docs/kustomize/akash-services/ | kubectl apply -f -

cat >> _docs/kustomize/akash-hostname-operator/kustomization.yaml <<'EOF'
images:
  - name: ghcr.io/ovrclk/akash:stable
    newName: ghcr.io/ovrclk/akash
    newTag: 0.14.1
EOF

kubectl kustomize _docs/kustomize/akash-hostname-operator | kubectl apply -f -

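To confirm the hostname operator deployed successfully, check the pods in the namespace the manifests create (expected to be akash-services; adjust if your namespace differs):
kubectl get pods -n akash-services
A hostname operator pod should reach the Running state.
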
STEP 10 - Ingress Controller

The Akash provider requires an ingress controller in the Kubernetes cluster.

Ingress Controller Install

  • On the Kubernetes master node, download and install the ingress controller YAML files
wget https://raw.githubusercontent.com/ovrclk/akash/v0.14.1/_run/ingress-nginx-class.yaml
kubectl apply -f ./ingress-nginx-class.yaml

wget https://raw.githubusercontent.com/ovrclk/akash/v0.14.1/_run/ingress-nginx.yaml
kubectl apply -f ./ingress-nginx.yaml
  • Expected result:
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
configmap/ingress-nginx-controller created
configmap/ingress-nginx-tcp created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
service/ingress-nginx-controller-admission created
service/ingress-nginx-controller created
deployment.apps/ingress-nginx-controller created
ingressclass.networking.k8s.io/nginx created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
serviceaccount/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created

Ingress Controller Configuration

A Kubernetes node needs to be labeled for ingress use. This will cause the NGINX ingress controller to run only on the labeled node.
NOTE - if a wildcard domain is created for the provider, its DNS records should point to the labeled node's IP address. Additional nodes can be labeled to load balance ingress traffic.
kubectl label nodes node4 akash.network/role=ingress

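To verify the label was applied and see where the ingress controller pods are running:
kubectl get nodes -l akash.network/role=ingress
kubectl get pods -n ingress-nginx -o wide
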
STEP 11 - Disable Swap on Kubernetes Hosts

The Kubelet process on Kubernetes worker nodes does not support swap. Issue the following command on each worker node to disable swap.
swapoff -a
In addition, ensure that swap remains disabled across reboots on each host:
  • Open the /etc/fstab file
  • Find any swap line and add a # (hash) sign in front of it to comment out the entire line
vi /etc/fstab

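If you prefer a non-interactive edit, a common pattern is to comment out any swap entries with sed and then confirm no swap remains active (the sed invocation is an example; review /etc/fstab afterwards):
sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab   # keeps a backup at /etc/fstab.bak
free -h                                     # the Swap line should read 0B after swapoff -a
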
STEP 12 - Review Firewall Policies

External/Internet Firewall Rules

The following firewall rules are applicable to internet-facing Kubernetes components.
Akash Provider
8443/tcp - for manifest uploads
Akash Ingress Controller
80/tcp - for web app deployments
443/tcp - for web app deployments
30000-32767/tcp - for Kubernetes node port range for deployments
30000-32767/udp - for Kubernetes node port range for deployments

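As an illustration only (assuming ufw on Ubuntu hosts; adapt to your firewall tooling), the internet-facing rules above could be opened like this on the provider/ingress node:
ufw allow 8443/tcp
ufw allow 80/tcp
ufw allow 443/tcp
ufw allow 30000:32767/tcp
ufw allow 30000:32767/udp
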
Internal (LAN) Firewall Rules

If local firewall instances are running on Kubernetes control-plane and worker nodes, add the following policies.
Etcd Key Value Store Policies
Ensure the following ports are open inbound on all Kubernetes etcd instances:
- 2379/tcp for client requests; (Kubernetes control plane to etcd)
- 2380/tcp for peer communication; (etcd to etcd communication)
API Server Policies
Ensure the following ports are open inbound on all Kubernetes API server instances:
- 6443/tcp - Kubernetes API server
Worker Node Policies
Ensure the following ports are open inbound on all Kubernetes worker nodes:
- 10250/tcp - Kubelet API server; (Kubernetes control plane to kubelet)
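A comparable hedged example of the internal rules with ufw, limiting the source to the cluster subnet used in this guide's examples (replace 10.0.10.0/24 with your own LAN range):
# on etcd / control plane nodes
ufw allow from 10.0.10.0/24 to any port 2379:2380 proto tcp
ufw allow from 10.0.10.0/24 to any port 6443 proto tcp
# on worker nodes
ufw allow from 10.0.10.0/24 to any port 10250 proto tcp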