AKS Autoscaling
AKS provides two autoscaling mechanisms:
Cluster Autoscaler
Horizontal Pod Autoscaler
Cluster Autoscaler
When pod demand changes, the Kubernetes cluster autoscaler adjusts the number of nodes in the node pool based on the compute resources that the pods request.
# Create a resource group
az group create --name myResourceGroup --location eastus
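# Create an AKS cluster with the cluster autoscaler enabled.
# --node-count 1 suits a test cluster; 3 is recommended for a production-grade cluster.
# --enable-cluster-autoscaler turns on the cluster autoscaler, bounded by --min-count and --max-count.
# (Comments are kept above the command: a comment after a trailing backslash breaks the line continuation.)
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 1 \
  --vm-set-type VirtualMachineScaleSets \
  --load-balancer-sku standard \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 3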
# Enable the cluster autoscaler on an existing cluster
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 3
In the Azure portal, go to your AKS cluster > Settings > Node pools and select Autoscale.
Notes:
By default, the cluster autoscaler checks the Metrics API server every 10 seconds for any required changes in node count.
The cluster autoscaler works with Kubernetes RBAC-enabled AKS clusters that run Kubernetes 1.10.x or higher.
The cluster autoscaler is typically used alongside the horizontal pod autoscaler. When combined, the horizontal pod autoscaler increases or decreases the number of pods based on application demand, and the cluster autoscaler adjusts the number of nodes needed to run those pods.
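To make the relationship between pod count and node count concrete, here is a rough sketch. It is not the cluster autoscaler's actual algorithm (the real one simulates scheduling of pending pods). It assumes every pod requests the same amount of CPU and every node offers the same allocatable CPU. The function name `nodes_needed` and all numbers are hypothetical; only the `min_count` and `max_count` bounds mirror the `--min-count` and `--max-count` flags used above.

```python
import math


def nodes_needed(pod_count, cpu_request_millicores, node_allocatable_millicores,
                 min_count, max_count):
    """Estimate the node count for identical pods on identical nodes.

    Simplified illustration only: the result is clamped to the
    [min_count, max_count] range, like --min-count / --max-count.
    """
    # How many pods fit on one node, judged by CPU requests alone (assumption).
    pods_per_node = node_allocatable_millicores // cpu_request_millicores
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on a node")
    required = math.ceil(pod_count / pods_per_node)
    # Never go below the minimum or above the maximum node count.
    return max(min_count, min(max_count, required))


# Hypothetical numbers: 500m CPU per pod, 1900m allocatable per node.
print(nodes_needed(3, 500, 1900, min_count=1, max_count=3))   # 1
print(nodes_needed(6, 500, 1900, min_count=1, max_count=3))   # 2
print(nodes_needed(10, 500, 1900, min_count=1, max_count=3))  # 3 (capped at max_count)
```

In the last case the estimate would be 4 nodes, but the cap of 3 applies, so in a real cluster some pods would stay Pending until the maximum is raised.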
Horizontal Pod Autoscaler
You set a target CPU utilization percentage, and the HPA scales the number of pod replicas in or out to meet it.
The HPA needs the Kubernetes Metrics Server to read the CPU metrics of a pod.
1) The HPA checks data from the Metrics Server every 15 seconds.
2) The HPA calculates the required number of replicas.
3) The HPA requests that the application's replicas be scaled in or out.
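The calculation in step 2 can be sketched as follows. This is a minimal illustration of the replica formula described in the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization), skipped when the ratio is within a tolerance (10% by default) and clamped to the min and max replica counts. The function name `desired_replicas` and the sample numbers are assumptions for illustration. The real controller also handles missing metrics, pod readiness, and stabilization, which are omitted here.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified sketch of the HPA replica calculation for a CPU target."""
    ratio = current_cpu_percent / target_cpu_percent
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        # Multiply before dividing to limit floating-point rounding error.
        desired = math.ceil(
            current_replicas * current_cpu_percent / target_cpu_percent
        )
    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Target of 50% CPU, between 3 and 10 replicas (sample values).
print(desired_replicas(3, 100, 50, 3, 10))  # 6  (load is double the target)
print(desired_replicas(3, 52, 50, 3, 10))   # 3  (within tolerance, no change)
print(desired_replicas(4, 20, 50, 3, 10))   # 3  (would be 2, raised to min)
print(desired_replicas(8, 90, 50, 3, 10))   # 10 (would be 15, capped at max)
```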
HPA requirements:
metrics data
CPU %
min replicas
max replicas
# Create an HPA for a deployment; 50 is an example CPU target (%), matching the manifest below
kubectl autoscale deployment mydeployment --cpu-percent=50 --min=2 --max=10
HPA manifest
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: store-front-hpa
spec:
  maxReplicas: 10 # maximum replica count
  minReplicas: 3 # minimum replica count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: store-front # must match your deployment name
  targetCPUUtilizationPercentage: 50 # target CPU utilization (%)
# Check the status of the HPA
kubectl describe hpa/store-front-hpa
Notes:
By default, the HPA checks the Metrics API every 15 seconds for any required changes in replica count, and the Metrics API retrieves data from the Kubelet every 60 seconds.