Advanced Autoscaling on EKS

Autoscaling Options

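| Type | What It Scales | Metric Source |
|------|----------------|---------------|
| HPA | Pod replicas | CPU, memory, custom metrics |
| VPA | Pod resource requests | Historical resource usage |
| KEDA | Pod replicas and jobs | 50+ external event sources |
| Cluster Proportional Vertical Autoscaler | Pod resource requests | Cluster size (nodes, cores) |
| Karpenter | Nodes | Pending (unschedulable) pods |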

Horizontal Pod Autoscaler (HPA)

Custom Metrics HPA

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
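
The manifest above tunes only scaleDown, so scale-up falls back to the HPA defaults. As a sketch, those defaults can be written out explicitly and then adjusted. The values below mirror the documented autoscaling/v2 defaults and are not taken from the original manifest.

```yaml
# Fragment to merge under spec.behavior of the HPA above.
# Values mirror the upstream autoscaling/v2 defaults; tune as needed.
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0   # react to rising load immediately
    selectPolicy: Max               # apply whichever policy allows more pods
    policies:
    - type: Percent
      value: 100                    # at most double the replica count...
      periodSeconds: 15
    - type: Pods
      value: 4                      # ...or add 4 pods, per 15-second period
      periodSeconds: 15
```

Lowering the Percent value or lengthening periodSeconds makes scale-up more conservative, at the cost of slower reaction to traffic spikes.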

HPA with Prometheus Metrics

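# Add the chart repository, then install the Prometheus adapter
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring \
  --set prometheus.url=http://prometheus-server \
  --set prometheus.port=9090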
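
The HPA's http_requests_per_second metric only exists if the adapter has a rule that derives it from a Prometheus series. The following is a minimal sketch of such a rule as Helm values. It assumes the application exports a counter named http_requests_total with namespace and pod labels, and the file name adapter-values.yaml is hypothetical.

```yaml
# adapter-values.yaml (hypothetical file name); pass it to helm with -f.
rules:
  custom:
  # Assumption: the app exports a counter http_requests_total
  # labelled with "namespace" and "pod".
  - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      # Expose http_requests_total as http_requests_per_second
      matches: "^(.*)_total$"
      as: "${1}_per_second"
    # Per-pod request rate over a 2-minute window
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```

Once the adapter is running with this rule, querying the custom metrics API with kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" should list the metric if the series is present in Prometheus.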

Vertical Pod Autoscaler (VPA)

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: app
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
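
Auto mode lets the VPA evict pods to apply new requests, and the VPA project advises against combining it with an HPA that scales on the same CPU or memory metric. A more cautious starting point, offered here as an assumption rather than the original setup, is recommendation-only mode:

```yaml
# Fragment: replaces updatePolicy in the VPA above.
updatePolicy:
  updateMode: "Off"   # compute recommendations only; never evict or resize pods
```

In this mode the recommendations are written to the object's status and can be inspected with kubectl describe vpa my-app-vpa before switching to "Auto".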

KEDA

Install KEDA

helm repo add kedacore https://kedacore.github.io/charts
helm repo update

helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace

ScaledObject with Prometheus

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler
spec:
  scaleTargetRef:
    name: my-app
  minReplicaCount: 2
  maxReplicaCount: 20
  cooldownPeriod: 300
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      metricName: http_requests_total  # deprecated in newer KEDA releases; the query below defines the metric
      threshold: "100"
      query: sum(rate(http_requests_total[2m]))

KEDA Scalers

| Scaler | Use Case |
|--------|----------|
| prometheus | Metrics-based scaling |
| mysql | Database connection pool |
| redis | Queue length |
| aws-sqs-queue | SQS message count |
| kafka | Topic lag |
| cron | Time-based scaling |
| rabbitmq | Queue depth |
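
Scalers can be combined in a single ScaledObject, in which case the trigger that demands the most replicas wins. As a sketch, a cron trigger could sit alongside the Prometheus trigger shown earlier to hold a higher floor during business hours. The time zone, schedule, and replica count below are illustrative assumptions.

```yaml
# Fragment: an extra entry for spec.triggers of the ScaledObject above.
triggers:
- type: cron
  metadata:
    timezone: America/New_York   # assumed; any IANA time zone name
    start: "0 8 * * 1-5"         # scale up at 08:00, Monday to Friday
    end: "0 18 * * 1-5"          # release the floor at 18:00
    desiredReplicas: "10"        # replicas to hold inside the window
```

Outside the window the cron trigger contributes nothing, so the replica count is governed by the remaining triggers and minReplicaCount.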

Cluster Proportional Vertical Autoscaler

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-proportional-vertical-autoscaler
  namespace: kube-system
data:
  config: |
    {
      "clusterName": "my-cluster",
      "minReplicaCount": 1,
      "maxReplicaCount": 10,
      "metricsServer": {
        "port": 8080
      },
      "resources": [
        {
          "namespace": "ingress-nginx",
          "controller": "deployment/ingress-nginx-controller",
          "containerName": "controller",
          "resources": {
            "requests": {
              "cpu": "100m",
              "memory": "128Mi"
            }
          }
        }
      ]
    }
