AWS Kubernetes Orchestration with EKS

Introduction to EKS Orchestration

Amazon Elastic Kubernetes Service (EKS) is AWS's managed Kubernetes platform. AWS operates the control plane (API Server, Scheduler, Controller Manager, and etcd), while workloads run on worker nodes backed by EC2 instances or on serverless Fargate capacity. Stateless applications are managed with Deployments and stateful services with StatefulSets; both run as Pods containing one or more application Containers. Traffic reaches Pods through Services and ALB Ingress controllers and is secured by IAM roles and VPC networking. Persistent storage options include EBS volumes and EFS filesystems, while observability is provided through CloudWatch metrics and Prometheus monitoring. EKS also integrates with Secrets Manager for credentials, and clusters can mix EC2 and Fargate compute, which makes the platform suitable for microservices, batch processing, and machine learning workloads.

EKS takes over control plane operations, leaving teams to manage workloads, node capacity, and the AWS integrations they need.

Use Cases

EKS supports a variety of cloud-native workloads:

Microservices: Deploy decoupled services with Deployments and Ingress for scalable APIs.
Machine Learning: Orchestrate ML pipelines with Kubeflow on EKS for training and inference.
Stateful Applications: Run databases like PostgreSQL or MongoDB using StatefulSets with EBS.
Batch Processing: Manage data processing jobs with Kubernetes Jobs and CronJobs.
Hybrid Deployments: Use EKS Anywhere for consistent Kubernetes across on-premises and AWS.

EKS’s flexibility supports diverse workloads with tailored orchestration strategies.
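The Batch Processing case above can be sketched as a CronJob. This is a minimal illustration, not a production job: the name nightly-report, the busybox image, the schedule, and the resource figures are all assumptions chosen for the example.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report            # hypothetical name
  namespace: default
spec:
  schedule: "0 2 * * *"           # run once a day at 02:00
  concurrencyPolicy: Forbid       # skip a run if the previous one is still active
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 2             # retry a failed pod at most twice
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: busybox:1.36   # placeholder image for illustration
            command: ["sh", "-c", "echo processing batch; sleep 5"]
            resources:
              requests:
                cpu: "100m"
                memory: "64Mi"
```

Each scheduled run creates a Job, which in turn creates a pod that the scheduler places on any node with spare capacity.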

EKS Architecture Diagram

The diagram illustrates an Amazon EKS cluster architecture with the following components:

Core Components

Control Plane: Fully managed by AWS; includes the API Server, Scheduler, Controller Manager, and etcd
Worker Nodes: Run in the customer VPC as either EC2 instances or Fargate pods
Deployments: Manage stateless applications via ReplicaSets
StatefulSets: Manage stateful applications with persistent storage
Pods: Smallest deployable units running one or more Containers

Networking

Services: Internal load balancing (ClusterIP) and external exposure (NodePort/LoadBalancer)
ALB Ingress: AWS Application Load Balancer for HTTP/HTTPS traffic routing
VPC: Provides network isolation and security groups

AWS Integrations

IAM: Provides fine-grained access control via IRSA (IAM Roles for Service Accounts)
EBS/EFS: Block storage (EBS) and shared filesystem (EFS) for persistent volumes
CloudWatch: Centralized monitoring and logging
Secrets Manager: Secure credential storage and rotation

Key Features

Multi-tenant architecture with namespace isolation
Both EC2 and Fargate compute options
Integration with AWS security services (IAM, Secrets Manager)
End-to-end observability (CloudWatch, Prometheus)
High availability through multiple availability zones

graph TD
    %% Styling
    classDef client fill:#ffeb3b,stroke:#fbc02d;
    classDef ingress fill:#ff6f61,stroke:#e53935;
    classDef service fill:#ff6f61,stroke:#e53935;
    classDef deployment fill:#405de6,stroke:#1e88e5;
    classDef pod fill:#405de6,stroke:#1e88e5;
    classDef container fill:#2ecc71,stroke:#27ae60;
    classDef control fill:#ff6f61,stroke:#e53935;
    classDef node fill:#405de6,stroke:#1e88e5;
    classDef aws fill:#9b59b6,stroke:#8e44ad;
    classDef storage fill:#2ecc71,stroke:#27ae60;
    classDef fargate fill:#3498db,stroke:#2980b9;

    %% Nodes
    A[Client]
    B[ALB Ingress]
    C[Kubernetes Service]
    D[Deployment: App]
    E[StatefulSet: DB]
    F[Pod: App]
    G[Pod: MongoDB]
    H[(Container: App)]
    I[(Container: DB)]
    J[(EBS Volume)]
    K[(S3 Bucket)]
    L[EKS Control Plane]
    M[API Server]
    N[Scheduler]
    O[Controller Manager]
    P[EC2 Worker Node]
    Q[Fargate Pod]
    R[(IAM)]
    S[(VPC)]
    T[(CloudWatch)]
    U[(Prometheus)]
    V[(EFS)]
    W[(Secrets Manager)]
    X[etcd]

    %% Connections
    A -->|HTTPS| B
    B -->|Routes| C
    C -->|Exposes| D
    C -->|Exposes| E
    D -->|Manages| F
    E -->|Manages| G
    F -->|Runs| H
    G -->|Runs| I
    G -->|Stores| J
    G -->|Backup| K
    L -->|Contains| M
    L -->|Contains| N
    L -->|Contains| O
    M -->|Stores state in| X
    M -->|Orchestrates| D
    M -->|Orchestrates| E
    N -->|Schedules| F
    N -->|Schedules| G
    O -->|Reconciles| D
    O -->|Reconciles| E
    P -->|Hosts| F
    P -->|Hosts| G
    Q -->|Serverless| F
    R -->|IRSA| F
    R -->|IRSA| G
    S -->|Networking| P
    S -->|Networking| Q
    T -->|Monitoring| L
    T -->|Monitoring| P
    U -->|Scrapes| F
    U -->|Scrapes| G
    V -->|Shared FS| F
    W -->|Secrets| G

    %% Styles
    class A client;
    class B ingress;
    class C service;
    class D,E deployment;
    class F,G pod;
    class H,I container;
    class L control;
    class M,N,O control;
    class P node;
    class Q fargate;
    class J,K storage;
    class R,S,T,U,V,W aws;
    class X control;

    %% Link Styles
    linkStyle 0 stroke:#ffeb3b,stroke-width:2px,stroke-dasharray:5;
    linkStyle 1,2,3 stroke:#ff6f61,stroke-width:2px;
    linkStyle 4,5 stroke:#405de6,stroke-width:2px;
    linkStyle 6,7 stroke:#2ecc71,stroke-width:2px;
    linkStyle 8 stroke:#2ecc71,stroke-width:2px;
    linkStyle 9 stroke:#9b59b6,stroke-width:2px;
    linkStyle 10,11,12 stroke:#ff6f61,stroke-width:2px;
    linkStyle 13 stroke:#ff6f61,stroke-width:2px;
    linkStyle 14,15 stroke:#ff6f61,stroke-width:2px;
    linkStyle 16,17 stroke:#ff6f61,stroke-width:2px;
    linkStyle 18,19 stroke:#ff6f61,stroke-width:2px;
    linkStyle 20,21 stroke:#405de6,stroke-width:2px;
    linkStyle 22 stroke:#3498db,stroke-width:2px;
    linkStyle 23,24 stroke:#9b59b6,stroke-width:2px;
    linkStyle 25,26 stroke:#9b59b6,stroke-width:2px;
    linkStyle 27,28 stroke:#9b59b6,stroke-width:2px;
    linkStyle 29,30 stroke:#9b59b6,stroke-width:2px;
    linkStyle 31 stroke:#9b59b6,stroke-width:2px;
    linkStyle 32 stroke:#9b59b6,stroke-width:2px;

EKS orchestrates stateful and stateless workloads with robust AWS integrations for networking, security, and storage.
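One way to stand up a cluster with the shape described above (managed control plane, EC2 node group, Fargate capacity, multiple Availability Zones) is an eksctl ClusterConfig file. The sketch below assumes eksctl is the provisioning tool, which the text does not prescribe; the node group name, instance type, sizes, zones, and the serverless namespace are illustrative.

```yaml
# Sketch of an eksctl ClusterConfig; names, zones, and sizes are assumptions.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-eks-cluster
  region: us-west-2
availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2c"]
iam:
  withOIDC: true                 # registers the OIDC provider that IRSA relies on
managedNodeGroups:
  - name: general                # EC2 worker nodes
    instanceType: m5.large
    minSize: 2
    maxSize: 6
    desiredCapacity: 3
    privateNetworking: true      # place nodes in private subnets
fargateProfiles:
  - name: fp-serverless          # pods in this namespace run on Fargate
    selectors:
      - namespace: serverless
```

Pods created in the serverless namespace would be scheduled onto Fargate, while everything else lands on the EC2 node group.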

Key EKS Components

EKS leverages Kubernetes components with AWS-specific integrations:

Pods: Smallest deployable units hosting containers with shared storage and network.
Deployments: Manage stateless pod replicas with rolling updates; pair with a Horizontal Pod Autoscaler for auto-scaling.
StatefulSets: Manage stateful pods with stable identities and ordered scaling for databases.
Services: Provide stable endpoints (e.g., ClusterIP, LoadBalancer) for pods, integrated with AWS ELB.
Ingress (ALB): Routes external traffic via AWS Application Load Balancer with path-based routing and TLS.
EKS Control Plane: Managed API Server, Scheduler, Controller Manager, and etcd for cluster orchestration.
Worker Nodes: EC2 instances organized into node groups (backed by Auto Scaling Groups), or Fargate capacity, running pods.
IAM Integration: IAM Roles for Service Accounts (IRSA) secure pod access to AWS services.
VPC Networking: AWS VPC CNI plugin assigns pod IPs from VPC subnets for seamless networking.
Storage: EBS and EFS provide persistent volumes via CSI drivers and StorageClasses.
Observability: CloudWatch collects metrics and logs, Prometheus scrapes cluster and application metrics, and X-Ray provides distributed traces.
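As an illustration of the Ingress (ALB) component, the sketch below routes HTTPS requests under /api to a backend Service. It assumes the AWS Load Balancer Controller is installed in the cluster; the Service name my-app-service, the path, and the ACM certificate ARN are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress           # hypothetical name
  namespace: default
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip        # send traffic straight to pod IPs
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
    # Placeholder ARN; replace with a real ACM certificate.
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-west-2:123456789012:certificate/EXAMPLE
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /api               # path-based routing
        pathType: Prefix
        backend:
          service:
            name: my-app-service # hypothetical Service in front of the app pods
            port:
              number: 80
```

With target-type ip the load balancer registers pod IPs directly, which works because the VPC CNI gives pods addresses from the VPC subnets.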

Benefits of EKS Orchestration

EKS delivers significant advantages for cloud-native applications:

Managed Control Plane: AWS operates the Kubernetes control plane across multiple Availability Zones and handles its patching and version upgrades.
Dynamic Scaling: Horizontal Pod Autoscaler and Cluster Autoscaler adjust pods and nodes based on demand.
Security: IRSA, RBAC, Network Policies, and VPC isolation secure workloads.
AWS Integration: Native support for S3, RDS, SQS, and other services via SDKs and IRSA.
Self-Healing: Automatically restarts, reschedules, or replaces failed pods/nodes.
Observability: Comprehensive monitoring with CloudWatch, Prometheus, and X-Ray.
Portability: Kubernetes standards and EKS Anywhere enable multi-cloud and hybrid deployments.
Developer Productivity: Managed infrastructure reduces operational burden for developers.

Implementation Considerations

Deploying EKS effectively requires addressing key considerations:

Cluster Sizing: Configure node groups and Cluster Autoscaler for workload variability.
Security Hardening: Use IRSA, RBAC, Network Policies, Pod Security Standards, and KMS encryption.
Networking Design: Plan VPC subnets, CNI settings, and security groups for pod communication.
Resource Optimization: Set pod requests/limits, use spot instances, and apply Savings Plans.
Observability Setup: Deploy Prometheus/Grafana, CloudWatch, and X-Ray for metrics, logs, and traces.
CI/CD Integration: Use ArgoCD, GitHub Actions, or CodePipeline for automated deployments.
Storage Management: Configure StorageClasses for EBS/EFS and manage persistent volume claims.
Cost Monitoring: Track usage with AWS Cost Explorer and optimize node types/sizes.
Resilience Testing: Use chaos engineering (e.g., Chaos Mesh) to validate failover mechanisms.
Compliance Requirements: Enable CloudTrail, encrypt data, and align with standards like SOC 2 or HIPAA.

Strategic planning for security, observability, and cost ensures robust EKS deployments.
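Two of the considerations above, Pod Security Standards and pod requests/limits, can be applied per namespace. The following is a sketch under assumptions: the namespace team-a, the baseline level, and the default resource figures are examples, not recommendations from the text.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a                                   # hypothetical tenant namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline # reject pods that violate the baseline profile
    pod-security.kubernetes.io/warn: restricted  # warn about pods that would fail "restricted"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources
  namespace: team-a
spec:
  limits:
  - type: Container
    defaultRequest:              # applied when a container sets no requests
      cpu: 100m
      memory: 128Mi
    default:                     # applied when a container sets no limits
      cpu: 500m
      memory: 512Mi
```

Defaulting requests this way keeps the scheduler and the autoscalers working from real numbers even when a manifest omits them.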

Troubleshooting Common EKS Issues

Common EKS issues and their resolutions:

Pod Pending: Check node capacity, pod resource requests, and CNI IP exhaustion; scale nodes or adjust VPC CNI settings (for example, prefix delegation).
Access Denied: Verify IRSA or RBAC configurations; ensure IAM roles have correct trust policies.
Network Errors: Inspect security groups, Network Policies, or VPC routing; validate CNI plugin.
High Latency: Analyze X-Ray traces or Prometheus metrics; optimize pod resources or scale nodes.
Storage Failures: Confirm EBS CSI driver installation and StorageClass definitions; check volume attachments.

Proactive monitoring and configuration validation minimize EKS operational issues.
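The CNI IP exhaustion mentioned under Pod Pending can be estimated ahead of time. Assuming the VPC CNI runs in its default secondary-IP mode (no prefix delegation), the pod ceiling per node follows from the instance type's network interface limits. The worked line uses m5.large, which supports 3 network interfaces with 10 IPv4 addresses each.

```latex
\[
\text{maxPods} = N_{\text{ENI}} \times \left(\text{IPs per ENI} - 1\right) + 2
\]
\[
\text{m5.large:}\quad 3 \times (10 - 1) + 2 = 29
\]
```

A node that has reached this ceiling leaves new pods Pending even when CPU and memory are still available.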

Example Configuration: StatefulSet for MongoDB

Below is a Kubernetes StatefulSet, Service, and StorageClass for a MongoDB deployment on EKS.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
  namespace: default
spec:
  selector:
    app: mongodb
  ports:
  - protocol: TCP
    port: 27017
    targetPort: 27017
  clusterIP: None
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
  namespace: default
spec:
  serviceName: mongodb-service
  replicas: 3
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: mongo:5.0
        ports:
        - containerPort: 27017
        resources:
          requests:
            cpu: "200m"
            memory: "256Mi"
          limits:
            cpu: "1000m"
            memory: "1Gi"
        volumeMounts:
        - name: mongo-data
          mountPath: /data/db
        livenessProbe:
          exec:
            command: ["mongo", "--eval", "db.adminCommand('ping')"]
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command: ["mongo", "--eval", "db.adminCommand('ping')"]
          initialDelaySeconds: 5
          periodSeconds: 5
      serviceAccountName: mongodb-sa  # must already exist in this namespace; not created by this manifest
  volumeClaimTemplates:
  - metadata:
      name: mongo-data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: ebs-sc
      resources:
        requests:
          storage: 10Gi

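This StatefulSet deploys three MongoDB pods, each with its own encrypted gp3 EBS volume and a stable DNS name (mongodb-0.mongodb-service, mongodb-1.mongodb-service, and so on) provided by the headless Service. As written, the pods run as independent instances; forming a replica set additionally requires starting mongod with --replSet and initiating the set.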
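The StatefulSet references a service account named mongodb-sa that the manifest itself does not create. A minimal definition might look like the sketch below; the IAM role annotation is optional and only shown as a comment, since the text does not say which AWS permissions the database pods need.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mongodb-sa
  namespace: default
  # Optional, and an assumption: attach an IAM role through IRSA if the pods
  # call AWS APIs (for example, to write backups to S3).
  # annotations:
  #   eks.amazonaws.com/role-arn: <role ARN>
```

Applying this before the StatefulSet avoids pod creation failing because the service account is missing.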

Example Configuration: Horizontal Pod Autoscaler

Below is a Kubernetes HorizontalPodAutoscaler configuration to scale a Deployment based on CPU usage.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

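This HPA keeps between 2 and 10 replicas of my-app-deployment, targeting an average CPU utilization of 70% of each pod's CPU request. It requires the Kubernetes Metrics Server and CPU requests on the target containers.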
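The scaling decision follows the replica formula documented for the Horizontal Pod Autoscaler. The worked line is an illustration with assumed numbers: 4 running replicas averaging 90% CPU utilization against the 70% target.

```latex
\[
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times
  \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
\[
\left\lceil 4 \times \frac{90}{70} \right\rceil = \lceil 5.14 \rceil = 6
\]
```

The result is then clamped to the range set by minReplicas and maxReplicas, here 2 to 10.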

Example Configuration: IAM Role for Service Account (IRSA)

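Below is a Terraform configuration to create an IAM role for an EKS service account. It assumes the cluster's OIDC issuer is already registered as an IAM OIDC identity provider and that the Terraform kubernetes provider is configured for the cluster.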

provider "aws" {
  region = "us-west-2"
}

data "aws_eks_cluster" "eks_cluster" {
  name = "my-eks-cluster"
}

data "aws_caller_identity" "current" {}

resource "aws_iam_role" "my_app_sa_role" {
  name = "my-app-sa-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Federated = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${replace(data.aws_eks_cluster.eks_cluster.identity[0].oidc[0].issuer, "https://", "")}"
        }
        Action = "sts:AssumeRoleWithWebIdentity"
        Condition = {
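          StringEquals = {
            # Only this service account may assume the role
            "${replace(data.aws_eks_cluster.eks_cluster.identity[0].oidc[0].issuer, "https://", "")}:sub" = "system:serviceaccount:default:my-app-sa"
            # Only tokens issued for AWS STS are accepted
            "${replace(data.aws_eks_cluster.eks_cluster.identity[0].oidc[0].issuer, "https://", "")}:aud" = "sts.amazonaws.com"
          }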
        }
      }
    ]
  })
}

resource "aws_iam_role_policy" "my_app_sa_policy" {
  name = "my-app-sa-policy"
  role = aws_iam_role.my_app_sa_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:PutObject"
        ]
        Resource = "arn:aws:s3:::my-app-bucket/*"
      }
    ]
  })
}

resource "kubernetes_service_account" "my_app_sa" {
  provider = kubernetes
  metadata {
    name      = "my-app-sa"
    namespace = "default"
    annotations = {
      "eks.amazonaws.com/role-arn" = aws_iam_role.my_app_sa_role.arn
    }
  }
}

This Terraform configuration enables pods to securely access S3 using IRSA.
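To check that the role is actually picked up, a one-off pod can run under the my-app-sa service account and ask STS who it is. This is a sketch under assumptions: the pod name irsa-check is hypothetical, and the amazon/aws-cli image is used because its entrypoint is the aws command.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: irsa-check               # hypothetical name
  namespace: default
spec:
  serviceAccountName: my-app-sa  # the service account annotated with the role ARN
  restartPolicy: Never           # run once and stop
  containers:
  - name: aws-cli
    image: amazon/aws-cli        # assumption: image whose entrypoint is the aws CLI
    args: ["sts", "get-caller-identity"]
```

If IRSA is wired correctly, the pod's log should report an assumed-role identity for my-app-sa-role rather than the node's instance role.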

Example Configuration: Cluster Autoscaler

Below is a Kubernetes configuration to deploy the Cluster Autoscaler on EKS.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/ClusterAutoscalerRole
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
      - name: cluster-autoscaler
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.21.0  # k8s.gcr.io is frozen; match the tag to the cluster's Kubernetes minor version
        resources:
          requests:
            cpu: "100m"
            memory: "300Mi"
          limits:
            cpu: "500m"
            memory: "600Mi"
        command:
        - ./cluster-autoscaler
        - --v=4
        - --cloud-provider=aws
        - --aws-region=us-west-2
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-eks-cluster
        env:
        - name: AWS_REGION
          value: us-west-2

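This configuration deploys the Cluster Autoscaler, which adds nodes when pods cannot be scheduled and removes nodes that stay underused. It omits the RBAC objects (ClusterRole, Role, and their bindings) that the autoscaler needs, and the IAM role named in the annotation must allow the Auto Scaling API calls it makes.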

Example Configuration: CI/CD with GitHub Actions

Below is a GitHub Actions workflow to deploy a Kubernetes application to EKS.

name: Deploy to EKS
on:
  push:
    branches:
      - main
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout code
      uses: actions/checkout@v3
    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v2
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: us-west-2
    - name: Update kubeconfig
      run: aws eks update-kubeconfig --name my-eks-cluster --region us-west-2
    - name: Deploy to EKS
      run: kubectl apply -f k8s/deployment.yaml

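This GitHub Actions workflow applies k8s/deployment.yaml to the cluster on every push to main. The long-lived access keys stored as repository secrets can be replaced with GitHub OIDC federation through the role-to-assume input of configure-aws-credentials.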
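The workflow applies k8s/deployment.yaml, whose contents are not shown. A minimal sketch of what that file might contain follows, reusing names from the other examples (my-app-deployment, the app: my-app label, my-app-sa, port 8080). The image URI, probe path, and resource figures are assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-deployment
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      serviceAccountName: my-app-sa
      containers:
      - name: my-app
        # Placeholder image URI; replace with the application's own repository.
        image: 123456789012.dkr.ecr.us-west-2.amazonaws.com/my-app:1.0.0
        ports:
        - containerPort: 8080
        resources:
          requests:              # CPU requests are what HPA utilization is measured against
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
        readinessProbe:
          httpGet:
            path: /healthz       # hypothetical health endpoint
            port: 8080
```

When an autoscaler manages the replica count, the replicas field is often left out of the manifest so that each apply does not reset it.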

Example Configuration: Network Policy

Below is a Kubernetes NetworkPolicy to restrict pod traffic.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: my-app-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: my-app
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
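  egress:
  - to:
    - podSelector:
        matchLabels:
          app: mongodb
    ports:
    - protocol: TCP
      port: 27017
  # Allow DNS lookups; without this rule the egress restriction also blocks
  # name resolution (CoreDNS on EKS carries the k8s-app: kube-dns label).
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53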

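This NetworkPolicy limits ingress to my-app pods to traffic from frontend pods on TCP 8080 and restricts their egress. Enforcement requires a network policy engine, such as the VPC CNI's network policy support or Calico; without one the policy has no effect.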
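Allow-list policies like the one above are usually paired with a namespace-wide default deny, so that pods without a policy of their own are not left open. A minimal sketch, assuming the default namespace and the hypothetical name default-deny-all:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all         # hypothetical name
  namespace: default
spec:
  podSelector: {}                # empty selector matches every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
  # No ingress or egress rules are listed, so all traffic is denied
  # unless another policy explicitly allows it.
```

Because NetworkPolicies are additive, the rules in per-application policies still apply on top of this baseline.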

Example Configuration: CloudWatch and Prometheus Monitoring

Below is a Helm values file to deploy Prometheus for EKS monitoring, integrated with CloudWatch.

prometheus:
  serviceMonitor:
    enabled: true
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    additionalScrapeConfigs:
    - job_name: kubernetes-pods
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
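    # CloudWatch exposes no Prometheus endpoint, so its metrics are scraped from
    # an exporter. The service name below is an assumption; 9106 is the default
    # port of the Prometheus cloudwatch_exporter.
    - job_name: cloudwatch
      static_configs:
      - targets: ['cloudwatch-exporter.monitoring.svc:9106']
      metrics_path: /metrics
grafana:
  enabled: true
  # Placeholder: set a strong password (or reference an existing Secret)
  # before exposing Grafana through a LoadBalancer.
  adminPassword: change-me
  service:
    type: LoadBalancer

This Helm configuration deploys Prometheus and Grafana, scraping annotated EKS pods directly and CloudWatch metrics through an exporter that must be deployed separately with IAM access to CloudWatch.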
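The kubernetes-pods job keeps only pods that carry the prometheus.io/scrape annotation set to true. A pod opts in as sketched below; the pod name, image URI, and port are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: metrics-demo             # hypothetical name
  namespace: default
  annotations:
    prometheus.io/scrape: "true" # matched by the relabel rule's keep action
spec:
  containers:
  - name: app
    # Placeholder image URI; the application is assumed to serve /metrics.
    image: 123456789012.dkr.ecr.us-west-2.amazonaws.com/my-app:1.0.0
    ports:
    - name: metrics
      containerPort: 8080        # pod discovery creates one target per declared port
```

Since the job sets no path or port relabeling, each declared container port is scraped at the default /metrics path.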