AWS Kubernetes Orchestration with EKS
Introduction to EKS Orchestration
Amazon Elastic Kubernetes Service (EKS) is AWS's managed Kubernetes platform that orchestrates containerized applications through a complete control plane (API Server, Scheduler, Controller Manager) and worker nodes (EC2 or Fargate). The architecture manages workloads via Deployments for stateless apps and StatefulSets for stateful services, deployed as Pods running application Containers. Traffic flows through Services and ALB Ingress controllers, secured by IAM roles and VPC networking. Persistent storage options include EBS volumes and EFS filesystems, while observability is provided through CloudWatch metrics and Prometheus monitoring. The platform integrates with Secrets Manager for credentials and supports hybrid architectures with both EC2 and serverless Fargate compute options, offering a fully managed solution for microservices, batch processing, and machine learning workloads.
Use Cases
EKS supports a variety of cloud-native workloads:
- Microservices: Deploy decoupled services with Deployments and Ingress for scalable APIs.
- Machine Learning: Orchestrate ML pipelines with Kubeflow on EKS for training and inference.
- Stateful Applications: Run databases like PostgreSQL or MongoDB using StatefulSets with EBS.
- Batch Processing: Manage data processing jobs with Kubernetes Jobs and CronJobs (a CronJob sketch follows this list).
- Hybrid Deployments: Use EKS Anywhere for consistent Kubernetes across on-premises and AWS.
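For the batch-processing use case, a minimal CronJob sketch is shown below; the job name, schedule, image, and command are illustrative placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report                # hypothetical job name
  namespace: default
spec:
  schedule: "0 2 * * *"               # run daily at 02:00 UTC
  jobTemplate:
    spec:
      backoffLimit: 2                 # retry a failed pod up to twice
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: report
              image: public.ecr.aws/docker/library/python:3.11-slim   # placeholder image
              command: ["python", "-c", "print('generating report')"] # placeholder workload
              resources:
                requests:
                  cpu: "250m"
                  memory: "256Mi"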
EKS Architecture Diagram
The diagram illustrates an Amazon EKS cluster architecture with the following components:
Core Components
- Control Plane: Fully managed by AWS; includes the API Server, Scheduler, and Controller Manager
- Worker Nodes: Run in the customer VPC as either EC2 instances or Fargate pods
- Deployments: Manage stateless applications via ReplicaSets
- StatefulSets: Manage stateful applications with persistent storage
- Pods: Smallest deployable units, running one or more Containers
Networking
- Services: Internal load balancing (ClusterIP) and external exposure (NodePort/LoadBalancer)
- ALB Ingress: AWS Application Load Balancer for HTTP traffic routing
- VPC: Provides network isolation and security groups
AWS Integrations
- IAM: Fine-grained access control via IRSA (IAM Roles for Service Accounts)
- EBS/EFS: Block storage (EBS) and shared filesystem (EFS) for persistent volumes
- CloudWatch: Centralized monitoring and logging
- Secrets Manager: Secure credential storage and rotation
Key Features
- Multi-tenant architecture with namespace isolation
- Both EC2 and Fargate compute options (an eksctl sketch follows this list)
- Integration with AWS security services (IAM, Secrets Manager)
- End-to-end observability (CloudWatch, Prometheus)
- High availability through multiple availability zones
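As a sketch of mixing compute options, the eksctl ClusterConfig below defines a managed EC2 node group alongside a Fargate profile; the cluster name, region, instance type, sizes, and namespace selector are assumptions.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster            # hypothetical cluster name
  region: us-west-2
managedNodeGroups:
  - name: general-ec2           # EC2-backed managed node group
    instanceType: m5.large
    minSize: 2
    desiredCapacity: 3
    maxSize: 6
fargateProfiles:
  - name: serverless            # Fargate profile for selected namespaces
    selectors:
      - namespace: batch        # pods in this namespace run on Fargate
By default, eksctl spreads the node group across the region's availability zones, which supports the multi-AZ high-availability point above.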
Key EKS Components
EKS leverages Kubernetes components with AWS-specific integrations:
- Pods: Smallest deployable units hosting containers with shared storage and network.
- Deployments: Manage stateless pod replicas with rolling updates and auto-scaling.
- StatefulSets: Manage stateful pods with stable identities and ordered scaling for databases.
- Services: Provide stable endpoints (e.g., ClusterIP, LoadBalancer) for pods, integrated with AWS ELB.
- Ingress (ALB): Routes external traffic via AWS Application Load Balancer with path-based routing and TLS (an Ingress sketch follows this list).
- EKS Control Plane: Managed API Server, Scheduler, Controller Manager, and etcd for cluster orchestration.
- Worker Nodes: EC2 instances, typically grouped in Auto Scaling Groups and managed via node groups, that run pods.
- IAM Integration: IAM Roles for Service Accounts (IRSA) secure pod access to AWS services.
- VPC Networking: AWS VPC CNI plugin assigns pod IPs from VPC subnets for seamless networking.
- Storage: EBS and EFS provide persistent volumes via CSI drivers and StorageClasses.
- Observability: CloudWatch, Prometheus, and X-Ray monitor cluster and application metrics/logs.
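As a sketch of ALB-based ingress, the manifest below exposes an assumed my-app Service through an internet-facing Application Load Balancer; it presumes the AWS Load Balancer Controller is installed in the cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  namespace: default
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing   # public ALB
    alb.ingress.kubernetes.io/target-type: ip           # route directly to pod IPs (VPC CNI)
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app          # assumed existing Service
                port:
                  number: 8080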
Benefits of EKS Orchestration
EKS delivers significant advantages for cloud-native applications:
- Managed Control Plane: AWS ensures high availability, upgrades, and patching of Kubernetes masters.
- Dynamic Scaling: Horizontal Pod Autoscaler and Cluster Autoscaler adjust pods and nodes based on demand.
- Security: IRSA, RBAC, Network Policies, and VPC isolation secure workloads (an RBAC sketch follows this list).
- AWS Integration: Native support for S3, RDS, SQS, and other services via SDKs and IRSA.
- Self-Healing: Automatically restarts, reschedules, or replaces failed pods/nodes.
- Observability: Comprehensive monitoring with CloudWatch, Prometheus, and X-Ray.
- Portability: Kubernetes standards and EKS Anywhere enable multi-cloud and hybrid deployments.
- Developer Productivity: Managed infrastructure reduces operational burden for developers.
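To illustrate the RBAC part of this security model, the namespaced Role and RoleBinding below grant read-only pod access to a service account; the names (pod-reader, my-app-sa) are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
  - apiGroups: [""]                  # core API group
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-pod-reader
  namespace: default
subjects:
  - kind: ServiceAccount
    name: my-app-sa                  # hypothetical service account
    namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io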
Implementation Considerations
Deploying EKS effectively requires addressing key considerations:
- Cluster Sizing: Configure node groups and Cluster Autoscaler for workload variability.
- Security Hardening: Use IRSA, RBAC, Network Policies, Pod Security Standards, and KMS encryption (a Pod Security Standards sketch follows this list).
- Networking Design: Plan VPC subnets, CNI settings, and security groups for pod communication.
- Resource Optimization: Set pod requests/limits, use spot instances, and apply Savings Plans.
- Observability Setup: Deploy Prometheus/Grafana, CloudWatch, and X-Ray for metrics, logs, and traces.
- CI/CD Integration: Use ArgoCD, GitHub Actions, or CodePipeline for automated deployments.
- Storage Management: Configure StorageClasses for EBS/EFS and manage persistent volume claims.
- Cost Monitoring: Track usage with AWS Cost Explorer and optimize node types/sizes.
- Resilience Testing: Use chaos engineering (e.g., Chaos Mesh) to validate failover mechanisms.
- Compliance Requirements: Enable CloudTrail, encrypt data, and align with standards like SOC 2 or HIPAA.
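As a sketch of applying Pod Security Standards, the namespace below enforces the restricted profile via the standard pod-security labels; the namespace name is a placeholder.
apiVersion: v1
kind: Namespace
metadata:
  name: payments                                      # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted    # reject non-compliant pods
    pod-security.kubernetes.io/warn: restricted       # warn clients on violations
    pod-security.kubernetes.io/audit: restricted      # record violations in audit logs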
Troubleshooting Common EKS Issues
Common EKS issues and their resolutions:
- Pod Pending: Check node capacity, resource limits, or CNI IP exhaustion; scale nodes or adjust VPC CNI settings.
- Access Denied: Verify IRSA or RBAC configurations; ensure IAM roles have correct trust policies.
- Network Errors: Inspect security groups, Network Policies, or VPC routing; validate CNI plugin.
- High Latency: Analyze X-Ray traces or Prometheus metrics; optimize pod resources or scale nodes.
- Storage Failures: Confirm EBS CSI driver installation and StorageClass definitions; check volume attachments.
Example Configuration: StatefulSet for MongoDB
Below are a StorageClass, a headless Service, and a StatefulSet for a MongoDB deployment on EKS. The StorageClass assumes the EBS CSI driver is installed, and the headless Service gives each replica a stable network identity.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
  namespace: default
spec:
  selector:
    app: mongodb
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
  clusterIP: None
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
  namespace: default
spec:
  serviceName: mongodb-service
  replicas: 3
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:5.0
          ports:
            - containerPort: 27017
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "1Gi"
          volumeMounts:
            - name: mongo-data
              mountPath: /data/db
          livenessProbe:
            exec:
              command: ["mongo", "--eval", "db.adminCommand('ping')"]
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            exec:
              command: ["mongo", "--eval", "db.adminCommand('ping')"]
            initialDelaySeconds: 5
            periodSeconds: 5
      serviceAccountName: mongodb-sa  # assumes this ServiceAccount already exists (e.g., created for IRSA)
  volumeClaimTemplates:
    - metadata:
        name: mongo-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ebs-sc
        resources:
          requests:
            storage: 10Gi
Example Configuration: Horizontal Pod Autoscaler
Below is a Kubernetes HorizontalPodAutoscaler configuration that scales a Deployment based on CPU utilization; it requires the Kubernetes Metrics Server to be installed in the cluster.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
Example Configuration: IAM Role for Service Account (IRSA)
Below is a Terraform configuration to create an IAM role for an EKS service account (IRSA). It assumes the EKS cluster already exists with an associated IAM OIDC provider, and that the Terraform kubernetes provider is configured against that cluster.
provider "aws" {
region = "us-west-2"
}
data "aws_eks_cluster" "eks_cluster" {
name = "my-eks-cluster"
}
data "aws_caller_identity" "current" {}
resource "aws_iam_role" "my_app_sa_role" {
name = "my-app-sa-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Principal = {
Federated = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${replace(data.aws_eks_cluster.eks_cluster.identity[0].oidc[0].issuer, "https://", "")}"
}
Action = "sts:AssumeRoleWithWebIdentity"
Condition = {
StringEquals = {
"${replace(data.aws_eks_cluster.eks_cluster.identity[0].oidc[0].issuer, "https://", "")}:sub" = "system:serviceaccount:default:my-app-sa"
}
}
}
]
})
}
resource "aws_iam_role_policy" "my_app_sa_policy" {
name = "my-app-sa-policy"
role = aws_iam_role.my_app_sa_role.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"s3:GetObject",
"s3:PutObject"
]
Resource = "arn:aws:s3:::my-app-bucket/*"
}
]
})
}
resource "kubernetes_service_account" "my_app_sa" {
provider = kubernetes
metadata {
name = "my-app-sa"
namespace = "default"
annotations = {
"eks.amazonaws.com/role-arn" = aws_iam_role.my_app_sa_role.arn
}
}
}
Example Configuration: Cluster Autoscaler
Below is a Kubernetes configuration to deploy the Cluster Autoscaler on EKS. The ClusterRole and binding it requires are omitted for brevity, the IAM role ARN is a placeholder, and the image tag should match your cluster's Kubernetes minor version.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/ClusterAutoscalerRole
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - name: cluster-autoscaler
          # Use a tag that matches your cluster's Kubernetes minor version
          image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.21.0
          resources:
            requests:
              cpu: "100m"
              memory: "300Mi"
            limits:
              cpu: "500m"
              memory: "600Mi"
          command:
            - ./cluster-autoscaler
            - --v=4
            - --cloud-provider=aws
            - --aws-region=us-west-2
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-eks-cluster
          env:
            - name: AWS_REGION
              value: us-west-2
Example Configuration: CI/CD with GitHub Actions
Below is a GitHub Actions workflow to deploy a Kubernetes application to EKS. It authenticates with static AWS credentials stored as repository secrets.
name: Deploy to EKS
on:
  push:
    branches:
      - main
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-west-2
      - name: Update kubeconfig
        run: aws eks update-kubeconfig --name my-eks-cluster --region us-west-2
      - name: Deploy to EKS
        run: kubectl apply -f k8s/deployment.yaml
Example Configuration: Network Policy
Below is a Kubernetes NetworkPolicy to restrict pod traffic. Enforcing it on EKS requires a network policy engine, such as the VPC CNI's network policy support or Calico.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: my-app-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: my-app
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: mongodb
      ports:
        - protocol: TCP
          port: 27017
Example Configuration: CloudWatch and Prometheus Monitoring
Below is a Helm values file (e.g., for the kube-prometheus-stack chart) to deploy Prometheus and Grafana for EKS monitoring. CloudWatch metrics are pulled in by scraping a separately deployed CloudWatch exporter, since CloudWatch does not expose a Prometheus endpoint directly.
prometheus:
  serviceMonitor:
    enabled: true
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
    additionalScrapeConfigs:
      # Scrape pods that opt in via the prometheus.io/scrape annotation
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
      # CloudWatch metrics via a separately deployed CloudWatch exporter
      # (service name, namespace, and port below are assumptions)
      - job_name: cloudwatch
        static_configs:
          - targets: ['prometheus-cloudwatch-exporter.monitoring.svc:9106']
grafana:
  enabled: true
  adminPassword: admin        # placeholder; use a secret in production
  service:
    type: LoadBalancer
