Production-Level EKS Cluster Setup Guide
Introduction
This guide is for developers who understand the basics of Kubernetes but have never built an AWS EKS cluster in practice.
Amazon EKS (Elastic Kubernetes Service) is a managed Kubernetes service provided by AWS. Running a Kubernetes cluster on your own requires installing and managing the control plane (API Server, etcd, Scheduler, etc.) yourself, but EKS handles that part for you.
The key advantages of EKS over self-managed Kubernetes are:
- Managed Control Plane: AWS guarantees etcd backups and API Server high availability
- AWS Service Integration: Native integration with IAM, ALB, Route53, ACM, and more
- Easy Upgrades: Cluster version upgrades can be performed safely and incrementally
- Security: Leverages AWS VPC networking and Security Groups as-is
This guide starts with quick cluster creation using eksctl, then moves on to IaC (Infrastructure as Code) with Terraform, and covers the essential networking setup (ALB, DNS, HTTPS) needed in production.
This post is Part 1 of the series. Part 2 will cover building a GitOps-based deployment pipeline using ArgoCD.
Prerequisites
Before building an EKS cluster, install the necessary tools in your local environment.
1. Install and Configure AWS CLI
# macOS
brew install awscli
# Linux
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
# Verify installation
aws --version
Configure your AWS credentials:
aws configure
# AWS Access Key ID: [Enter your access key]
# AWS Secret Access Key: [Enter your secret key]
# Default region name: ap-northeast-2
# Default output format: json
2. Install kubectl
# macOS
brew install kubectl
# Linux
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/
# Verify installation
kubectl version --client
3. Install eksctl
# macOS
brew tap weaveworks/tap
brew install weaveworks/tap/eksctl
# Linux
ARCH=amd64
PLATFORM=$(uname -s)_$ARCH
curl -sLO "https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_$PLATFORM.tar.gz"
tar -xzf eksctl_$PLATFORM.tar.gz -C /tmp && sudo mv /tmp/eksctl /usr/local/bin
# Verify installation
eksctl version
4. Install Terraform
# macOS
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
# Linux
wget -O - https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
# Verify installation
terraform version
5. IAM Permission Setup
The IAM user/role creating the EKS cluster needs at least the following permissions:
eks:*- EKS cluster managementec2:*- VPC, subnet, security group, and EC2 instance managementiam:CreateRole,iam:AttachRolePolicy,iam:PassRole- Role creation and attachmentcloudformation:*- Used internally by eksctl via CloudFormationautoscaling:*- Node group auto scaling
Caution: In production environments, avoid using
AdministratorAccess. The security best practice is to grant only the minimum required permissions. Use broad permissions only during initial setup, then gradually reduce them over time.
Getting Started Quickly with eksctl
eksctl is a CLI tool for quickly creating EKS clusters. It internally creates AWS CloudFormation stacks to provision the necessary resources.
Creating a Cluster
The simplest approach is to create a cluster with a single command:
eksctl create cluster \
--name my-eks-cluster \
--region ap-northeast-2 \
--version 1.31 \
--nodegroup-name my-nodes \
--node-type t3.medium \
--nodes 2 \
--nodes-min 1 \
--nodes-max 4 \
--managed
Here is what each option means:
| Option | Description |
|---|---|
--name | Cluster name |
--region | AWS region |
--version | Kubernetes version |
--nodegroup-name | Node group name |
--node-type | EC2 instance type |
--nodes | Initial number of nodes |
--nodes-min/max | Auto scaling minimum/maximum node count |
--managed | Use AWS managed node groups |
Cluster creation typically takes about 15-20 minutes.
For more granular configuration, you can use a YAML config file:
# cluster-config.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
name: my-eks-cluster
region: ap-northeast-2
version: "1.31"
managedNodeGroups:
- name: app-nodes
instanceType: t3.medium
desiredCapacity: 2
minSize: 1
maxSize: 4
volumeSize: 30
labels:
role: application
tags:
Environment: production
iam:
withAddonPolicies:
albIngress: true
cloudWatch: true
eksctl create cluster -f cluster-config.yaml
Verifying kubeconfig
eksctl automatically configures kubeconfig after cluster creation. You can immediately access the cluster with kubectl:
# Check current context
kubectl config current-context
# List nodes
kubectl get nodes
# Example output:
# NAME STATUS ROLES AGE VERSION
# ip-192-168-25-47.ap-northeast-2.compute.internal Ready <none> 5m v1.31.2-eks-abcdef
# ip-192-168-68-112.ap-northeast-2.compute.internal Ready <none> 5m v1.31.2-eks-abcdef
If you need to manually refresh the kubeconfig:
aws eks update-kubeconfig --name my-eks-cluster --region ap-northeast-2
Deploying a Sample App to Verify
Deploy nginx to verify the cluster is working properly:
# nginx-test.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-test
spec:
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.27
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: nginx-test
spec:
type: LoadBalancer
selector:
app: nginx
ports:
- port: 80
targetPort: 80
kubectl apply -f nginx-test.yaml
# Check deployment status
kubectl get pods
kubectl get svc nginx-test
# Once an EXTERNAL-IP is assigned, verify access in your browser
# (A CLB is created, which takes 1-2 minutes)
# Clean up after verification
kubectl delete -f nginx-test.yaml
Limitations of eksctl
eksctl is great for getting started quickly, but it has limitations in production environments:
- No State Management: Relies on CloudFormation, making it difficult to track infrastructure state as code
- No Code Review: Changes cannot be reviewed through PRs
- Poor Reproducibility: Reproducing the same environment across multiple accounts/regions is cumbersome
- Limited Fine-Grained Control: Difficult to precisely control network settings like VPCs and subnets
For these reasons, Terraform is the standard choice for production environments.
Transitioning to IaC with Terraform
Why Terraform
Terraform is a tool for managing infrastructure as code. Using Terraform for EKS cluster setup provides:
- State Management: Infrastructure state is precisely tracked via
terraform.tfstate - Code Review: Infrastructure changes can be reviewed by team members through PRs
- Reproducibility: The same code consistently builds dev/staging/prod environments
- Plan Feature: Preview what changes will occur before actually applying them
Project Structure
eks-terraform/
├── main.tf # Provider, data sources
├── vpc.tf # VPC module
├── eks.tf # EKS module
├── variables.tf # Variable definitions
├── outputs.tf # Output definitions
└── terraform.tfvars # Variable values (should be in .gitignore)
Provider and Variable Configuration
# main.tf
terraform {
required_version = ">= 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
# In practice, use an S3 backend to manage state files remotely
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "eks/terraform.tfstate"
region = "ap-northeast-2"
}
}
provider "aws" {
region = var.region
}
# variables.tf
variable "region" {
description = "AWS region"
type = string
default = "ap-northeast-2"
}
variable "cluster_name" {
description = "EKS cluster name"
type = string
default = "my-eks-cluster"
}
variable "cluster_version" {
description = "Kubernetes version"
type = string
default = "1.31"
}
variable "environment" {
description = "Environment name"
type = string
default = "production"
}
VPC Module Configuration
An EKS cluster runs inside a VPC. The standard practice is to separate public subnets (for load balancers) and private subnets (for worker nodes).
# vpc.tf
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = "${var.cluster_name}-vpc"
cidr = "10.0.0.0/16"
azs = ["${var.region}a", "${var.region}b", "${var.region}c"]
public_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
private_subnets = ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"]
# Enable outbound connectivity from private subnets via NAT Gateway
enable_nat_gateway = true
single_nat_gateway = true # Cost savings (for production, set to false for per-AZ NAT)
enable_dns_hostnames = true
enable_dns_support = true
# Tags required for EKS to identify subnets (mandatory)
public_subnet_tags = {
"kubernetes.io/role/elb" = 1
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
}
private_subnet_tags = {
"kubernetes.io/role/internal-elb" = 1
"kubernetes.io/cluster/${var.cluster_name}" = "shared"
}
tags = {
Environment = var.environment
Terraform = "true"
}
}
Key Point: Without subnet tags, the AWS Load Balancer Controller cannot discover the subnets. The
kubernetes.io/role/elb(public) andkubernetes.io/role/internal-elb(private) tags must be configured.
EKS Module Configuration
# eks.tf
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 20.0"
cluster_name = var.cluster_name
cluster_version = var.cluster_version
# Cluster endpoint access configuration
cluster_endpoint_public_access = true
cluster_endpoint_private_access = true
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
# Cluster add-ons
cluster_addons = {
coredns = {
most_recent = true
}
kube-proxy = {
most_recent = true
}
vpc-cni = {
most_recent = true
}
aws-ebs-csi-driver = {
most_recent = true
service_account_role_arn = module.ebs_csi_irsa.iam_role_arn
}
}
# Managed node group definitions
eks_managed_node_groups = {
# Node group for general workloads
application = {
name = "app-nodes"
instance_types = ["t3.medium"]
capacity_type = "ON_DEMAND"
min_size = 2
max_size = 6
desired_size = 2
disk_size = 30
labels = {
role = "application"
}
tags = {
Environment = var.environment
}
}
}
# Grant admin permissions to the cluster creator
enable_cluster_creator_admin_permissions = true
tags = {
Environment = var.environment
Terraform = "true"
}
}
# IRSA for EBS CSI Driver
module "ebs_csi_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "~> 5.0"
role_name = "${var.cluster_name}-ebs-csi-driver"
attach_ebs_csi_policy = true
oidc_providers = {
main = {
provider_arn = module.eks.oidc_provider_arn
namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"]
}
}
}
# outputs.tf
output "cluster_endpoint" {
description = "EKS cluster endpoint"
value = module.eks.cluster_endpoint
}
output "cluster_name" {
description = "EKS cluster name"
value = module.eks.cluster_name
}
output "cluster_certificate_authority_data" {
description = "Base64 encoded certificate data"
value = module.eks.cluster_certificate_authority_data
}
output "vpc_id" {
description = "VPC ID"
value = module.vpc.vpc_id
}
Managed Node Groups vs Self-Managed Node Groups
| Category | Managed Node Groups | Self-Managed Node Groups |
|---|---|---|
| Node Updates | Rolling updates via AWS Console/API | Manage Launch Templates manually |
| AMI Management | Automatically uses AWS-optimized AMIs | Select and manage AMIs yourself |
| Scaling | Easy management via EKS API | Configure ASG directly |
| Customization | Limited (extensible via Launch Templates) | Full flexibility |
| Recommended For | Most workloads | GPU, custom AMIs, and advanced requirements |
Unless you have a specific reason, using managed node groups is recommended.
Terraform Workflow
# 1. Initialize - Download providers and modules
terraform init
# 2. Plan - Preview what resources will be created/changed/destroyed
terraform plan
# Example output:
# Plan: 45 to add, 0 to change, 0 to destroy.
# 3. Apply - Actually create resources (takes about 15-20 minutes)
terraform apply
# 4. Configure kubeconfig
aws eks update-kubeconfig --name my-eks-cluster --region ap-northeast-2
# 5. Verify
kubectl get nodes
eksctl vs Terraform Comparison
| Category | eksctl | Terraform |
|---|---|---|
| Learning Curve | Low | Medium |
| Speed | Quick with a single command | Write code then apply |
| State Management | CloudFormation | terraform.tfstate |
| Code Review | Difficult | PR-based review possible |
| Reproducibility | Low | High |
| Fine-Grained Control | Limited | Full control over VPC, IAM, etc. |
| Team Collaboration | Inconvenient | Remote state + locking for collaboration |
| Best Suited For | PoC, learning, quick testing | Production, team operations |
Production Networking Setup
Once the cluster is created, you need networking configuration to handle external traffic. This section covers ALB (Application Load Balancer), DNS, and HTTPS setup.
1. Installing AWS Load Balancer Controller
The AWS Load Balancer Controller detects Kubernetes Ingress resources and automatically creates AWS ALBs.
IRSA Configuration
The controller needs IAM permissions to manage AWS resources (ALBs, Target Groups, etc.). IRSA (IAM Roles for Service Accounts) is used to grant permissions at the Pod level.
# alb-controller.tf (add to Terraform)
module "lb_controller_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "~> 5.0"
role_name = "${var.cluster_name}-aws-lb-controller"
attach_load_balancer_controller_policy = true
oidc_providers = {
main = {
provider_arn = module.eks.oidc_provider_arn
namespace_service_accounts = ["kube-system:aws-load-balancer-controller"]
}
}
}
output "lb_controller_role_arn" {
value = module.lb_controller_irsa.iam_role_arn
}
terraform apply
Installing the Controller with Helm
# Add Helm repository
helm repo add eks https://aws.github.io/eks-charts
helm repo update
# Install AWS Load Balancer Controller
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=my-eks-cluster \
--set serviceAccount.create=true \
--set serviceAccount.name=aws-load-balancer-controller \
--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=$(terraform output -raw lb_controller_role_arn) \
--set region=ap-northeast-2 \
--set vpcId=$(terraform output -raw vpc_id)
# Verify installation
kubectl get deployment -n kube-system aws-load-balancer-controller
2. Creating an ALB with Ingress Resources
Once the AWS Load Balancer Controller is installed, you can create ALBs by defining Ingress resources.
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: app-ingress
namespace: default
annotations:
# Specify to create an ALB
alb.ingress.kubernetes.io/scheme: internet-facing
# Create ALB in public subnets
alb.ingress.kubernetes.io/target-type: ip
# Health check path
alb.ingress.kubernetes.io/healthcheck-path: /health
# Listener port
alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
spec:
ingressClassName: alb
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-app-service
port:
number: 80
Here is a summary of the key annotations:
| Annotation | Description |
|---|---|
alb.ingress.kubernetes.io/scheme | internet-facing (public) or internal (private) |
alb.ingress.kubernetes.io/target-type | ip (direct Pod IP) or instance (via NodePort) |
alb.ingress.kubernetes.io/healthcheck-path | Health check endpoint path |
alb.ingress.kubernetes.io/listen-ports | Listener ports and protocols |
alb.ingress.kubernetes.io/certificate-arn | ACM certificate ARN (for HTTPS) |
alb.ingress.kubernetes.io/ssl-redirect | HTTP to HTTPS redirect port |
kubectl apply -f ingress.yaml
# Verify ALB creation (takes 1-2 minutes)
kubectl get ingress app-ingress
3. External DNS Configuration
External DNS automatically registers the hostnames defined in Ingress resources into Route53. This eliminates the need to manage DNS records manually.
IRSA Configuration
# external-dns.tf
module "external_dns_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "~> 5.0"
role_name = "${var.cluster_name}-external-dns"
attach_external_dns_policy = true
external_dns_hosted_zone_arns = ["arn:aws:route53:::hostedzone/ZXXXXXXXXXX"] # Your actual hosted zone ARN
oidc_providers = {
main = {
provider_arn = module.eks.oidc_provider_arn
namespace_service_accounts = ["kube-system:external-dns"]
}
}
}
output "external_dns_role_arn" {
value = module.external_dns_irsa.iam_role_arn
}
Installing External DNS with Helm
helm repo add external-dns https://kubernetes-sigs.github.io/external-dns/
helm repo update
helm install external-dns external-dns/external-dns \
-n kube-system \
--set provider.name=aws \
--set policy=sync \
--set registry=txt \
--set txtOwnerId=my-eks-cluster \
--set domainFilters[0]=example.com \
--set serviceAccount.create=true \
--set serviceAccount.name=external-dns \
--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=$(terraform output -raw external_dns_role_arn)
# Verify
kubectl get pods -n kube-system -l app.kubernetes.io/name=external-dns
Now, when you set host: app.example.com in an Ingress resource,
External DNS will automatically create an A record (Alias pointing to the ALB’s DNS) in Route53.
4. Applying HTTPS with ACM Certificates
HTTPS is mandatory in production environments. Use AWS ACM (Certificate Manager) to issue free SSL certificates and apply them to the ALB.
Issuing a Certificate
# Request ACM certificate (DNS validation method recommended)
aws acm request-certificate \
--domain-name "*.example.com" \
--validation-method DNS \
--region ap-northeast-2
# Note the CertificateArn from the output
# arn:aws:acm:ap-northeast-2:123456789012:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Add the CNAME record provided by the ACM console to Route53 for DNS validation.
Once validation is complete, the certificate status changes to Issued.
Tip: You can also automate ACM certificate issuance and DNS validation with Terraform. Combine the
aws_acm_certificate+aws_route53_record+aws_acm_certificate_validationresources.
Configuring HTTPS on Ingress
# ingress-https.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: app-ingress
namespace: default
annotations:
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/target-type: ip
alb.ingress.kubernetes.io/healthcheck-path: /health
# HTTPS configuration
alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:ap-northeast-2:123456789012:certificate/xxxxxxxx
# Automatic HTTP to HTTPS redirect
alb.ingress.kubernetes.io/ssl-redirect: "443"
spec:
ingressClassName: alb
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-app-service
port:
number: 80
With this configuration:
- HTTP (port 80) requests are automatically redirected to HTTPS (port 443)
- The ACM certificate is used for HTTPS communication
- TLS termination occurs at the ALB, so Pods only need to handle HTTP
5. Do You Need Both an API Gateway and Ingress?
Ingress and API Gateway have overlapping roles, but they are not exactly the same. Depending on the situation, you might use just one or both.
Role Comparison
| Feature | Ingress | API Gateway |
|---|---|---|
| External traffic entry point | Yes | Yes |
| Domain/path routing | Yes | Yes |
| TLS/HTTPS termination | Yes | Yes |
| Authentication/Authorization | Limited | Yes |
| Rate Limiting | Limited | Yes |
| Request/Response transformation | No | Yes |
| Circuit Breaker | No | Yes |
| Fine-grained business routing | No | Yes |
Case-by-Case Decision
When Ingress alone is sufficient:
- Only simple path routing is needed
- Authentication is handled directly by each service
- Small-scale service architecture
When both are used (common in practice):
External Traffic
↓
Ingress (ALB) ← TLS termination, cluster entry
↓
API Gateway (Spring Cloud Gateway, etc.) ← Auth, Rate limit, routing
↓
Individual Microservices
Ingress serves as the infrastructure layer, while the API Gateway serves as the application layer, separating their responsibilities.
| Scenario | Recommendation |
|---|---|
| Simple services | Ingress (ALB) only |
| Complex auth/shared processing | Ingress + API Gateway |
| AWS ALB features are sufficient | ALB Ingress Controller alone can cover it |
AWS ALB natively provides features such as authentication (Cognito integration), fixed responses, and redirects. For simple cases, ALB alone may be sufficient without a separate API Gateway.
Operational Tips
Here are practical operational tips to keep in mind after building your cluster.
1. Cluster Version Upgrades
EKS cluster upgrades must follow the order: control plane first, then node groups.
# Step 1: Upgrade the control plane (when using Terraform)
# Change cluster_version in eks.tf, then
terraform apply
# Step 2: Upgrade node groups
# The AMI for eks_managed_node_groups is automatically updated to the latest version
# Nodes are replaced sequentially via rolling updates
Key considerations during upgrades:
- Only upgrade one minor version at a time (e.g., 1.30 to 1.31)
- Check for deprecated API usage before upgrading
- Set up PodDisruptionBudgets (PDBs) to prevent service interruptions
2. Node Group Management
To change instance types or settings of a node group, the safe approach is to create a new node group and delete the existing one:
# Add new node group -> Delete old node group (Blue/Green approach)
eks_managed_node_groups = {
# application_v1 = { ... } # Target for deletion (do not comment out first)
application_v2 = {
name = "app-nodes-v2"
instance_types = ["t3.large"] # Changed instance type
min_size = 2
max_size = 6
desired_size = 2
}
}
Order: Create new node group -> Verify workload migration -> Delete old node group
3. Cost Optimization
EKS costs are broadly divided into cluster costs ($0.10/hour) and EC2 instance costs. Here is how to reduce instance costs.
Using Spot Instances
For interruptible workloads, Spot instances can reduce costs by up to 90%:
eks_managed_node_groups = {
# On-Demand: For critical workloads
critical = {
instance_types = ["t3.medium"]
capacity_type = "ON_DEMAND"
min_size = 2
max_size = 4
desired_size = 2
labels = {
workload-type = "critical"
}
}
# Spot: For batch jobs, development environments, etc.
spot = {
instance_types = ["t3.medium", "t3.large", "t3a.medium", "t3a.large"]
capacity_type = "SPOT"
min_size = 0
max_size = 10
desired_size = 2
labels = {
workload-type = "flexible"
}
taints = [
{
key = "spot"
value = "true"
effect = "NO_SCHEDULE"
}
]
}
}
Spot Instance Tip: Specifying multiple types in
instance_typesincreases the probability of securing Spot capacity. Set taints so that only Pods with tolerations are scheduled on Spot nodes.
Brief Introduction to Karpenter
Karpenter is an open-source node autoscaler developed by AWS that provides faster and more flexible scaling than Cluster Autoscaler.
- Provisions the required instances immediately without node groups
- Automatically selects the optimal instance type based on Pod resource requests
- Automatically consolidates nodes for cost optimization
Consider adopting Karpenter once your cluster has reached a certain level of maturity.
4. Basic Monitoring Setup
To monitor cluster health, you should at minimum enable CloudWatch Container Insights.
# Install CloudWatch agent (Amazon CloudWatch Observability add-on)
aws eks create-addon \
--cluster-name my-eks-cluster \
--addon-name amazon-cloudwatch-observability \
--region ap-northeast-2
Once Container Insights is enabled, the following metrics are available in CloudWatch:
- CPU/memory utilization (cluster, node, and Pod level)
- Network traffic
- Container restart count
- Pod status
For more advanced monitoring, consider building a Prometheus + Grafana stack. This topic will be covered in a separate post.
Conclusion
Here is a summary of what this guide covered:
- eksctl: A CLI tool for quickly creating EKS clusters. Best suited for PoC and learning
- Terraform: Manage everything from VPC to EKS as code in production. Great for team collaboration and reproducibility
- Networking: Set up ALB, automatic DNS, and HTTPS using AWS Load Balancer Controller + External DNS + ACM
- Operations: Version upgrade order, node group management, cost optimization, and monitoring are the key areas
Building an EKS cluster is just the beginning. To deploy actual applications, you need a CI/CD pipeline.
Part 2 covers building a GitOps-based deployment pipeline using ArgoCD. It will include practical, immediately applicable content such as Helm chart management, environment-specific deployment strategies, and auto-sync configuration.