The Problem: Manual EKS AMI Updates Are a Pain

Every EKS team knows the drill: a new AMI drops, you manually find the ID, read release notes, assess CVEs, draft a PR, wait for approvals, then carefully roll out nodes. It's slow, error-prone, and easy to deprioritize. The result? Outdated nodes, security gaps, and 2 AM surprises.

The Solution: A Fully Automated Pipeline

Suryansh639 built a pipeline that runs twice daily (9 AM and 9 PM UTC) via EventBridge. It detects new EKS-optimized AMIs by querying AWS SSM Parameter Store (/aws/service/eks/optimized-ami/1.34/amazon-linux-2023/recommended/image_id), compares the result against the AMI ID currently committed in your Git repo, and, if the two differ, triggers a Step Functions workflow.
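The twice-daily schedule can be written as a single EventBridge cron expression. The sketch below is only an illustration: the rule name is hypothetical, and the article does not show how the repository's CloudFormation template declares the schedule.

```python
def twice_daily_cron(hours_utc=(9, 21), minute=0):
    """Build an EventBridge cron expression that fires daily at the given UTC hours."""
    hours = ",".join(str(h) for h in sorted(set(hours_utc)))
    # EventBridge cron fields: minutes hours day-of-month month day-of-week year.
    # Day-of-month and day-of-week cannot both be "*", so day-of-week is "?".
    return f"cron({minute} {hours} * * ? *)"


def create_schedule(rule_name="eks-ami-detector-schedule"):
    """Create the rule. The rule name is a hypothetical placeholder."""
    import boto3  # imported lazily so the helper above works without AWS access

    events = boto3.client("events")
    return events.put_rule(
        Name=rule_name,
        ScheduleExpression=twice_daily_cron(),
        State="ENABLED",
    )


SCHEDULE = twice_daily_cron()
```

EventBridge evaluates cron expressions in UTC, which matches the 9 AM and 9 PM UTC runs described above.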

Phase 1: Detection

A Lambda fetches the latest AMI ID from SSM and checks your GitHub repository (your source of truth). No new AMI? The Lambda exits silently. Difference detected? The Step Functions state machine kicks off.
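A minimal sketch of the detection logic, not the repository's actual Lambda. The parameter path template is copied from the text above and is worth verifying against the AWS documentation for your architecture and AMI variant. The regex-based lookup assumes the tracked YAML file contains a literal AMI ID, and all function names are hypothetical.

```python
import re

# Path template copied from the article; verify it for your architecture/variant.
SSM_TEMPLATE = (
    "/aws/service/eks/optimized-ami/{version}/amazon-linux-2023/recommended/image_id"
)
# AMI IDs are "ami-" followed by 8 (legacy) or 17 hexadecimal characters.
AMI_ID_PATTERN = re.compile(r"\bami-[0-9a-f]{8,17}\b")


def ssm_parameter_name(eks_version):
    """Return the SSM parameter name for a given EKS version string."""
    return SSM_TEMPLATE.format(version=eks_version)


def fetch_latest_ami(eks_version):
    """Read the recommended AMI ID from SSM Parameter Store."""
    import boto3  # imported lazily so the pure helpers run without AWS access

    ssm = boto3.client("ssm")
    response = ssm.get_parameter(Name=ssm_parameter_name(eks_version))
    return response["Parameter"]["Value"]


def current_ami_from_yaml(yaml_text):
    """Return the first AMI ID found in the tracked file, or None if there is none."""
    match = AMI_ID_PATTERN.search(yaml_text)
    return match.group(0) if match else None


def needs_update(latest_ami, yaml_text):
    """True when the AMI in Git differs from the one SSM recommends."""
    return current_ami_from_yaml(yaml_text) != latest_ami
```

When `needs_update` returns False the detector can simply return; otherwise it would start the state machine.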

Phase 2: AI Analysis + Pull Request

Step Functions orchestrates three Lambdas in sequence:

  1. bedrock-analyzer: Fetches actual release notes from awslabs/amazon-eks-ami and sends them to Amazon Bedrock (Claude 3.5 Haiku) with a prompt that asks for a JSON response containing risk_score (1-10), recommendation (APPROVE/REJECT), summary, and pr_description (full markdown).

  2. gitops-updater: Uses GitHub App credentials from Secrets Manager to create a branch, update the Karpenter EC2NodeClass YAML with the new AMI ID, and open a PR with the Bedrock analysis as the description.

  3. send-notification: Emails the team via SNS with the PR link and AI summary.
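Because the later steps depend on the model's JSON reply, it is worth validating that reply before opening a PR. The following is a sketch under assumptions: the field names come from the description of bedrock-analyzer above, but the validation rules and the function name are illustrative and not taken from the repository.

```python
import json

REQUIRED_FIELDS = ("risk_score", "recommendation", "summary", "pr_description")
ALLOWED_RECOMMENDATIONS = {"APPROVE", "REJECT"}


def parse_analysis(model_text):
    """Parse and validate the model's JSON reply; raise ValueError if it is malformed."""
    try:
        data = json.loads(model_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("model reply must be a JSON object")

    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")

    score = data["risk_score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 10:
        raise ValueError("risk_score must be an integer from 1 to 10")

    recommendation = data["recommendation"]
    if not isinstance(recommendation, str) or recommendation not in ALLOWED_RECOMMENDATIONS:
        raise ValueError("recommendation must be APPROVE or REJECT")

    for field in ("summary", "pr_description"):
        if not isinstance(data[field], str) or not data[field].strip():
            raise ValueError(f"{field} must be a non-empty string")
    return data
```

Failing loudly here lets the state machine stop before a PR is opened with an empty or malformed description.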

The human's only job: read the AI analysis, check the one-line YAML diff, and merge (to approve) or close (to reject).
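The one-line YAML diff can be produced by swapping the AMI ID in place. This is a hedged sketch rather than the gitops-updater's real code: it assumes the EC2NodeClass file references exactly one literal AMI ID (for example under amiSelectorTerms), and the helper names and sample manifest are hypothetical.

```python
import re

# AMI IDs are "ami-" followed by 8 (legacy) or 17 hexadecimal characters.
AMI_ID_PATTERN = re.compile(r"\bami-[0-9a-f]{8,17}\b")


def replace_ami_id(yaml_text, new_ami):
    """Return yaml_text with its single AMI ID replaced by new_ami."""
    if not AMI_ID_PATTERN.fullmatch(new_ami):
        raise ValueError(f"not a valid AMI ID: {new_ami!r}")
    found = AMI_ID_PATTERN.findall(yaml_text)
    if len(found) != 1:
        # Refuse to guess when the file has zero or several AMI IDs.
        raise ValueError(f"expected exactly one AMI ID, found {len(found)}")
    return AMI_ID_PATTERN.sub(new_ami, yaml_text, count=1)


def changed_line_count(old_text, new_text):
    """Count lines that differ, assuming both texts have the same number of lines."""
    pairs = zip(old_text.splitlines(), new_text.splitlines())
    return sum(1 for old, new in pairs if old != new)


# Hypothetical sample manifest, used only to exercise the helpers.
SAMPLE_NODECLASS = """\
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
    - id: ami-0123456789abcdef0
"""

UPDATED_NODECLASS = replace_ami_id(SAMPLE_NODECLASS, "ami-04b406d4e6eaca578")
```

Editing the text directly, rather than parsing and re-serializing the YAML, keeps comments and formatting untouched, so the reviewer sees only the changed line.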

Phase 3: GitOps Deployment

Once merged:

What the PR Looks Like

The PR description includes the AI's analysis, e.g.:

## EKS AMI Update — ami-04b406d4e6eaca578
**AI Risk Score: 2/10 — APPROVE**
### What changed
- Go updated to 1.25.9
- Kernel updated to 6.12.79-101.147.amzn2023
- No new CVEs introduced
### CVE Assessment
No critical or high-severity CVEs. Two previously known CVEs patched.

Your reviewer doesn't need to dig through release notes — the AI already did.

Deployment: Single CloudFormation Stack

The entire solution deploys from one CloudFormation template. It provisions:

Deploy with:

aws cloudformation create-stack \
  --stack-name eks-ami-update \
  --template-body file://cloudformation-template.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameters \
    ParameterKey=NotificationEmail,Value=your@email.com \
    ParameterKey=GitHubAppId,Value= \
    ParameterKey=GitHubAppInstallationId,Value= \
    ParameterKey=GitHubAppPrivateKey,Value="$(base64 -i app.pem | tr -d '\n')" \
    ParameterKey=GitHubRepoOwner,Value= \
    ParameterKey=GitHubRepoName,Value= \
    ParameterKey=GitHubFilePath,Value=karpenter-configs/clusters/your-cluster/nodeclass.yaml \
    ParameterKey=GitHubBranch,Value=main \
    ParameterKey=EKSVersion,Value=1.34

Stack creation takes about 2-3 minutes. Afterwards, confirm the SNS subscription email so that notifications can be delivered.
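The GitHubAppPrivateKey parameter carries the PEM file as a single line of base64. The sketch below shows the equivalent encoding in Python and how a Lambda might decode it again. It assumes standard base64 and UTF-8 PEM text; the article does not confirm how the repository's code handles the value, and the function names are hypothetical.

```python
import base64


def encode_pem(pem_bytes):
    # Equivalent of piping the file through base64 and stripping newlines:
    # b64encode never inserts line breaks, so the result is a single line.
    return base64.b64encode(pem_bytes).decode("ascii")


def decode_pem(encoded):
    # validate=True rejects characters outside the base64 alphabet instead of
    # silently discarding them, which surfaces copy-paste damage early.
    return base64.b64decode(encoded, validate=True).decode("utf-8")


# Placeholder text standing in for a real key file; not a usable key.
PLACEHOLDER_PEM = b"-----BEGIN RSA PRIVATE KEY-----\nplaceholder\n-----END RSA PRIVATE KEY-----\n"
ENCODED = encode_pem(PLACEHOLDER_PEM)
```

Encoding the key this way avoids passing multi-line text through a CloudFormation parameter on the command line.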

Prerequisites

Testing the Pipeline

Trigger the detector manually:

aws lambda invoke \
  --function-name eks-ami-detector \
  --payload '{}' \
  --cli-binary-format raw-in-base64-out \
  /tmp/response.json && cat /tmp/response.json

You should get an SNS email with the risk analysis and PR link within minutes.
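The same manual trigger can be scripted with boto3, which makes it easier to check for function errors. This is a sketch under assumptions: the function name matches the CLI command above, but the shape of the detector's response payload is not documented in the article, so the body is returned as parsed JSON without interpreting it.

```python
import json


def interpret_invocation(status_code, function_error, payload_bytes):
    """Summarize a synchronous Lambda invocation result."""
    body = json.loads(payload_bytes) if payload_bytes else None
    # A synchronous invoke returns HTTP 200 even when the function itself failed;
    # the failure is reported through the FunctionError field instead.
    ok = status_code == 200 and function_error is None
    return {"ok": ok, "body": body}


def invoke_detector(function_name="eks-ami-detector"):
    """Invoke the detector with an empty JSON payload and summarize the result."""
    import boto3  # imported lazily so interpret_invocation runs without AWS access

    client = boto3.client("lambda")
    response = client.invoke(FunctionName=function_name, Payload=b"{}")
    return interpret_invocation(
        response["StatusCode"],
        response.get("FunctionError"),
        response["Payload"].read(),
    )
```

Calling `invoke_detector()` with valid AWS credentials returns a dictionary whose `ok` flag is False whenever the function raised an error.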

Common Issues

Why This Architecture Works

What's Next

The author suggests adding:

Get the Code

Fork the repo: suryansh639/sample-eks-ami-gitops-pipeline. The CloudFormation template, Lambda code, and example Karpenter configs are all there.

The Right Split

The goal wasn't to remove humans; it was to remove the boring part. The AI reads the release notes and writes the PR description. A human decides. Automation executes. Your nodes get updated on time, every time, with a full audit trail and no 2 AM surprises.