From Ansible to GitHub Actions + Terraform

How I replaced manual Ansible deploys and CloudFormation with GitHub Actions CI/CD and Terraform, with the site live throughout.

May 3, 2026

I shipped two infrastructure migrations in a single push last week. GitHub Actions for deployments, Terraform for infrastructure. The site never went down.

Here's what changed, why, and what bit me along the way.

Before: the manual deploy problem

The stack before this work wasn't bad. It just had friction.

Infrastructure was defined in CloudFormation templates: S3, CloudFront, Route53, RUM, Cognito, IAM. Deployments were triggered manually via Ansible playbooks. There was no CI/CD. Merging to main did exactly nothing.

Every deploy looked like this:

aws login
ansible-playbook upload.yml
ansible-playbook deploy.yml

aws login handled the session and Ansible picked it up automatically. The credentials weren't the friction. Everything else was. The process was tightly coupled to my machine, my memory, my manual trigger. Forget to run the playbook after a merge? The site is stale. Want someone else to deploy? They'd need to set up the same tooling from scratch.

It worked, but it was the kind of thing that breaks the moment you stop babysitting it.

Migration 1: GitHub Actions for deployment

The fix is a single workflow file. Push to main, site deploys. Done.

on:
  push:
    branches:
      - main

permissions:
  id-token: write   # OIDC JWT
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6.0.2
      - uses: aws-actions/configure-aws-credentials@v6.1.0
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ca-central-1
      - uses: actions/setup-python@v6.2.0
        with:
          python-version: "3.12"
          cache: "pip"
      - run: pip install -r backend/requirements.txt
      - run: python3 backend/build_blog.py
      - run: aws s3 sync frontend/public/ s3://${{ env.S3_BUCKET }}/ --delete
      - run: aws cloudfront create-invalidation --distribution-id ${{ env.CLOUDFRONT_DISTRIBUTION_ID }} --paths "/*"

Checkout, authenticate, build, sync, invalidate. About 60 seconds end to end.

OIDC: no stored credentials

The authentication step is the part worth understanding. The workflow authenticates to AWS using OIDC: GitHub issues a short-lived JWT at runtime, and AWS STS exchanges it for temporary credentials. There are no long-lived IAM access keys stored anywhere. No secrets to rotate. No keys to leak.

Getting this working required three things:

- An OIDC identity provider registered in IAM (token.actions.githubusercontent.com)
- An IAM role with a trust policy scoped to the specific repo and branch pattern
- The role ARN stored as a GitHub Actions secret

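The trust policy condition is the piece that actually enforces the scope: the token's sub claim has to match repo:your-org/your-repo:ref:refs/heads/*. Only workflows running on branches of that repo can assume the role. Widen that pattern with a wildcard in the org or repo position, or leave the condition off, and you've opened the role up to repositories you don't control: every repo in the org at best, any repo on GitHub at worst. That's a problem.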
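Here's a minimal sketch of what those pieces can look like in Terraform. It is not the exact config from the repo: the resource names (github, github_trust, github_deploy), the role name, and the your-org/your-repo placeholder are all assumptions for illustration.

```hcl
# Sketch only. Resource names and the role name are hypothetical.

# OIDC identity provider for GitHub Actions.
# Assumption: a recent AWS provider, where thumbprint_list is optional.
# Older provider versions require it to be set explicitly.
resource "aws_iam_openid_connect_provider" "github" {
  url            = "https://token.actions.githubusercontent.com"
  client_id_list = ["sts.amazonaws.com"]
}

# Trust policy: only tokens issued for branches of one repo may assume the role.
data "aws_iam_policy_document" "github_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }

    # The audience must be AWS STS.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # The scope check: repo and ref pattern (placeholder org/repo).
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:your-org/your-repo:ref:refs/heads/*"]
    }
  }
}

resource "aws_iam_role" "github_deploy" {
  name               = "github-actions-deploy" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.github_trust.json
}
```

If only main ever deploys, the sub value could be tightened further to repo:your-org/your-repo:ref:refs/heads/main with a StringEquals test.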

Least-privilege IAM

The role the workflow assumes has exactly what the deploy steps need and nothing else:

- s3:PutObject, s3:DeleteObject, s3:ListBucket, s3:GetObject (scoped to the www bucket)
- cloudfront:CreateInvalidation (scoped to the www distribution)

Because the policy references Terraform resource attributes (aws_s3_bucket.www.arn, aws_cloudfront_distribution.www.arn), it stays in sync automatically if those resources ever change.
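As a rough sketch, an inline policy along those lines could look like the following. The aws_s3_bucket.www and aws_cloudfront_distribution.www references are the ones named above; the policy name, the statement IDs, and the aws_iam_role.github_deploy role are hypothetical and assumed to be defined elsewhere in the config.

```hcl
# Sketch only. The role reference and policy/statement names are hypothetical.
data "aws_iam_policy_document" "deploy" {
  # s3 sync needs to list the bucket itself...
  statement {
    sid       = "ListBucket"
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.www.arn]
  }

  # ...and read, write, and delete the objects inside it (--delete).
  statement {
    sid       = "ReadWriteObjects"
    actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
    resources = ["${aws_s3_bucket.www.arn}/*"]
  }

  # Cache invalidation, limited to the one distribution.
  statement {
    sid       = "InvalidateCache"
    actions   = ["cloudfront:CreateInvalidation"]
    resources = [aws_cloudfront_distribution.www.arn]
  }
}

resource "aws_iam_role_policy" "deploy" {
  name   = "deploy"                      # hypothetical name
  role   = aws_iam_role.github_deploy.id # assumed to exist elsewhere
  policy = data.aws_iam_policy_document.deploy.json
}
```

Note the split between the bucket ARN (for s3:ListBucket) and the bucket ARN plus /* (for the object actions). Putting both on the same resource is a common reason a sync fails with AccessDenied.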

Gotcha: Node.js version warnings

The standard action versions (actions/checkout@v4, etc.) target Node 20, which GitHub deprecated. I went through two attempts:

First I set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true as a workflow env var. Actions ran on Node 24, but the warning was still there.

The actual fix was upgrading to action versions that natively declare node24 as their runtime:
- actions/checkout@v6.0.2
- actions/setup-python@v6.2.0
- aws-actions/configure-aws-credentials@v6.1.0

Warning gone.

Migration 2: CloudFormation → Terraform

CloudFormation worked. I moved off it anyway. Two reasons.

Verbosity. A CloudFront distribution in CloudFormation YAML is hundreds of lines. The same resource in Terraform HCL is significantly shorter and actually readable. This matters when you're reviewing an infrastructure change. You want to understand what changed, not parse through YAML boilerplate.

Drift detection. CloudFormation has drift detection, but it's a separate, on-demand operation you have to remember to run, and it doesn't cover every resource type. Terraform refreshes state on every plan, so drift shows up as part of the normal workflow. Run terraform plan and you'll see exactly what AWS has versus what your config says.

The practical argument is also there: Terraform is more widely used in the industry. Worth having a live example of it.

What's under management

Everything was imported, nothing was recreated. All resources stayed live throughout.

Four files:

- frontend.tf: S3 buckets, public access blocks, bucket policies, website config, CloudFront OAC, both CloudFront distributions, Route53 A records
- rum.tf: Cognito identity pool, IAM role and policy, CloudWatch RUM app monitor
- iam.tf: GitHub Actions OIDC provider, IAM role, inline deploy policy
- main.tf: backend config, provider

Two things intentionally left out: the ACM certificate and the Route53 hosted zone. Both pre-exist, and neither is safe to let Terraform recreate. They're referenced via data sources.
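Referencing them via data sources might look roughly like this. The data source labels (main, site) and the us_east_1 provider alias are assumptions for illustration; the alias is there because CloudFront only accepts ACM certificates from us-east-1, while the rest of the stack lives in ca-central-1.

```hcl
# Sketch only. Labels and the provider alias are hypothetical.

# CloudFront requires its ACM certificate to live in us-east-1.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

# Look up the existing hosted zone; Terraform reads it but never manages it.
data "aws_route53_zone" "main" {
  name         = var.domain_name
  private_zone = false
}

# Look up the existing, already-issued certificate.
data "aws_acm_certificate" "site" {
  provider    = aws.us_east_1
  domain      = var.domain_name
  statuses    = ["ISSUED"]
  most_recent = true
}
```

Other resources can then use data.aws_route53_zone.main.zone_id and data.aws_acm_certificate.site.arn. A terraform destroy would leave both the zone and the certificate untouched.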

Remote state

Remote state lives in a dedicated S3 bucket in ca-central-1.

I used Terraform 1.10+ native S3 locking (use_lockfile = true). Worth calling out because older guides still tell you to create a DynamoDB table for locking. You no longer need that. See S3 native state locking in the Terraform docs.
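For reference, a backend block using native locking can be as small as the sketch below. The bucket name and state key are placeholders, not the real values.

```hcl
# Sketch only. Bucket name and key are placeholders.
terraform {
  required_version = ">= 1.10"

  backend "s3" {
    bucket       = "example-terraform-state"
    key          = "site/terraform.tfstate"
    region       = "ca-central-1"
    encrypt      = true
    use_lockfile = true # lock file stored in S3 next to the state; no DynamoDB table
  }
}
```

With use_lockfile enabled, Terraform writes a .tflock object alongside the state file for the duration of an operation, so whoever runs Terraform needs permission to create and delete that object as well as read and write the state.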

The import process

The import loop was straightforward but tedious:

- Write the resource block
- Run terraform import <resource_address> <aws_resource_id>
- Run terraform plan
- If it shows changes, align the config with actual AWS state
- Repeat until plan is clean

Order matters. You have to import dependencies before the resources that reference them.
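I used the terraform import CLI command, but the same loop can also be expressed with config-driven import blocks (Terraform 1.5+), which is a different technique from the one described above. A minimal sketch, with a placeholder bucket name:

```hcl
# Sketch only. Uses an import block instead of the `terraform import` CLI.
# The bucket name is a placeholder.
import {
  to = aws_s3_bucket.www
  id = "example-www-bucket" # for S3 buckets, the import ID is the bucket name
}

resource "aws_s3_bucket" "www" {
  bucket = "example-www-bucket"
}
```

With this approach the import shows up in terraform plan before anything touches state, and once the apply has run the import block can be deleted.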

The hard part was CloudFront. The distribution config has a lot of optional fields, and Terraform's defaults don't always match what AWS actually has set. I went through several rounds of plan, adjust, plan before it was clean. It wasn't broken, just annoying.

Cleanup after import

Once the plan was clean, I did a pass to make it readable:

- Extracted CloudFront managed policy UUIDs into named locals (readable names instead of raw GUIDs in the config)
- Replaced hardcoded ARNs with Terraform resource references
- Used var.domain_name in the RUM config instead of a hardcoded string
- Used the website_endpoint attribute for the apex bucket origin instead of a hardcoded regional endpoint
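To illustrate the first item: AWS-managed CloudFront policies have fixed IDs that are the same in every account, so they can sit in a locals block under readable names. The two cache policies below are only examples, and the local names are hypothetical; which managed policies a given distribution uses will vary.

```hcl
# Sketch only. Local names are hypothetical; the policies are examples.
locals {
  # AWS-managed cache policies (IDs are identical across accounts).
  cache_policy_caching_optimized = "658327ea-f89d-4fab-a63d-7e88639e58f6" # Managed-CachingOptimized
  cache_policy_caching_disabled  = "4135ea2d-6df8-44a3-9df3-4b5a84be39ad" # Managed-CachingDisabled
}

# Inside a distribution's default_cache_behavior block, the reference then reads:
#   cache_policy_id = local.cache_policy_caching_optimized
```

A diff that swaps local.cache_policy_caching_optimized for local.cache_policy_caching_disabled is readable at a glance. A diff between two raw GUIDs is not.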

The imports get you to "working." The cleanup pass gets you to "maintainable."

What deploy looks like now

Merge to main. That's it.

GitHub Actions picks it up, authenticates via OIDC, builds the blog, syncs to S3, invalidates CloudFront. Site is live in about 60 seconds. No local tooling required. No credential setup. No playbooks to remember.

Infrastructure changes go through terraform plan locally, and the plan gets reviewed before terraform apply. Changes are visible, reviewable, and tracked.

One honest regression worth mentioning: with Ansible, aws login was enough and the session just worked. Terraform doesn't go through the AWS CLI credential chain the same way, so running it locally requires an extra step first:

eval "$(aws configure export-credentials --format env)"
terraform apply

An alias handles this in practice. But it's a real extra step that didn't exist before, and it's worth knowing going in.

The Ansible playbooks are still in the repo under infrastructure/playbooks/ for reference. But they're not part of any workflow anymore.

The before state wasn't broken. It just required me to babysit it. This doesn't.