
Module 10: Real-World CI/CD Project

June 2, 2026 Wasil Zafar 35 min read

Build a production-grade CI/CD pipeline from scratch — automated testing with Playwright, intelligent caching, multi-environment deployments with approval gates, feature flags, semantic versioning with changesets, and AWS deployment via OIDC.

Table of Contents

  1. Project Introduction
  2. Automated Testing with Playwright
  3. Caching Test Dependencies
  4. Building & Testing on Pull Requests
  5. Version Management with Changesets
  6. Implementing Feature Flags
  7. Multi-Environment Deployments
  8. Deploying to AWS with OIDC
  9. PR Preview Environments
  10. Automatic Environment Cleanup
  11. Git Hooks for Pre-Commit Validation
  12. Exercises

Project Introduction: Full-Stack Application

Throughout Modules 1–9, we built up individual CI/CD capabilities. Now we bring everything together into a production-grade pipeline for a real full-stack application. This module isn't about learning new Actions syntax — it's about combining everything you know into a cohesive system that a team would actually ship with.

Our project is TaskFlow — a task management SaaS with a React frontend, Node.js/Express API, and PostgreSQL database. We'll build a pipeline that handles testing, versioning, multi-environment deployments, preview environments, and automatic cleanup.

Architecture Decision: We chose a monorepo structure with separate apps/web and apps/api directories. This allows us to use path filtering in workflows — changes to apps/web/** trigger frontend tests only, while apps/api/** triggers backend tests. This dramatically reduces CI time for focused PRs.
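
The path-filtering decision described above can be sketched as a small function. This is only a model of the logic, not part of the pipeline: in a workflow the same effect is usually achieved with a paths filter on the trigger. The function name, the suite names, and the rule that changes to packages/shared or the root lockfile run both suites are assumptions made for illustration.

```typescript
// Hypothetical helper: decides which test suites a PR needs from its changed files.
// The rule for packages/shared and package-lock.json is an assumption, not from the pipeline.
type Suite = "web" | "api";

function suitesForChangedFiles(changedFiles: string[]): Suite[] {
  const suites = new Set<Suite>();
  for (const file of changedFiles) {
    if (file.startsWith("apps/web/")) {
      suites.add("web");
    } else if (file.startsWith("apps/api/")) {
      suites.add("api");
    } else if (file.startsWith("packages/shared/") || file === "package-lock.json") {
      // Shared code and the lockfile affect both apps, so run everything.
      suites.add("web");
      suites.add("api");
    }
  }
  // Sort so the result does not depend on the order of the changed files.
  return Array.from(suites).sort();
}

const frontendOnly = suitesForChangedFiles(["apps/web/src/App.tsx"]);
// frontendOnly is ["web"]
const sharedChange = suitesForChangedFiles(["packages/shared/src/types.ts"]);
// sharedChange is ["api", "web"]
```

A docs-only change such as README.md matches none of the prefixes and yields an empty list, which is where the CI time saving for focused PRs comes from.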

Complete CI/CD Pipeline Architecture
flowchart LR
    subgraph PR["Pull Request"]
        A[Push to Branch] --> B[Lint + Unit Tests]
        B --> C[Integration Tests]
        C --> D[E2E Tests - Playwright]
        D --> E[Deploy Preview Env]
        E --> F[Post Status Comment]
    end

    subgraph Merge["Merge to main"]
        G[Changeset Version] --> H[Build Artifacts]
        H --> I[Deploy to Staging]
        I --> J[Smoke Tests]
        J --> K{Approval Gate}
    end

    subgraph Prod["Production"]
        K -->|Approved| L[Deploy to Production]
        L --> M[Health Check]
        M --> N[Tag Release]
    end

    subgraph Cleanup["Cleanup"]
        O[PR Closed] --> P[Teardown Preview Env]
    end

    F -.->|PR Merged| G
            

Repository Structure

The monorepo uses npm workspaces for dependency management across packages:

taskflow/
├── .github/
│   ├── workflows/
│   │   ├── ci.yml                 # PR checks (lint, test, E2E)
│   │   ├── deploy-preview.yml     # Preview environment per PR
│   │   ├── deploy-staging.yml     # Staging deployment on merge
│   │   ├── deploy-production.yml  # Production with approval gate
│   │   ├── cleanup.yml            # Teardown preview envs on PR close
│   │   └── release.yml            # Changeset version + publish
│   └── actions/
│       └── setup-project/         # Composite action for common setup
│           └── action.yml
├── apps/
│   ├── web/                       # React frontend (Vite)
│   │   ├── src/
│   │   ├── e2e/                   # Playwright tests
│   │   ├── playwright.config.ts
│   │   └── package.json
│   └── api/                       # Node.js/Express API
│       ├── src/
│       ├── tests/
│       ├── Dockerfile
│       └── package.json
├── packages/
│   └── shared/                    # Shared types and utilities
│       └── package.json
├── docker-compose.yml             # Local dev (PostgreSQL, Redis)
├── package.json                   # Root workspace config
├── .changeset/
│   └── config.json                # Changeset configuration
└── .husky/
    ├── pre-commit                 # lint-staged
    └── pre-push                   # type-check

The deployment targets are:

  • Frontend: AWS S3 + CloudFront (static hosting with CDN)
  • API: AWS ECS Fargate (containerized, auto-scaling)
  • Database: AWS RDS PostgreSQL (managed)
  • Preview Environments: AWS ECS with dynamic task definitions

Setting Up Automated Testing (Including Playwright)

A production pipeline needs confidence at multiple testing levels. We implement the test pyramid: many fast unit tests at the base, fewer integration tests in the middle, and targeted E2E tests at the top.

Unit and Integration Tests

Unit tests run with Vitest (frontend) and Jest (API). Integration tests run against PostgreSQL and Redis service containers that GitHub Actions starts alongside the job:

# .github/workflows/ci.yml
name: CI Pipeline

on:
  pull_request:
    branches: [main, develop]
  push:
    branches: [main]

concurrency:
  group: ci-${{ github.ref }}
  cancel-in-progress: true

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-project
      - run: npm run lint --workspaces

  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # packages/shared lives outside apps/, so each entry carries its own path
        include:
          - name: web
            path: apps/web
          - name: api
            path: apps/api
          - name: shared
            path: packages/shared
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-project
      - name: Run unit tests
        run: npm run test:unit --workspace=${{ matrix.path }}
      - name: Upload coverage
        uses: actions/upload-artifact@v4
        with:
          name: coverage-${{ matrix.name }}
          path: ${{ matrix.path }}/coverage/

  integration-tests:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_USER: taskflow
          POSTGRES_PASSWORD: testpassword
          POSTGRES_DB: taskflow_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd="pg_isready"
          --health-interval=10s
          --health-timeout=5s
          --health-retries=5
      redis:
        image: redis:7
        ports:
          - 6379:6379
        options: >-
          --health-cmd="redis-cli ping"
          --health-interval=10s
          --health-timeout=5s
          --health-retries=5
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-project
      - name: Run migrations
        run: npm run db:migrate --workspace=apps/api
        env:
          DATABASE_URL: postgresql://taskflow:testpassword@localhost:5432/taskflow_test
      - name: Run integration tests
        run: npm run test:integration --workspace=apps/api
        env:
          DATABASE_URL: postgresql://taskflow:testpassword@localhost:5432/taskflow_test
          REDIS_URL: redis://localhost:6379

End-to-End Testing with Playwright

Playwright tests verify complete user journeys. We run the full stack locally within the workflow, then test against it:

  e2e-tests:
    runs-on: ubuntu-latest
    needs: [lint, unit-tests]
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_USER: taskflow
          POSTGRES_PASSWORD: testpassword
          POSTGRES_DB: taskflow_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd="pg_isready"
          --health-interval=10s
          --health-timeout=5s
          --health-retries=5
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-project

      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium

      - name: Build frontend
        run: npm run build --workspace=apps/web

      - name: Start API server
        run: npm run start &
        working-directory: apps/api
        env:
          DATABASE_URL: postgresql://taskflow:testpassword@localhost:5432/taskflow_test
          PORT: 3001

      - name: Start frontend preview
        run: npx vite preview --port 3000 &
        working-directory: apps/web

      - name: Wait for servers
        run: |
          npx wait-on http://localhost:3000 http://localhost:3001/health --timeout=30000

      - name: Run Playwright tests
        run: npx playwright test
        working-directory: apps/web
        env:
          BASE_URL: http://localhost:3000
          API_URL: http://localhost:3001

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: apps/web/playwright-report/
          retention-days: 14

      - name: Upload traces on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-traces
          path: apps/web/test-results/
          retention-days: 7

Playwright Tip: Always upload traces and screenshots on failure using if: failure(). A trace is a full recording of the test (DOM snapshots, network requests, console logs), which makes debugging CI failures almost as easy as debugging local ones. Traces are only recorded when the trace option is enabled in playwright.config.ts (for example, trace: 'on-first-retry'). Open one with npx playwright show-trace trace.zip.

Caching Test Dependencies

Without caching, every workflow run downloads npm packages (~800MB), Playwright browsers (~300MB), and rebuilds Docker layers. Proper caching cuts CI time from 8+ minutes to under 3 minutes.

Composite Setup Action with Caching

We centralize all setup logic into a reusable composite action:

# .github/actions/setup-project/action.yml
name: Setup Project
description: Install Node.js, restore caches, install dependencies

inputs:
  node-version:
    description: Node.js version
    default: '20'
  install-playwright:
    description: Whether to install Playwright browsers
    default: 'false'

runs:
  using: composite
  steps:
    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: ${{ inputs.node-version }}
        # Built-in caching of the npm download cache, keyed on package-lock.json;
        # a separate actions/cache step for ~/.npm would be redundant
        cache: 'npm'

    - name: Install dependencies
      shell: bash
      run: npm ci

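    - name: Cache Playwright browsers
      if: inputs.install-playwright == 'true'
      uses: actions/cache@v4
      id: playwright-cache
      with:
        path: ~/.cache/ms-playwright
        # npm workspaces keep a single lockfile at the repository root
        key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

    - name: Install Playwright browsers
      if: inputs.install-playwright == 'true' && steps.playwright-cache.outputs.cache-hit != 'true'
      shell: bash
      run: npx playwright install --with-deps chromium

    - name: Install Playwright system dependencies
      # OS packages are not part of the browser cache, so install them on a cache hit
      if: inputs.install-playwright == 'true' && steps.playwright-cache.outputs.cache-hit == 'true'
      shell: bash
      run: npx playwright install-deps chromium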

Docker Layer Caching for Integration Tests

When building Docker images in CI, layer caching prevents rebuilding unchanged layers:

    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    - name: Build API image
      uses: docker/build-push-action@v5
      with:
        context: apps/api
        push: false
        load: true
        tags: taskflow-api:test
        cache-from: type=gha
        cache-to: type=gha,mode=max

Cache Impact: In our project, proper caching reduced average CI time from 8m 22s to 2m 47s (a 67% reduction). The biggest wins: the npm cache saves ~45s, the Playwright browser cache saves ~90s, and the Docker layer cache saves ~120s on image builds.
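
As a quick check of the figures quoted above, the arithmetic can be worked through in a few lines. The helper name is hypothetical; the durations are the ones stated in the text.

```typescript
// Convert a "minutes and seconds" duration to seconds.
function toSeconds(minutes: number, seconds: number): number {
  return minutes * 60 + seconds;
}

const beforeSeconds = toSeconds(8, 22); // 502
const afterSeconds = toSeconds(2, 47); // 167
const savedSeconds = beforeSeconds - afterSeconds; // 335
const reductionPercent = Math.round((savedSeconds / beforeSeconds) * 100); // 67

// The three itemized wins: npm cache, Playwright browsers, Docker layers.
const itemizedSavings = 45 + 90 + 120; // 255 of the 335 seconds saved
```

The three itemized caches account for 255 of the 335 seconds saved, so they explain most of the improvement but not all of it.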

Building and Testing on Pull Requests

Pull requests are the quality gate. Every PR must pass all checks before merging. We configure branch protection rules requiring the CI workflow, and add a comment bot that posts test results directly on the PR.

Status Checks and Review Gates

Configure branch protection in repository settings:

  • Require status checks: lint, unit-tests, integration-tests, e2e-tests
  • Require branches up to date: Ensures tests run against latest main
  • Require review: At least 1 approval before merge
  • Dismiss stale reviews: New pushes invalidate approvals

Comment Bot for Test Reports

Post a summary comment on each PR with test results, coverage, and Playwright report links:

  report:
    runs-on: ubuntu-latest
    needs: [lint, unit-tests, integration-tests, e2e-tests]
    if: always() && github.event_name == 'pull_request'
    permissions:
      pull-requests: write
    steps:
      - name: Download coverage artifacts
        uses: actions/download-artifact@v4
        with:
          pattern: coverage-*
          merge-multiple: false   # keep one directory per artifact (coverage-web/, coverage-api/)

      - name: Generate report comment
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');

            // Read coverage summaries
            const webCoverage = JSON.parse(
              fs.readFileSync('coverage-web/coverage-summary.json', 'utf8')
            );
            const apiCoverage = JSON.parse(
              fs.readFileSync('coverage-api/coverage-summary.json', 'utf8')
            );

            const webPct = webCoverage.total.lines.pct;
            const apiPct = apiCoverage.total.lines.pct;

            const body = `## 📊 CI Report

            | Check | Status |
            |-------|--------|
            | Lint | ${{ needs.lint.result == 'success' && '✅ Passed' || '❌ Failed' }} |
            | Unit Tests | ${{ needs.unit-tests.result == 'success' && '✅ Passed' || '❌ Failed' }} |
            | Integration Tests | ${{ needs.integration-tests.result == 'success' && '✅ Passed' || '❌ Failed' }} |
            | E2E Tests | ${{ needs.e2e-tests.result == 'success' && '✅ Passed' || '❌ Failed' }} |

            ### Coverage
            - **Frontend:** ${webPct}%
            - **API:** ${apiPct}%

            ### Artifacts
            - [Playwright Report](${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId})
            `;

            // Find existing comment or create new
            const { data: comments } = await github.rest.issues.listComments({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
            });

            const botComment = comments.find(c => c.body.includes('## 📊 CI Report'));

            if (botComment) {
              await github.rest.issues.updateComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                comment_id: botComment.id,
                body,
              });
            } else {
              await github.rest.issues.createComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                issue_number: context.issue.number,
                body,
              });
            }

Version Management with Changesets

Changesets automates semantic versioning by collecting version intent from developers at PR time, then batching those into release PRs. Instead of manually updating package.json versions and writing changelogs, developers add a changeset file describing their change.

# Developer adds a changeset during PR creation
npx changeset
# ? What packages have changed? → @taskflow/api
# ? What type of change? → minor (new feature)
# ? Describe the change → Added task filtering by priority level

This creates a file like .changeset/fuzzy-lions-dance.md:

---
"@taskflow/api": minor
---

Added task filtering by priority level

Automated Release Workflow

# .github/workflows/release.yml
name: Release

on:
  push:
    branches: [main]

permissions:
  contents: write
  pull-requests: write

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: ./.github/actions/setup-project

      - name: Create Release PR or Publish
        id: changesets
        uses: changesets/action@v1
        with:
          title: 'chore: version packages'
          commit: 'chore: version packages'
          publish: npm run release
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          NPM_TOKEN: ${{ secrets.NPM_TOKEN }}

      - name: Tag release
        if: steps.changesets.outputs.published == 'true'
        run: |
          VERSION=$(node -p "require('./apps/api/package.json').version")
          git tag "v${VERSION}"
          git push origin "v${VERSION}"

How Changesets Work: When PRs with changeset files merge to main, the release workflow opens a version PR (titled "chore: version packages" in the workflow above) that bumps versions and updates changelogs. Merging that PR triggers the actual publish step. This two-step flow gives you a final review before any release goes live.
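
To make the batching step concrete, here is a minimal sketch of how several changeset files could collapse into one version bump per package. It is a simplified model, not the Changesets implementation: it ignores pre-releases and the bumps Changesets applies to dependent packages. The function names, the @taskflow/web package name, and the version numbers are assumptions for illustration.

```typescript
type Bump = "patch" | "minor" | "major";
// One changeset file: package name -> requested bump (its front matter).
type Changeset = Record<string, Bump>;

const BUMP_RANK: Record<Bump, number> = { patch: 0, minor: 1, major: 2 };

// Applies a single bump to a "MAJOR.MINOR.PATCH" version string.
function bumpVersion(version: string, bump: Bump): string {
  const [major, minor, patch] = version.split(".").map(Number);
  if (bump === "major") return `${major + 1}.0.0`;
  if (bump === "minor") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}

// Collapses many changesets into one bump per package (the highest wins),
// then computes the next version of each affected package.
function planRelease(
  current: Record<string, string>,
  changesets: Changeset[],
): Record<string, string> {
  const highest: Record<string, Bump> = {};
  for (const changeset of changesets) {
    for (const pkg of Object.keys(changeset)) {
      const bump = changeset[pkg];
      if (!(pkg in highest) || BUMP_RANK[bump] > BUMP_RANK[highest[pkg]]) {
        highest[pkg] = bump;
      }
    }
  }
  const next: Record<string, string> = {};
  for (const pkg of Object.keys(highest)) {
    next[pkg] = bumpVersion(current[pkg], highest[pkg]);
  }
  return next;
}

const releasePlan = planRelease(
  { "@taskflow/api": "1.4.2", "@taskflow/web": "2.0.1" },
  [{ "@taskflow/api": "minor" }, { "@taskflow/api": "patch" }, { "@taskflow/web": "patch" }],
);
// releasePlan is { "@taskflow/api": "1.5.0", "@taskflow/web": "2.0.2" }
```

Two changesets touching the API (one minor, one patch) produce a single minor release, which is why the version PR can batch many merged PRs into one coherent bump.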

Implementing Feature Flags

Feature flags decouple deployment from release. You can deploy code to production without exposing it to users, then enable features gradually. We integrate flags into our workflow using environment-specific configurations.

# In deploy-staging.yml — enable experimental features in staging
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-project

      - name: Build with staging flags
        run: npm run build --workspace=apps/web
        env:
          VITE_FEATURE_TASK_PRIORITIES: 'true'
          VITE_FEATURE_AI_SUGGESTIONS: 'true'
          VITE_FEATURE_DARK_MODE: 'true'
          VITE_API_URL: ${{ vars.STAGING_API_URL }}

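      - name: List feature flags in Unleash
        uses: actions/github-script@v7
        env:
          UNLEASH_URL: ${{ vars.UNLEASH_URL }}
          UNLEASH_API_TOKEN: ${{ secrets.UNLEASH_API_TOKEN }}
        with:
          script: |
            // Read-only check: pass the URL and token through env instead of
            // interpolating them into the script source
            const response = await fetch(`${process.env.UNLEASH_URL}/api/admin/projects/default/features`, {
              headers: { Authorization: process.env.UNLEASH_API_TOKEN },
            });
            if (!response.ok) {
              core.setFailed(`Unleash request failed with status ${response.status}`);
              return;
            }
            const { features } = await response.json();
            core.info(`Flags defined in Unleash: ${features.length}`);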

Conditional Workflow Steps Based on Flags

      - name: Run AI feature tests
        if: vars.FEATURE_AI_SUGGESTIONS == 'true'
        run: npm run test:ai-features --workspace=apps/web
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

      - name: Skip AI tests (flag disabled)
        if: vars.FEATURE_AI_SUGGESTIONS != 'true'
        run: echo "AI suggestion tests skipped — feature flag disabled in this environment"
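
On the application side, the VITE_FEATURE_* variables set at build time have to be turned into booleans somewhere. The sketch below shows one way that could look. The helper names are hypothetical, and the environment is passed in as a plain object so the snippet stands alone; in a Vite app the argument would typically be import.meta.env.

```typescript
// Shape of the build-time environment (import.meta.env in a Vite app).
type Env = Record<string, string | undefined>;

const FLAG_PREFIX = "VITE_FEATURE_";

// Collects every VITE_FEATURE_* variable into a { FLAG_NAME: boolean } map.
// Only the exact string "true" enables a flag; anything else counts as off.
function readFeatureFlags(env: Env): Record<string, boolean> {
  const flags: Record<string, boolean> = {};
  for (const key of Object.keys(env)) {
    if (key.startsWith(FLAG_PREFIX)) {
      flags[key.slice(FLAG_PREFIX.length)] = env[key] === "true";
    }
  }
  return flags;
}

// Unknown flags default to off, so a missing variable hides the feature.
function isEnabled(flags: Record<string, boolean>, flag: string): boolean {
  return flags[flag] === true;
}

const stagingFlags = readFeatureFlags({
  VITE_FEATURE_TASK_PRIORITIES: "true",
  VITE_FEATURE_AI_SUGGESTIONS: "false",
  VITE_API_URL: "https://api.example.test", // not a flag, ignored
});
// isEnabled(stagingFlags, "TASK_PRIORITIES") is true
// isEnabled(stagingFlags, "AI_SUGGESTIONS") is false
// isEnabled(stagingFlags, "DARK_MODE") is false because it was never set
```

Defaulting unknown flags to off is a deliberate choice here: a production build that omits a variable keeps the feature hidden rather than exposing it by accident.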

Multi-Environment Deployments (Staging & Production)

Production deployments require multiple safety layers: staging verification, approval gates, and rollback capability. We use GitHub Environments to enforce these policies.

Production Safety: Never deploy directly to production without a staging verification step. Our pipeline requires: (1) all CI checks pass, (2) staging deployment succeeds, (3) staging smoke tests pass, (4) manual approval from at least one team lead. Skip any of these steps in a real project and you will ship broken code to users.

# .github/workflows/deploy-staging.yml
name: Deploy to Staging

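on:
  push:
    branches: [main]

permissions:
  id-token: write   # required to request the OIDC token used for role assumption
  contents: read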

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging
    outputs:
      version: ${{ steps.version.outputs.value }}
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-project

      - name: Get version
        id: version
        run: echo "value=$(node -p "require('./apps/api/package.json').version")" >> $GITHUB_OUTPUT

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN_STAGING }}
          aws-region: us-east-1

      - name: Build and push API image
        run: |
          aws ecr get-login-password | docker login --username AWS --password-stdin ${{ vars.ECR_REGISTRY }}
          docker build -t ${{ vars.ECR_REGISTRY }}/taskflow-api:${{ steps.version.outputs.value }} apps/api
          docker push ${{ vars.ECR_REGISTRY }}/taskflow-api:${{ steps.version.outputs.value }}

      - name: Deploy API to ECS
        run: |
          aws ecs update-service \
            --cluster taskflow-staging \
            --service taskflow-api \
            --force-new-deployment \
            --task-definition taskflow-api-staging

      - name: Wait for deployment stability
        run: |
          aws ecs wait services-stable \
            --cluster taskflow-staging \
            --services taskflow-api

      - name: Build and deploy frontend
        run: |
          npm run build --workspace=apps/web
          aws s3 sync apps/web/dist/ s3://${{ vars.S3_BUCKET_STAGING }}/ --delete
          aws cloudfront create-invalidation --distribution-id ${{ vars.CF_DIST_STAGING }} --paths "/*"
        env:
          VITE_API_URL: ${{ vars.STAGING_API_URL }}

  smoke-tests:
    runs-on: ubuntu-latest
    needs: deploy-staging
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-project
        with:
          install-playwright: 'true'

      - name: Run smoke tests against staging
        run: npx playwright test --project=smoke
        working-directory: apps/web
        env:
          BASE_URL: ${{ vars.STAGING_URL }}
          API_URL: ${{ vars.STAGING_API_URL }}

Rollback Strategy

If production deployment fails health checks, we automatically roll back to the previous task definition:

      - name: Health check
        id: health
        continue-on-error: true
        run: |
          for i in {1..10}; do
            STATUS=$(curl -s -o /dev/null -w "%{http_code}" ${{ vars.PRODUCTION_API_URL }}/health)
            if [ "$STATUS" = "200" ]; then
              echo "Health check passed"
              exit 0
            fi
            echo "Attempt $i: status $STATUS, retrying in 10s..."
            sleep 10
          done
          echo "Health check failed after 10 attempts"
          exit 1

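      - name: Rollback on failure
        if: steps.health.outcome == 'failure'
        run: |
          echo "🚨 Rolling back to previous deployment..."
          # deployments[1] only exists while the previous deployment is still listed;
          # once the rollout has completed, ECS reports only the new one
          PREVIOUS_TASK=$(aws ecs describe-services \
            --cluster taskflow-production \
            --services taskflow-api \
            --query 'services[0].deployments[1].taskDefinition' \
            --output text)
          if [ -z "$PREVIOUS_TASK" ] || [ "$PREVIOUS_TASK" = "None" ]; then
            echo "No previous task definition found; manual rollback required"
            exit 1
          fi
          aws ecs update-service \
            --cluster taskflow-production \
            --service taskflow-api \
            --task-definition "$PREVIOUS_TASK"
          aws ecs wait services-stable \
            --cluster taskflow-production \
            --services taskflow-api
          echo "✅ Rollback complete"
          exit 1  # Fail the workflow to alert team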

Deploying to AWS with OIDC

OIDC (OpenID Connect) eliminates long-lived AWS credentials in your repository. Instead of storing AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as secrets, GitHub's OIDC provider issues short-lived tokens that AWS trusts directly.

Security Requirement: Never use static AWS credentials (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) in GitHub Actions for production workloads. Static credentials can be leaked, do not expire until someone rotates them, and often grant broad access. With OIDC, the trust policy scopes access to specific repositories, branches, and environments, and the token GitHub issues expires within minutes.

Setting Up OIDC Trust in AWS

First, create the IAM OIDC identity provider and trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "repo:your-org/taskflow:environment:production"
        }
      }
    }
  ]
}

The sub condition ensures only workflows running in the production environment of your specific repository can assume this role. Different environments use different roles with different permission boundaries:

# Multi-account OIDC setup
# Staging: arn:aws:iam::111111111111:role/github-actions-staging
# Production: arn:aws:iam::222222222222:role/github-actions-production

      - name: Configure AWS credentials (Production)
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::222222222222:role/github-actions-production
          role-session-name: github-actions-deploy-${{ github.run_id }}
          aws-region: us-east-1
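
To see why the sub condition in the trust policy is so restrictive, it helps to model the comparison AWS performs. The sketch below builds the subject claim GitHub issues for environment and branch jobs and checks it against a pattern. The function names are hypothetical, and stringLike is a simplified stand-in for IAM's StringLike operator (only * and ? wildcards), not AWS code.

```typescript
// `sub` claim for a job that targets a GitHub environment.
function environmentSubject(owner: string, repo: string, environment: string): string {
  return `repo:${owner}/${repo}:environment:${environment}`;
}

// `sub` claim for a branch-triggered job that does not use an environment.
function branchSubject(owner: string, repo: string, branch: string): string {
  return `repo:${owner}/${repo}:ref:refs/heads/${branch}`;
}

// Simplified model of StringLike: `*` matches any run of characters,
// `?` matches exactly one, everything else is literal and case-sensitive.
function stringLike(pattern: string, value: string): boolean {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const regex = new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
  return regex.test(value);
}

const trustPattern = "repo:your-org/taskflow:environment:production";

const prodAllowed = stringLike(trustPattern, environmentSubject("your-org", "taskflow", "production"));
// prodAllowed is true
const stagingAllowed = stringLike(trustPattern, environmentSubject("your-org", "taskflow", "staging"));
// stagingAllowed is false
const branchAllowed = stringLike(trustPattern, branchSubject("your-org", "taskflow", "main"));
// branchAllowed is false: the job must run in the production environment
const wildcardAllowed = stringLike("repo:your-org/taskflow:*", branchSubject("your-org", "taskflow", "main"));
// wildcardAllowed is true: a trailing wildcard admits every workflow in the repository
```

The last case is the one to watch for in reviews: a pattern ending in a wildcard lets any branch or pull request workflow in the repository assume the role.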

Health Check Verification

      - name: Verify ECS deployment health
        run: |
          echo "Waiting for ECS service to stabilize..."
          # AWS CLI waiters have no --timeout flag; cap the wait with coreutils timeout (seconds)
          timeout 300 aws ecs wait services-stable \
            --cluster taskflow-production \
            --services taskflow-api

          # Verify running task count
          RUNNING=$(aws ecs describe-services \
            --cluster taskflow-production \
            --services taskflow-api \
            --query 'services[0].runningCount' \
            --output text)

          DESIRED=$(aws ecs describe-services \
            --cluster taskflow-production \
            --services taskflow-api \
            --query 'services[0].desiredCount' \
            --output text)

          if [ "$RUNNING" != "$DESIRED" ]; then
            echo "❌ Running tasks ($RUNNING) != Desired ($DESIRED)"
            exit 1
          fi
          echo "✅ All $RUNNING tasks healthy"

Deploying PR Branches to Isolated Environments

Preview environments give reviewers a live, isolated instance of every PR. Each PR gets its own URL (e.g., pr-42.preview.taskflow.dev) with its own API and database — completely independent from staging or other PRs.

# .github/workflows/deploy-preview.yml
name: Deploy Preview Environment

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write
  id-token: write

jobs:
  deploy-preview:
    runs-on: ubuntu-latest
    environment:
      name: preview-pr-${{ github.event.pull_request.number }}
      url: https://pr-${{ github.event.pull_request.number }}.preview.taskflow.dev
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-project

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN_PREVIEW }}
          aws-region: us-east-1

      - name: Set environment name
        id: env
        run: |
          echo "name=pr-${{ github.event.pull_request.number }}" >> $GITHUB_OUTPUT
          echo "domain=pr-${{ github.event.pull_request.number }}.preview.taskflow.dev" >> $GITHUB_OUTPUT

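      - name: Create preview database
        env:
          DB_ID: taskflow-${{ steps.env.outputs.name }}
          DB_PASSWORD: ${{ secrets.PREVIEW_DB_PASSWORD }}
        run: |
          # Create the instance only when it is missing, so real errors are not swallowed
          if ! aws rds describe-db-instances --db-instance-identifier "$DB_ID" >/dev/null 2>&1; then
            aws rds create-db-instance \
              --db-instance-identifier "$DB_ID" \
              --db-instance-class db.t3.micro \
              --engine postgres \
              --master-username taskflow \
              --master-user-password "$DB_PASSWORD" \
              --allocated-storage 20 \
              --tags Key=Environment,Value=preview Key=PR,Value=${{ github.event.pull_request.number }}
          fi
          # Instance creation is asynchronous; block until the database is available
          aws rds wait db-instance-available --db-instance-identifier "$DB_ID"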

      - name: Build and push preview image
        run: |
          aws ecr get-login-password | docker login --username AWS --password-stdin ${{ vars.ECR_REGISTRY }}
          docker build \
            -t ${{ vars.ECR_REGISTRY }}/taskflow-api:${{ steps.env.outputs.name }} \
            apps/api
          docker push ${{ vars.ECR_REGISTRY }}/taskflow-api:${{ steps.env.outputs.name }}

      - name: Deploy preview to ECS
        run: |
          # Register task definition with PR-specific config
          TASK_DEF=$(cat apps/api/task-definition.json | \
            jq '.containerDefinitions[0].image = "${{ vars.ECR_REGISTRY }}/taskflow-api:${{ steps.env.outputs.name }}"' | \
            jq '.family = "taskflow-api-${{ steps.env.outputs.name }}"')
          echo "$TASK_DEF" > /tmp/task-def.json
          aws ecs register-task-definition --cli-input-json file:///tmp/task-def.json

          # Create or update service
          aws ecs create-service \
            --cluster taskflow-preview \
            --service-name ${{ steps.env.outputs.name }} \
            --task-definition taskflow-api-${{ steps.env.outputs.name }} \
            --desired-count 1 \
            --launch-type FARGATE \
            --network-configuration "awsvpcConfiguration={subnets=[${{ vars.PREVIEW_SUBNET }}],securityGroups=[${{ vars.PREVIEW_SG }}],assignPublicIp=ENABLED}" \
            2>/dev/null || \
          aws ecs update-service \
            --cluster taskflow-preview \
            --service ${{ steps.env.outputs.name }} \
            --task-definition taskflow-api-${{ steps.env.outputs.name }} \
            --force-new-deployment

      - name: Deploy frontend to S3
        run: |
          npm run build --workspace=apps/web
          aws s3 sync apps/web/dist/ s3://taskflow-preview/${{ steps.env.outputs.name }}/ --delete
        env:
          VITE_API_URL: https://api-${{ steps.env.outputs.name }}.preview.taskflow.dev

      - name: Comment preview URL
        uses: actions/github-script@v7
        with:
          script: |
            const body = `## 🚀 Preview Environment Ready

            | Resource | URL |
            |----------|-----|
            | Frontend | https://${{ steps.env.outputs.domain }} |
            | API | https://api-${{ steps.env.outputs.name }}.preview.taskflow.dev |
            | Health | https://api-${{ steps.env.outputs.name }}.preview.taskflow.dev/health |

            > Environment will be automatically destroyed when this PR is closed.
            `;

            const { data: comments } = await github.rest.issues.listComments({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
            });
            const existing = comments.find(c => c.body.includes('## 🚀 Preview Environment Ready'));
            if (existing) {
              await github.rest.issues.updateComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                comment_id: existing.id,
                body,
              });
            } else {
              await github.rest.issues.createComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                issue_number: context.issue.number,
                body,
              });
            }
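
Every resource in the workflow above is named after the PR number, and the cleanup workflow has to reconstruct exactly the same names. A minimal sketch of that convention as a single function follows. The function and field names are hypothetical; the name patterns are the ones used in the workflow.

```typescript
interface PreviewNames {
  environment: string; // GitHub environment name
  slug: string; // ECS service name, image tag, and S3 prefix
  frontendUrl: string;
  apiUrl: string;
  dbInstanceId: string; // RDS instance identifier
}

// Derives every preview resource name from the PR number alone.
function previewNames(prNumber: number): PreviewNames {
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error(`Invalid PR number: ${prNumber}`);
  }
  const slug = `pr-${prNumber}`;
  return {
    environment: `preview-${slug}`,
    slug,
    frontendUrl: `https://${slug}.preview.taskflow.dev`,
    apiUrl: `https://api-${slug}.preview.taskflow.dev`,
    dbInstanceId: `taskflow-${slug}`,
  };
}

const pr42 = previewNames(42);
// pr42.environment is "preview-pr-42"
// pr42.apiUrl is "https://api-pr-42.preview.taskflow.dev"
// pr42.dbInstanceId is "taskflow-pr-42"
```

Keeping the derivation in one place means deploy and teardown cannot drift apart: if a name pattern changes, both sides change with it.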

Automatic Environment Cleanup

Preview environments cost money. Every open PR has an ECS service, a database, an ECR image, and an S3 prefix consuming resources. We must tear them down when the PR is closed or merged.

# .github/workflows/cleanup.yml
name: Cleanup Preview Environment

on:
  pull_request:
    types: [closed]

permissions:
  id-token: write
  contents: read
  deployments: write

jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN_PREVIEW }}
          aws-region: us-east-1

      - name: Set environment name
        id: env
        run: echo "name=pr-${{ github.event.pull_request.number }}" >> $GITHUB_OUTPUT

      - name: Delete ECS service
        run: |
          aws ecs update-service \
            --cluster taskflow-preview \
            --service ${{ steps.env.outputs.name }} \
            --desired-count 0 \
            2>/dev/null || true
          aws ecs delete-service \
            --cluster taskflow-preview \
            --service ${{ steps.env.outputs.name }} \
            --force \
            2>/dev/null || true
          echo "✅ ECS service deleted"

      - name: Delete preview database
        run: |
          aws rds delete-db-instance \
            --db-instance-identifier taskflow-${{ steps.env.outputs.name }} \
            --skip-final-snapshot \
            2>/dev/null || true
          echo "✅ Database deletion initiated"

      - name: Clean S3 preview files
        run: |
          aws s3 rm s3://taskflow-preview/${{ steps.env.outputs.name }}/ --recursive
          echo "✅ S3 files cleaned"

      - name: Delete ECR images
        run: |
          aws ecr batch-delete-image \
            --repository-name taskflow-api \
            --image-ids imageTag=${{ steps.env.outputs.name }} \
            2>/dev/null || true
          echo "✅ ECR image deleted"

      - name: Deactivate GitHub environment
        uses: actions/github-script@v7
        with:
          script: |
            // List every deployment for this environment.
            // github.paginate follows all pages; a single call returns only the first page.
            const envName = `preview-${{ steps.env.outputs.name }}`;
            const deployments = await github.paginate(github.rest.repos.listDeployments, {
              owner: context.repo.owner,
              repo: context.repo.repo,
              environment: envName,
            });

            // Mark all deployments as inactive
            for (const deployment of deployments) {
              await github.rest.repos.createDeploymentStatus({
                owner: context.repo.owner,
                repo: context.repo.repo,
                deployment_id: deployment.id,
                state: 'inactive',
              });
            }
            core.info(`✅ Deactivated ${deployments.length} deployments for ${envName}`);

Cost Management: Without cleanup automation, preview environments accumulate. A team with 10 open PRs at any time could spend $500+/month on idle preview resources. The cleanup workflow runs in ~30 seconds and saves significant costs. Additionally, consider adding a scheduled workflow that runs nightly to catch any orphaned resources where the closed event was missed.
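
The nightly sweep suggested above needs a way to decide which preview environments no longer belong to an open PR. Here is a minimal sketch of that decision, assuming resources follow the pr-<number> naming used in cleanup.yml. The function name findOrphanedPreviews and the sample data are hypothetical; in a real scheduled workflow the names might come from aws ecs list-services and the open PR numbers from github.rest.pulls.list.

```javascript
// Hypothetical helper for a scheduled cleanup job (not part of any library).
// Assumes preview resources are named "pr-<number>", as in cleanup.yml.

/**
 * @param {string[]} environmentNames names found in AWS, e.g. ECS service names
 * @param {number[]} openPrNumbers numbers of pull requests that are still open
 * @returns {string[]} preview names whose pull request is no longer open
 */
function findOrphanedPreviews(environmentNames, openPrNumbers) {
  const open = new Set(openPrNumbers);
  const orphans = [];
  for (const name of environmentNames) {
    const match = /^pr-(\d+)$/.exec(name);
    // Skip anything that does not follow the preview naming scheme.
    if (match === null) continue;
    if (!open.has(Number(match[1]))) orphans.push(name);
  }
  return orphans;
}

// Made-up data: PRs 12 and 15 are open, PRs 7 and 9 were closed.
const orphans = findOrphanedPreviews(
  ["pr-7", "pr-9", "pr-12", "shared-tools"],
  [12, 15]
);
console.log(orphans); // prints [ 'pr-7', 'pr-9' ]
```

Names that do not match the scheme are ignored, so shared services living in the same cluster are never selected for deletion.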

Using Git Hooks for Pre-Commit Validation

Git hooks catch issues before code reaches CI, reducing feedback loops from minutes to seconds. We use Husky for hook management and lint-staged to run checks only on changed files.

{
  "name": "taskflow",
  "scripts": {
    "prepare": "husky"
  },
  "lint-staged": {
    "*.{ts,tsx}": [
      "eslint --fix",
      "prettier --write"
    ],
    "*.{json,md,yml,yaml}": [
      "prettier --write"
    ]
  }
}

Hook Configuration

# .husky/pre-commit
npx lint-staged

# .husky/pre-push
npm run typecheck
npm run test:unit -- --changed

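The pre-push hook runs TypeScript type checking and then unit tests for changed files only, catching type errors and test regressions before they consume CI minutes. It assumes that typecheck and test:unit scripts are defined in package.json and that the test runner accepts --changed (Vitest does; Jest's equivalent is --onlyChanged).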

CI/Local Parity

Ensure local and CI environments behave identically:

# In ci.yml — verify hooks are properly configured
      - name: Verify git hooks
        run: |
          # Ensure the committed hook file exists and still runs lint-staged
          if [ ! -f .husky/pre-commit ]; then
            echo "❌ pre-commit hook missing: commit .husky/pre-commit to the repository"
            exit 1
          fi
          if ! grep -q "lint-staged" .husky/pre-commit; then
            echo "❌ pre-commit hook doesn't run lint-staged"
            exit 1
          fi
          echo "✅ Git hooks properly configured"

Local ↔ CI Parity: The ESLint and Prettier configurations used by lint-staged (local) must be identical to those used by the CI lint job. If CI catches formatting errors that local hooks should have caught, your hooks are misconfigured. Use the same .eslintrc and .prettierrc files for both, and never maintain separate CI-specific configs.
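
The grep check in the CI step only proves that the hook calls lint-staged. One way to go a step further is to check that the lint-staged configuration actually runs the tools the CI lint job runs. The sketch below is an illustration under stated assumptions: findMissingLintTools is a hypothetical helper, and the requiredTools list stands in for whatever your CI lint job invokes.

```javascript
// Hypothetical parity check, not part of Husky or lint-staged.
// Returns the tools that CI runs but lint-staged never invokes.
function findMissingLintTools(pkg, requiredTools) {
  const config = pkg["lint-staged"] ?? {};
  // In package.json, each glob maps to one command string or an array of them.
  const commands = Object.values(config).flat();
  return requiredTools.filter(
    (tool) => !commands.some((command) => command.trim().split(/\s+/)[0] === tool)
  );
}

// The lint-staged block from the package.json shown earlier.
const pkg = {
  "lint-staged": {
    "*.{ts,tsx}": ["eslint --fix", "prettier --write"],
    "*.{json,md,yml,yaml}": ["prettier --write"],
  },
};

console.log(findMissingLintTools(pkg, ["eslint", "prettier"])); // prints []
console.log(findMissingLintTools(pkg, ["eslint", "prettier", "stylelint"])); // prints [ 'stylelint' ]
```

In CI this could run as a small Node script that loads package.json and exits non-zero when the returned list is not empty.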

Exercises

Exercise 1: Complete CI Pipeline

Goal: Set up the full CI workflow for a monorepo application.

Tasks:

  1. Create a monorepo with apps/web (React + Vite) and apps/api (Express) directories
  2. Write a CI workflow with separate jobs for linting, unit tests, integration tests (using service containers), and E2E tests (Playwright)
  3. Add a composite action for project setup with npm and Playwright browser caching
  4. Configure the workflow to cancel previous runs on the same branch using concurrency

Success Criteria: The CI workflow runs all test types in parallel where possible, caches dependencies effectively, and completes in under 4 minutes.

Exercise 2: Changeset-Based Releases

Goal: Implement automated versioning and changelog generation using Changesets.

Tasks:

  1. Install and configure @changesets/cli for your monorepo
  2. Create a release workflow that uses changesets/action to generate version PRs
  3. Add a CI check that fails if a PR modifies source code but doesn't include a changeset file
  4. Configure the workflow to tag releases and create GitHub Releases with generated changelogs

Success Criteria: Merging PRs with changesets automatically opens a "Version Packages" PR. Merging that PR publishes the release and creates a git tag.

Exercise 3: Preview Environments with Cleanup

Goal: Deploy isolated preview environments per PR and automatically destroy them on PR close.

Tasks:

  1. Create a deploy-preview.yml workflow triggered on PR open/synchronize
  2. Deploy each PR to a unique URL using the PR number as an identifier
  3. Post a comment on the PR with the preview URL
  4. Create a cleanup.yml workflow triggered on PR close that tears down all preview resources
  5. Add a scheduled nightly workflow that removes any orphaned preview environments older than 7 days

Success Criteria: Opening a PR creates an accessible preview. Closing the PR removes all associated resources within 2 minutes.

Exercise 4: Production Deployment with OIDC and Rollback

Goal: Implement a production deployment pipeline using AWS OIDC with automatic rollback on failure.

Tasks:

  1. Set up an OIDC identity provider in AWS and create IAM roles for staging and production (separate AWS accounts)
  2. Write a staging deployment workflow with smoke tests that run after deploy
  3. Write a production workflow with a manual approval gate (GitHub Environment protection rules)
  4. Implement health check verification that automatically rolls back to the previous ECS task definition if the new deployment is unhealthy
  5. Add Slack notification on deployment success or rollback

Success Criteria: A failing health check triggers automatic rollback within 90 seconds. OIDC credentials are scoped per environment and short-lived: they expire with the assumed-role session and are never stored as long-lived secrets.


Next in the Series

In Module 11: Bonus Topics & Advanced Patterns, we'll cover GitHub Actions at enterprise scale — large runner fleets, Actions Runner Controller on Kubernetes, GitHub Apps for authentication, workflow visualization, compliance automation, and emerging patterns for AI-assisted pipelines.