
Module 7: Multi-Stage Pipelines & Deployment

June 3, 2026 Wasil Zafar 44 min read

Master Azure Pipelines multi-stage deployments — environments with approvals and gates, deployment strategies (rolling, blue-green, canary, ring), service connections, deployment jobs vs regular jobs, and end-to-end release orchestration from build to production.

Table of Contents

  1. Foundations
  2. Deployment Jobs
  3. Advanced Strategies
  4. Practice

From Build to Production

In Modules 4–6, you mastered building code, running tests, and encapsulating pipeline logic into reusable templates. But here's the question those modules didn't answer: how does that tested artifact actually reach your users?

The journey from code to production is a series of gates:

  1. Build — Compile, package, produce an artifact
  2. Test — Unit tests, integration tests, security scans
  3. Deploy (Staging) — Deploy to a non-production environment
  4. Approve — Human or automated verification
  5. Deploy (Production) — Ship to real users

A multi-stage pipeline encodes this entire journey in a single YAML file. Every stage, approval, and deployment strategy is version-controlled alongside your application code — full traceability from commit to production.

Analogy: A multi-stage pipeline is like an airport journey: check-in (build), security screening (test), boarding (staging), clearance from the tower (approval), and takeoff (production). Each gate must be passed before proceeding. You can't board without a boarding pass, and you can't take off without clearance from the tower.
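
As a rough sketch of how the five steps above map onto YAML, the skeleton below chains three stages with dependsOn. The stage names, job names, and echo steps are placeholders rather than a real project. The approval step does not appear as YAML because it is a check attached to the production environment.

```yaml
# Skeleton only: names and echo steps are illustrative placeholders.
stages:
  - stage: Build                      # steps 1-2: build and test
    jobs:
      - job: BuildAndTest
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - script: echo "Compile, run tests, publish an artifact"

  - stage: DeployStaging              # step 3: deploy to staging
    dependsOn: Build
    jobs:
      - deployment: deployStaging
        pool:
          vmImage: 'ubuntu-latest'
        environment: 'staging'
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "Deploy the artifact to staging"

  - stage: DeployProduction           # steps 4-5: approve, then deploy
    dependsOn: DeployStaging
    jobs:
      - deployment: deployProduction
        pool:
          vmImage: 'ubuntu-latest'
        # Step 4 (approval) is a check on this environment, set in the portal
        environment: 'production'
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "Deploy the same artifact to production"
```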

Why Multi-Stage Matters

Benefit | Explanation
Single file, full picture | Build + deploy in one YAML, with no disconnected "release pipelines"
Traceability | Every production deployment links to the exact commit and test run
Gated promotions | Approvals and automated gates prevent bad code from reaching users
Rollback clarity | Re-run a previous successful pipeline to restore a known-good state
Environment parity | Same deployment steps for dev, staging, and production, so no "works on staging" surprises

For deployment strategy concepts (the theory behind rolling, blue-green, and canary), see Part 16: Deployment Strategies. This module focuses on implementing those strategies in Azure Pipelines YAML.

Environments

An environment in Azure DevOps is a named deployment target — a logical grouping that represents where your application runs. Think of environments as "landing pads" that can have policies, approvals, and deployment history attached to them.

Creating Environments

Environments are created in Pipelines → Environments in the Azure DevOps portal (or automatically the first time a pipeline references one). Common naming:

  • dev — Development (auto-deploy on every commit)
  • staging — Pre-production (mirrors production config)
  • production — Live users (requires approval)

Environment Resource Types

Resource Type | What It Targets | Use Case
Kubernetes | AKS or any K8s cluster | Container deployments with canary/rolling
Virtual Machine | VMs registered as resources | Rolling deployments to VM farms
None (generic) | Logical target only | App Service, Functions, or any cloud PaaS
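
The three resource types are selected through the environment property of a deployment job. The sketch below shows one job per type. The environment names, the 'web' tag, and the 'myapp-namespace' resource are hypothetical and would need to exist in your project.

```yaml
# Sketch: environment, resource, and tag names are hypothetical.
jobs:
  # Generic environment: a logical target with no registered resources
  - deployment: deployPaas
    pool:
      vmImage: 'ubuntu-latest'
    environment: 'staging'
    strategy:
      runOnce:
        deploy:
          steps:
            - script: echo "Deploy to App Service or Functions"

  # Virtual machine resources, narrowed to VMs tagged 'web'
  - deployment: deployVms
    environment:
      name: 'production-vms'
      resourceType: VirtualMachine
      tags: web
    strategy:
      rolling:
        maxParallel: 2
        deploy:
          steps:
            - script: echo "Runs on each matching VM"

  # One Kubernetes resource (a namespace) inside the environment
  - deployment: deployK8s
    pool:
      vmImage: 'ubuntu-latest'
    environment:
      name: 'production-k8s'
      resourceName: 'myapp-namespace'
      resourceType: Kubernetes
    strategy:
      runOnce:
        deploy:
          steps:
            - script: echo "Deploy manifests to the namespace"
```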

Referencing Environments in YAML

# Referencing an environment in a deployment job
# If the environment does not exist yet, Azure Pipelines creates it on the first run

stages:
  - stage: DeployStaging
    displayName: 'Deploy → Staging'
    jobs:
      - deployment: deployApp
        displayName: 'Deploy to Staging'
        pool:
          vmImage: 'ubuntu-latest'
        # This is the key line — connects the job to an environment
        environment: 'staging'
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "Deploying to staging environment"

  - stage: DeployProduction
    displayName: 'Deploy → Production'
    dependsOn: DeployStaging
    jobs:
      - deployment: deployApp
        displayName: 'Deploy to Production'
        pool:
          vmImage: 'ubuntu-latest'
        # Production environment — approvals configured in the portal
        environment: 'production'
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "Deploying to production environment"

Every deployment to an environment is recorded in its deployment history — you can see which pipeline run deployed which commit, when, and whether it succeeded. This is invaluable for auditing and incident response.

Approvals & Gates

Environments become powerful when you attach checks to them. These checks must pass before a deployment job targeting that environment can proceed.

Manual Approvals

Configure in the portal: Environment → ⋮ → Approvals and checks → Add check → Approvals. Key settings:

  • Approvers: One or more users/groups (e.g., "Release Managers")
  • Minimum approvals: Require 2-of-3 for consensus decisions
  • Timeout: How long to wait before auto-rejecting (default: 30 days)
  • Allow self-approval: Whether the person who triggered the pipeline can approve

Automated Gates

Gates are automated checks that run periodically until they pass (or time out). They're perfect for "zero-touch" quality gates:

Gate Type | What It Checks | Example
Invoke REST API | Custom HTTP endpoint returns success | Call your health-check API, verify 200 OK
Azure Monitor alerts | No active alerts in a resource | Zero P1/P2 alerts in the staging App Service
Query work items | Work item query returns expected results | No open "deployment-blocker" bugs
Required template | Pipeline uses a specific YAML template | Ensures governance templates are applied
Business hours | Current time is within allowed window | Only deploy Mon–Fri, 9am–4pm

Gate evaluation: Gates run periodically (default: every 5 minutes) until they pass or time out. Use them for automatic validation like "zero P1 alerts in Azure Monitor" before promoting to production. The re-evaluation interval and timeout are configurable per check.

Exclusive Locks

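An exclusive lock check ensures only one pipeline run deploys to an environment at a time. If Pipeline Run #42 is deploying to production, Run #43 waits until the lock is released (whether older waiting runs stay queued or are canceled depends on the pipeline's lockBehavior setting). This prevents conflicting deployments from corrupting state.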
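
The lock itself is added in the portal, but how waiting runs are treated is set in YAML through lockBehavior. As far as the documented schema goes, runLatest (the default) lets only the newest waiting run take the lock, while sequential lets every run take its turn. A minimal sketch, assuming the 'production' environment already has an Exclusive Lock check:

```yaml
# Sketch: lockBehavior only matters for resources with an Exclusive Lock check.
lockBehavior: sequential   # alternative: runLatest (the default)

stages:
  - stage: DeployProduction
    jobs:
      - deployment: deployApp
        pool:
          vmImage: 'ubuntu-latest'
        environment: 'production'   # assumed to have an Exclusive Lock check
        strategy:
          runOnce:
            deploy:
              steps:
                - script: echo "Only one run at a time reaches this step"
```

Choose sequential when every commit must be deployed in order (for example, ordered database migrations), and runLatest when only the newest build matters.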

Approval & Gate Flow
flowchart TD
    A[Staging deployment succeeds] --> B[Pipeline requests Production environment]
    B --> E{Automated gates pass?}
    E -->|No| F[⏳ Re-evaluate in 5 min]
    F --> E
    E -->|Yes| G{Manual approval granted?}
    G -->|Rejected| H[❌ Deployment blocked]
    G -->|Timeout| H
    G -->|Approved| C{Exclusive lock available?}
    C -->|No| D[⏳ Queued, waiting for lock]
    D --> C
    C -->|Yes| I[✅ Deployment proceeds]
    I --> J[Production deployment runs]

Service Connections

A service connection is an authenticated link between Azure DevOps and an external service. It stores credentials securely and makes them available to pipeline tasks without exposing secrets in YAML.

Common Service Connection Types

Type Connects To Used For
Azure Resource Manager Azure subscription Deploying to App Service, AKS, Functions, VMs
Docker Registry ACR, Docker Hub, etc. Pushing/pulling container images
Kubernetes Any K8s cluster kubectl deployments, Helm releases
npm / NuGet Package registries Publishing packages
SSH Remote servers SCP file transfers, remote script execution

ARM Connection Authentication Methods

  • Service Principal (secret): App registration + client secret (expires, needs rotation)
  • Service Principal (certificate): More secure than secrets, still requires rotation
  • Managed Identity: For self-hosted agents running on Azure VMs (no credentials stored)
  • Workload Identity Federation (OIDC): The modern approach. Azure trusts the pipeline's identity token directly, with no secrets at all

Security recommendation: Prefer Workload Identity Federation over service principal secrets. It eliminates credential rotation, reduces secret exposure risk, and is the recommended approach for Azure deployments. With WIF, Azure DevOps presents an OIDC token, and Microsoft Entra ID (formerly Azure AD) validates it without any stored credential.
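
From the pipeline's point of view, the authentication method is invisible: a task names the service connection and the sign-in happens behind it. The sketch below assumes a connection called 'Production-Azure-WIF' (a hypothetical name) that was created with Workload Identity Federation, and simply confirms which subscription the task is signed in to.

```yaml
# Sketch: 'Production-Azure-WIF' is a hypothetical service connection name.
steps:
  - task: AzureCLI@2
    displayName: 'Verify Azure login via service connection'
    inputs:
      azureSubscription: 'Production-Azure-WIF'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        # The task signs in before this script runs; no secret appears in YAML
        az account show --query name --output tsv
```

Switching a connection from a client secret to WIF should not require changing YAML like this, since only the connection's configuration changes.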

Using Service Connections in Tasks

# Service connections are referenced by NAME in task inputs
# They're configured in Project Settings → Service connections

steps:
  # Deploy to Azure App Service using an ARM service connection
  - task: AzureWebApp@1
    displayName: 'Deploy to App Service'
    inputs:
      # This name must match the service connection configured in the portal
      azureSubscription: 'Production-Azure-Connection'
      appType: 'webAppLinux'
      appName: 'myapp-production'
      package: '$(Pipeline.Workspace)/drop/**/*.zip'

  # Push a container image using a Docker Registry connection
  - task: Docker@2
    displayName: 'Push image to ACR'
    inputs:
      containerRegistry: 'MyACR-Connection'
      repository: 'myapp'
      command: 'push'
      tags: '$(Build.BuildId)'

Scoping Service Connections

Whether other pipelines can use a service connection depends on how it was created: if "Grant access permission to all pipelines" was selected, any pipeline in the project can use it. For production connections, restrict access: Service connection → ⋮ → Security → Pipeline permissions. Grant access only to specific pipelines to prevent unauthorized deployments.

Deployment Jobs

A deployment job (deployment:) is different from a regular job (job:) in several important ways:

Feature | Regular Job (job:) | Deployment Job (deployment:)
Environment targeting | ❌ Not supported | ✅ Links to an environment
Deployment strategies | ❌ Not available | ✅ runOnce, rolling, canary
Lifecycle hooks | ❌ Just steps | ✅ preDeploy, deploy, routeTraffic, etc.
Deployment history | ❌ Not tracked | ✅ Recorded per environment
Approval checks | ❌ Not applicable | ✅ Waits for environment approvals

Lifecycle Hooks

Deployment strategies expose lifecycle hooks — named phases where you insert custom steps:

  • preDeploy — Run before deployment starts (e.g., send Slack notification)
  • deploy — The actual deployment steps
  • routeTraffic — Shift traffic to the new version (canary/blue-green)
  • postRouteTraffic — Run after traffic is routed (e.g., smoke tests)
  • on.failure — Runs only if deployment fails (e.g., rollback, alert)
  • on.success — Runs only if deployment succeeds (e.g., tag release)

runOnce Strategy

# The simplest deployment strategy — deploy once and done
# Good for: App Service, Functions, static sites, database migrations

stages:
  - stage: DeployStaging
    jobs:
      - deployment: deployWebApp
        displayName: 'Deploy Web App to Staging'
        pool:
          vmImage: 'ubuntu-latest'
        environment: 'staging'
        strategy:
          runOnce:
            preDeploy:
              steps:
                - script: echo "📦 Starting deployment to staging..."
                  displayName: 'Pre-deploy notification'
            deploy:
              steps:
                - download: current
                  artifact: 'webapp'
                - task: AzureWebApp@1
                  displayName: 'Deploy to App Service'
                  inputs:
                    azureSubscription: 'Azure-Staging'
                    appName: 'myapp-staging'
                    package: '$(Pipeline.Workspace)/webapp/**/*.zip'
            postRouteTraffic:
              steps:
                - script: |
                    # Run smoke tests against the deployed app
                    curl -f https://myapp-staging.azurewebsites.net/health
                  displayName: 'Smoke test'
            on:
              failure:
                steps:
                  - script: echo "❌ Deployment failed! Alerting team..."
                    displayName: 'Failure notification'
              success:
                steps:
                  - script: echo "✅ Staging deployment successful!"
                    displayName: 'Success notification'

Rolling Deployments

A rolling deployment updates instances incrementally rather than all at once. If you have 10 VMs, a rolling deployment with maxParallel: 2 updates 2 VMs at a time while the other 8 continue serving traffic. If a batch fails, the remaining VMs stay on the old version.

When to Use Rolling

  • Deploying to a pool of VMs or containers
  • You need zero-downtime but don't need instant rollback
  • Your application can handle mixed versions briefly

# Rolling deployment to a VM environment
# Requires VMs registered as resources in the 'production-vms' environment
# Each VM runs the deployment steps sequentially in batches

stages:
  - stage: RollingDeploy
    displayName: 'Rolling Deploy to Production VMs'
    jobs:
      - deployment: rollout
        displayName: 'Rolling update'
        pool:
          vmImage: 'ubuntu-latest'
        environment:
          name: 'production-vms'
          resourceType: VirtualMachine
        strategy:
          rolling:
            # Deploy to 2 VMs at a time (out of 10 total)
            maxParallel: 2
            preDeploy:
              steps:
                - script: echo "Preparing VM $(Agent.MachineName) for update"
                  displayName: 'Drain traffic from this VM'
            deploy:
              steps:
                - script: |
                    # Stop the application service
                    sudo systemctl stop myapp
                    # Deploy new version
                    sudo cp -r $(Pipeline.Workspace)/drop/* /opt/myapp/
                    # Start the updated service
                    sudo systemctl start myapp
                  displayName: 'Deploy new version'
            routeTraffic:
              steps:
                - script: echo "Re-enabling traffic to $(Agent.MachineName)"
                  displayName: 'Restore traffic'
            postRouteTraffic:
              steps:
                - script: |
                    # Verify the VM is healthy before moving to the next batch
                    curl -f http://localhost:8080/health || exit 1
                  displayName: 'Health check'
            on:
              failure:
                steps:
                  - script: echo "❌ Rolling update failed on $(Agent.MachineName)"
                    displayName: 'Alert on failure'

The key insight: if the health check in postRouteTraffic fails on any batch, the rolling deployment stops. The remaining VMs keep running the old version, limiting blast radius.
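
The batch size does not have to be a fixed count. A minimal variation on the job above, assuming the same 'production-vms' environment with 10 registered VMs, expresses maxParallel as a percentage so the batch scales with the size of the farm:

```yaml
# Sketch: same job shape as above, with the batch size as a percentage.
jobs:
  - deployment: rollout
    environment:
      name: 'production-vms'
      resourceType: VirtualMachine
    strategy:
      rolling:
        maxParallel: 20%        # 2 of 10 VMs per batch
        deploy:
          steps:
            - script: echo "Updating $(Agent.MachineName)"
        postRouteTraffic:
          steps:
            # A non-zero exit here stops the rollout before the next batch
            - script: curl -f http://localhost:8080/health
              displayName: 'Health check'
```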

Blue-Green Deployments

A blue-green deployment maintains two identical environments — "blue" (current production) and "green" (new version). You deploy to green, validate it, then instantly swap traffic. If something goes wrong, swap back in seconds.

Azure App Service Deployment Slots

Azure App Service deployment slots are the perfect implementation of blue-green. Every App Service has a "production" slot (blue) by default. You create a "staging" slot (green), deploy there, validate, and swap:

Blue-Green Swap with App Service Slots
flowchart LR
    subgraph Before Swap
        A[Users] -->|Traffic| B[Production Slot - v1.0 BLUE]
        C[Staging Slot - v2.0 GREEN] -.->|Test traffic| D[QA Team]
    end

    subgraph After Swap
        E[Users] -->|Traffic| F[Production Slot - v2.0 GREEN]
        G[Staging Slot - v1.0 BLUE] -.->|Rollback ready| H[Instant swap back]
    end

# Blue-green deployment using Azure App Service deployment slots
# The staging slot receives the new version, then we swap to production

stages:
  - stage: Build
    jobs:
      - job: BuildApp
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - task: DotNetCoreCLI@2
            inputs:
              command: 'publish'
              publishWebProjects: true
              arguments: '--configuration Release --output $(Build.ArtifactStagingDirectory)'
          - publish: $(Build.ArtifactStagingDirectory)
            artifact: 'drop'

  - stage: DeployGreen
    displayName: 'Deploy to Staging Slot (Green)'
    dependsOn: Build
    jobs:
      - deployment: deployGreen
        pool:
          vmImage: 'ubuntu-latest'
        environment: 'production'
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: 'drop'
                # Deploy to the STAGING slot (green) — not production yet
                - task: AzureWebApp@1
                  displayName: 'Deploy to staging slot'
                  inputs:
                    azureSubscription: 'Production-Azure'
                    appName: 'myapp-prod'
                    deployToSlotOrASE: true
                    resourceGroupName: 'myapp-rg'  # assumed name; required for slot deployments
                    slotName: 'staging'
                    package: '$(Pipeline.Workspace)/drop/**/*.zip'
                # Validate the staging slot is healthy
                - script: |
                    curl -f https://myapp-prod-staging.azurewebsites.net/health
                  displayName: 'Validate staging slot'

  - stage: SwapToProduction
    displayName: 'Swap Slots (Blue ↔ Green)'
    dependsOn: DeployGreen
    jobs:
      - deployment: swapSlots
        pool:
          vmImage: 'ubuntu-latest'
        environment: 'production'
        strategy:
          runOnce:
            deploy:
              steps:
                # Instant swap — staging becomes production, production becomes staging
                - task: AzureAppServiceManage@0
                  displayName: 'Swap staging ↔ production'
                  inputs:
                    azureSubscription: 'Production-Azure'
                    action: 'Swap Slots'
                    webAppName: 'myapp-prod'
                    resourceGroupName: 'myapp-rg'  # assumed name; required for Swap Slots
                    sourceSlot: 'staging'
                    targetSlot: 'production'

Rollback: If monitoring reveals issues after the swap, just swap again — the old version is still running in the staging slot, warm and ready.
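
One way to script that swap-back is a small step that calls the Azure CLI. This is a sketch under assumptions: the resource group 'myapp-rg' is a hypothetical name, and the service connection and app names are reused from the example above.

```yaml
# Sketch: 'myapp-rg' is an assumed resource group name.
steps:
  - task: AzureCLI@2
    displayName: 'Swap staging and production back'
    inputs:
      azureSubscription: 'Production-Azure'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        # Swapping again puts the previous version back in the production slot
        az webapp deployment slot swap \
          --resource-group myapp-rg \
          --name myapp-prod \
          --slot staging \
          --target-slot production
```

Keeping this in a separate, manually triggered pipeline means a rollback does not depend on the original run still being available.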

Canary Deployments

A canary deployment routes a small percentage of traffic to the new version while monitoring for errors. If metrics look good, gradually increase traffic until the new version serves 100%. If metrics degrade, route all traffic back to the old version.

Canary Strategy in Azure Pipelines

# Canary deployment — progressively shift traffic to the new version
# Uses the canary strategy with lifecycle hooks for traffic management

stages:
  - stage: CanaryDeploy
    displayName: 'Canary Deployment to Production'
    jobs:
      - deployment: canaryRelease
        pool:
          vmImage: 'ubuntu-latest'
        environment:
          name: 'production-k8s'
          resourceType: Kubernetes
        strategy:
          canary:
            # Start with 10% of traffic to the new version
            increments: [10, 25, 50, 100]
            preDeploy:
              steps:
                - script: echo "Starting canary deployment — initial: 10% traffic"
                  displayName: 'Canary start notification'
            deploy:
              steps:
                # Deployment jobs do not check out the repo by default; the manifest lives there
                - checkout: self
                - script: |
                    # Deploy the canary version alongside stable
                    echo "Deploying canary with $(Strategy.Increment)% traffic"
                    kubectl apply -f k8s/canary-deployment.yml
                  displayName: 'Deploy canary version'
            routeTraffic:
              steps:
                - script: |
                    # Update traffic split to current increment percentage
                    echo "Routing $(Strategy.Increment)% traffic to canary"
                    kubectl patch virtualservice myapp \
                      --type merge -p \
                      '{"spec":{"http":[{"route":[{"destination":{"host":"myapp-stable"},"weight":'$((100-$(Strategy.Increment)))'},{"destination":{"host":"myapp-canary"},"weight":$(Strategy.Increment)}]}]}}'
                  displayName: 'Route traffic ($(Strategy.Increment)%)'
            postRouteTraffic:
              steps:
                - script: |
                    # Monitor error rate for 5 minutes at this traffic level
                    echo "Monitoring canary at $(Strategy.Increment)% for 5 minutes..."
                    sleep 300
                    # Check error rate (simplified — use Azure Monitor in practice)
                    ERROR_RATE=$(curl -s http://metrics-server/error-rate)
                    if [ "$ERROR_RATE" -gt "5" ]; then
                      echo "Error rate too high: ${ERROR_RATE}%"
                      exit 1  # Triggers on.failure → rollback
                    fi
                  displayName: 'Monitor canary health'
            on:
              failure:
                steps:
                  # The hook may run on a fresh agent, so fetch the manifest again
                  - checkout: self
                  - script: |
                      echo "❌ Canary failed! Rolling back to stable..."
                      kubectl delete -f k8s/canary-deployment.yml
                    displayName: 'Rollback canary'
              success:
                steps:
                  - script: echo "✅ Canary promoted to 100% — deployment complete"
                    displayName: 'Canary promotion success'

The increments: [10, 25, 50, 100] array defines the traffic progression. At each step, postRouteTraffic validates health before proceeding to the next percentage. If any check fails, the on.failure hook fires and rolls back.
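
The example above drives traffic with kubectl and an Istio VirtualService. An alternative worth knowing is the built-in KubernetesManifest task, which accepts the current increment directly. The sketch below is not the method used above: it assumes a Kubernetes resource named 'myapp' is registered in the environment (so the task can pick up the cluster connection from it), and the namespace and manifest path are hypothetical. Without a service mesh, the task sizes the canary by pod count rather than by editing routes.

```yaml
# Sketch: resource name, namespace, and manifest path are hypothetical.
jobs:
  - deployment: canaryRelease
    pool:
      vmImage: 'ubuntu-latest'
    environment:
      name: 'production-k8s'
      resourceName: 'myapp'          # Kubernetes resource registered in the environment
      resourceType: Kubernetes
    strategy:
      canary:
        increments: [10, 25, 50]
        deploy:
          steps:
            # Deployment jobs do not check out the repo by default
            - checkout: self
            - task: KubernetesManifest@0
              displayName: 'Deploy canary at $(strategy.increment)%'
              inputs:
                action: 'deploy'
                strategy: 'canary'
                percentage: '$(strategy.increment)'
                namespace: 'myapp'
                manifests: 'k8s/deployment.yml'
```

The same task also offers promote and reject actions, which fit naturally into the on.success and on.failure hooks.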

Ring Deployments

A ring deployment expands the audience progressively through named rings — from internal teams to beta users to the global population. Unlike canary (which is traffic-percentage-based), rings are audience-based.

Typical Ring Structure

Ring | Audience | Purpose | Duration
Ring 0 | Internal team (10 people) | "Eat your own dog food": catch obvious issues | 1 day
Ring 1 | Early adopters (1,000 users) | Validate with real usage patterns | 2–3 days
Ring 2 | All users (100,000+) | General availability | Permanent

# Ring deployment pipeline: 3 rings with approvals between each
# Each ring targets a different environment or feature flag audience

trigger:
  branches:
    include: [main]

stages:
  - stage: Build
    jobs:
      - job: BuildAndTest
        pool:
          vmImage: 'ubuntu-latest'
        steps:
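          - script: dotnet build --configuration Release
          # --configuration must match the build, or --no-build finds no binaries
          - script: dotnet test --configuration Release --no-build
          # Publish into the staging directory so the artifact is not empty
          - script: dotnet publish --configuration Release --no-build --output $(Build.ArtifactStagingDirectory)
          - publish: $(Build.ArtifactStagingDirectory)
            artifact: 'app'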

  # Ring 0: Internal team
  - stage: Ring0_Internal
    displayName: 'Ring 0 — Internal Team'
    dependsOn: Build
    jobs:
      - deployment: deployRing0
        pool:
          vmImage: 'ubuntu-latest'
        environment: 'ring0-internal'
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: 'app'
                - script: |
                    echo "Deploying to internal ring (Ring 0)"
                    echo "Enabling feature flag for internal users only"
                    # Deploy to internal-only App Service or set feature flag
                  displayName: 'Deploy to Ring 0'

  # Ring 1: Early adopters (requires approval after Ring 0 bakes)
  - stage: Ring1_EarlyAdopters
    displayName: 'Ring 1 — Early Adopters'
    dependsOn: Ring0_Internal
    jobs:
      - deployment: deployRing1
        pool:
          vmImage: 'ubuntu-latest'
        # This environment has a 24-hour manual approval gate
        environment: 'ring1-early-adopters'
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: 'app'
                - script: |
                    echo "Expanding to early adopters (Ring 1 — ~1,000 users)"
                    echo "Updating feature flag to include beta users"
                  displayName: 'Deploy to Ring 1'

  # Ring 2: Global rollout (requires approval + Azure Monitor gate)
  - stage: Ring2_Global
    displayName: 'Ring 2 — Global Rollout'
    dependsOn: Ring1_EarlyAdopters
    jobs:
      - deployment: deployRing2
        pool:
          vmImage: 'ubuntu-latest'
        # This environment has both approval AND Azure Monitor gate
        environment: 'ring2-production'
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: 'app'
                - script: |
                    echo "Global rollout (Ring 2 — all users)"
                    echo "Removing feature flag — new version is now GA"
                  displayName: 'Deploy to Ring 2 (Global)'

The key difference from canary: each ring is a distinct audience, modeled here as its own environment with its own approval policy. Ring 0 auto-deploys. Ring 1 requires team lead approval after Ring 0 bakes for 24 hours. Ring 2 requires both VP approval and an Azure Monitor gate confirming zero critical alerts from Ring 1.
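
The bake time can also be enforced by the pipeline itself rather than left to the approver's judgment. One possible sketch is an agentless stage between Ring 0 and Ring 1 that uses the Delay task. The stage name Ring0_Bake is hypothetical, and Ring1_EarlyAdopters would need its dependsOn changed to point at it.

```yaml
# Sketch: slots in between the Ring 0 and Ring 1 stages of the pipeline above.
stages:
  - stage: Ring0_Bake
    displayName: 'Ring 0 bake time'
    dependsOn: Ring0_Internal
    jobs:
      - job: waitForBake
        pool: server               # agentless job: no agent is held while waiting
        timeoutInMinutes: 1500     # must exceed the delay (24 h = 1440 min)
        steps:
          - task: Delay@1
            displayName: 'Bake for 24 hours'
            inputs:
              delayForMinutes: '1440'
```

For a practice run, a delay of a few minutes is enough to see the stage ordering work.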

Complete Multi-Stage Pipeline

This is the culmination of Modules 4–7 — a complete, production-grade pipeline that builds, tests, deploys to staging with blue-green, and promotes to production with approvals:

# Complete multi-stage pipeline: Build → Test → Staging (Blue-Green) → Production
# This pipeline demonstrates everything from Modules 4-7 working together

trigger:
  branches:
    include: [main]
  paths:
    exclude: ['docs/**', '*.md']

variables:
  - group: 'app-settings'                # Variable group for shared config
  - name: buildConfiguration
    value: 'Release'
  - name: appName
    value: 'contoso-webapp'

# ─── STAGE 1: BUILD & TEST ───────────────────────────────────
stages:
  - stage: Build
    displayName: 'Build & Test'
    jobs:
      - job: BuildApp
        pool:
          vmImage: 'ubuntu-latest'
        steps:
          - task: UseDotNet@2
            inputs:
              version: '8.x'

          - script: dotnet restore
            displayName: 'Restore dependencies'

          - script: dotnet build --configuration $(buildConfiguration) --no-restore
            displayName: 'Build application'

          - script: dotnet test --no-build --configuration $(buildConfiguration) --logger trx
            displayName: 'Run unit tests'

          - task: PublishTestResults@2
            inputs:
              testResultsFormat: 'VSTest'
              testResultsFiles: '**/*.trx'

          - script: dotnet publish --configuration $(buildConfiguration) --output $(Build.ArtifactStagingDirectory)
            displayName: 'Publish application'

          - publish: $(Build.ArtifactStagingDirectory)
            artifact: 'drop'
            displayName: 'Upload artifact'

# ─── STAGE 2: DEPLOY TO STAGING (BLUE-GREEN) ─────────────────
  - stage: DeployStaging
    displayName: 'Deploy → Staging (Green Slot)'
    dependsOn: Build
    condition: succeeded()
    jobs:
      - deployment: deployToStaging
        pool:
          vmImage: 'ubuntu-latest'
        environment: 'staging'
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: 'drop'

                # Deploy to staging slot (green)
                - task: AzureWebApp@1
                  displayName: 'Deploy to staging slot'
                  inputs:
                    azureSubscription: 'Azure-Production'
                    appName: '$(appName)'
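                    deployToSlotOrASE: true
                    resourceGroupName: 'contoso-rg'  # assumed name; required for slot deployments
                    slotName: 'staging'
                    # dotnet publish writes a folder, not a .zip, so deploy the folder itself
                    package: '$(Pipeline.Workspace)/drop'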

                # Validate staging slot health
                - script: |
                    echo "Waiting 30s for app to warm up..."
                    sleep 30
                    HTTP_STATUS=$(curl -o /dev/null -s -w "%{http_code}" \
                      https://$(appName)-staging.azurewebsites.net/health)
                    if [ "$HTTP_STATUS" -ne 200 ]; then
                      echo "❌ Health check failed: HTTP $HTTP_STATUS"
                      exit 1
                    fi
                    echo "✅ Staging slot healthy (HTTP $HTTP_STATUS)"
                  displayName: 'Health check staging slot'

# ─── STAGE 3: SWAP TO PRODUCTION ──────────────────────────────
  - stage: DeployProduction
    displayName: 'Deploy → Production (Swap Slots)'
    dependsOn: DeployStaging
    condition: succeeded()
    jobs:
      - deployment: swapToProduction
        pool:
          vmImage: 'ubuntu-latest'
        # Production environment has: manual approval + business hours gate
        environment: 'production'
        strategy:
          runOnce:
            deploy:
              steps:
                # Swap staging ↔ production (instant blue-green cutover)
                - task: AzureAppServiceManage@0
                  displayName: 'Swap slots: staging → production'
                  inputs:
                    azureSubscription: 'Azure-Production'
                    action: 'Swap Slots'
                    webAppName: '$(appName)'
                    resourceGroupName: 'contoso-rg'  # assumed name; required for Swap Slots
                    sourceSlot: 'staging'
                    targetSlot: 'production'

                # Post-swap validation
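                - script: |
                    HTTP_STATUS=$(curl -o /dev/null -s -w "%{http_code}" \
                      https://$(appName).azurewebsites.net/health)
                    echo "Production health: HTTP $HTTP_STATUS"
                    # Fail the job on a bad status so the on.failure rollback runs
                    if [ "$HTTP_STATUS" -ne 200 ]; then
                      exit 1
                    fi
                  displayName: 'Post-swap health check'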
            on:
              failure:
                steps:
                  # Emergency rollback — swap back
                  - task: AzureAppServiceManage@0
                    displayName: '🚨 ROLLBACK — Swap slots back'
                    inputs:
                      azureSubscription: 'Azure-Production'
                      action: 'Swap Slots'
                      webAppName: '$(appName)'
                      resourceGroupName: 'contoso-rg'  # assumed name; required for Swap Slots
                      sourceSlot: 'staging'
                      targetSlot: 'production'

Key design decisions in this pipeline:
  • Build once, deploy the same artifact everywhere (no "works on staging" bugs)
  • Blue-green via App Service slots: instant cutover, instant rollback
  • Production environment has approval + business hours gate (configured in portal)
  • The on.failure hook swaps the slots back when a step in the production job fails. Because it also fires if the swap task itself fails, confirm which version is live before treating it as a fully automatic rollback

Case Study: SaaS Product Release Pipeline

Case Study SaaS — 50 microservices, AKS, ring deployment

Deploying 50 Microservices Safely to Production

Challenge: A B2B SaaS company with 50 microservices on Azure Kubernetes Service (AKS) needed to ship features daily while maintaining 99.95% uptime for enterprise customers. Their previous approach — manual deployments every Thursday — created "big bang" releases with frequent rollbacks.

Solution: Ring Deployment with Canary Analysis

Architecture:

  • Each microservice has its own pipeline (triggered by changes to its path)
  • Shared deployment template from a central platform-templates repo (Module 6)
  • Three-ring deployment: Internal → Beta Customers → All Customers

Ring 0 (Internal — 50 engineers):

  • Auto-deploys on merge to main
  • AKS namespace: ring0-internal
  • Feature flags route internal users to Ring 0 pods
  • Bake time: 4 hours minimum

Ring 1 (Beta — 500 customers who opt-in):

  • Requires: Ring 0 passes + automated Azure Monitor gate (error rate < 0.1%)
  • Canary traffic split: 10% → 25% → 50% → 100% of beta users
  • Each increment monitors for 1 hour before proceeding
  • Auto-rollback if error rate exceeds 1% at any increment

Ring 2 (Global — all 10,000+ customers):

  • Requires: Ring 1 stable for 24 hours + Release Manager approval + zero P1 alerts
  • Uses blue-green with AKS virtual services (Istio)
  • Business hours restriction: Mon–Thu, 9am–3pm (never Friday)

Results After 6 Months
  • Deployment frequency: Thursday-only → multiple times per day
  • Failed deployments reaching customers: 12/quarter → 0/quarter
  • Mean time to recovery (MTTR): 45 minutes → under 2 minutes (auto-rollback)
  • Uptime: 99.91% → 99.98% (exceeded SLA)
  • Developer satisfaction: "We ship features confidently now, not nervously"

Exercises

Exercise 1 Difficulty: Intermediate

Three-Stage Pipeline with Production Approval

Goal: Build a multi-stage pipeline that deploys through staging → production with a manual approval gate on production.

  1. Create a simple web application (any language).
  2. Write a 3-stage YAML pipeline: Build, Deploy-Staging, Deploy-Production.
  3. Create environments named staging and production in Azure DevOps.
  4. Add a manual approval check to the production environment (your own account as approver).
  5. Run the pipeline and verify it pauses at the production stage waiting for approval.

Success criteria: Pipeline runs Build and Staging automatically, then waits for your manual approval before deploying to Production.

Exercise 2 Difficulty: Intermediate

Blue-Green Deployment with App Service Slots

Goal: Implement a blue-green deployment pattern using Azure App Service deployment slots.

  1. Create an Azure App Service with a "staging" deployment slot.
  2. Create a Workload Identity Federation service connection.
  3. Write a pipeline that deploys to the staging slot, runs a health check, then swaps.
  4. Intentionally deploy a broken version and verify you can swap back to restore service.

Success criteria: You can deploy, validate, swap to production, and roll back within 60 seconds if needed.

Exercise 3 Difficulty: Advanced

Automated Gates with Azure Monitor

Goal: Configure automated gates that check Azure Monitor before promoting to production.

  1. Deploy an application with Application Insights enabled.
  2. Create an Azure Monitor alert rule that fires when error rate exceeds 5%.
  3. Add an "Azure Monitor alerts" gate to your production environment.
  4. Trigger the pipeline and verify the gate passes when no alerts are active.
  5. Generate errors (hit a broken endpoint), verify the alert fires and the gate blocks deployment.

Success criteria: Deployment to production is automatically blocked when Azure Monitor detects elevated error rates, and proceeds when alerts clear.

Exercise 4 Difficulty: Advanced

Ring Deployment Pipeline with Canary Analysis

Goal: Build a 3-ring deployment pipeline that progressively rolls out to wider audiences with automated health monitoring.

  1. Create three environments: ring0-internal, ring1-beta, ring2-production.
  2. Ring 0 deploys automatically on merge to main.
  3. Ring 1 requires Ring 0 success + 4-hour bake time (simulate with a short delay).
  4. Ring 2 requires Ring 1 success + manual approval + Azure Monitor gate.
  5. Add a REST API gate between Ring 1 and Ring 2 that calls a custom health endpoint.

Success criteria: Code flows through all three rings automatically (except for the manual approval at Ring 2), with health checks at each stage preventing bad deployments from expanding.

Next in the Bootcamp

In Module 8: Azure Test Plans, we'll cover manual and exploratory testing, test suites and configurations, test case management, capturing feedback from stakeholders, and integrating test results with your deployment gates.