Part 10: Infrastructure Security

May 14, 2026 Wasil Zafar 50 min read

Master infrastructure security from identity and access management to network security, secrets management, compliance frameworks, and zero-trust architecture — protecting your cloud environments at every layer.

Table of Contents

  1. Why Infrastructure Security Matters
  2. Identity & Access Management
  3. Network Security
  4. Secrets Management
  5. Encryption
  6. Zero-Trust Architecture
  7. Compliance & Governance
  8. Security Automation
  9. Hands-On Exercises
  10. Conclusion & Next Steps

Why Infrastructure Security Matters

In Part 9, we mastered Terraform fundamentals for provisioning infrastructure. But deploying infrastructure without security is like building a house without locks — everything you create is exposed. Infrastructure security is the discipline of protecting your cloud resources, data, and workloads from unauthorized access, data breaches, and service disruption.

A single misconfigured S3 bucket, an overly permissive IAM role, or a leaked API key can lead to a breach that costs millions. IBM's 2023 Cost of a Data Breach Report put the global average cost of a breach at $4.45 million. Infrastructure security isn't optional; it's fundamental.

Infrastructure Security encompasses the policies, practices, and tools that protect cloud resources from unauthorized access, data exfiltration, service disruption, and compliance violations. It operates at every layer — from identity and network to data and application — following the principle of defense in depth.

The Shared Responsibility Model

As we discussed in Part 7 (Cloud Fundamentals), cloud security follows a shared responsibility model. The cloud provider secures the infrastructure of the cloud, while you secure what you put in the cloud:

| Layer | IaaS (You Manage) | PaaS (Shared) | SaaS (Provider Manages) |
| --- | --- | --- | --- |
| Data & Access | You | You | You |
| Applications | You | You | Provider |
| OS & Runtime | You | Provider | Provider |
| Network Controls | You | Shared | Provider |
| Physical Infrastructure | Provider | Provider | Provider |

The Principle of Least Privilege

Every identity — human user, service account, or application — should have only the minimum permissions required to perform its function. This limits the blast radius when credentials are compromised:

Never use root/owner accounts for daily operations. Create dedicated service accounts with scoped permissions. An over-privileged compromised credential gives attackers the keys to your entire kingdom.
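
One way to make least privilege checkable is to lint policy documents for wildcards before they are attached. The following is a minimal sketch, not a full policy analyzer: the function name `find_wildcard_statements` and the sample policy are hypothetical, and it only looks for `*` or `service:*` actions and bare `*` resources in Allow statements.

```python
def find_wildcard_statements(policy: dict) -> list[str]:
    """Return the Sid (or a positional label) of each Allow statement that uses a wildcard."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be written as one object
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue  # Deny statements with wildcards restrict access, so skip them
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if wildcard_action or "*" in resources:
            findings.append(stmt.get("Sid", f"statement-{index}"))
    return findings


# Hypothetical policy used only for illustration
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOrders",
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/orders",
        },
        {"Sid": "AdminEverything", "Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Sid": "DenyDeletes", "Effect": "Deny", "Action": "s3:*", "Resource": "*"},
    ],
}

if __name__ == "__main__":
    print(find_wildcard_statements(EXAMPLE_POLICY))
```

A check like this is deliberately strict: some statements legitimately need `Resource: "*"`, so findings are prompts for review rather than automatic failures.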

Common Attack Vectors

Understanding how attackers target cloud infrastructure helps prioritize defenses:

  • Credential exposure — Secrets committed to Git repositories, leaked in logs, or stored in plaintext
  • Misconfigured storage — Public S3 buckets, open database ports, unauthenticated APIs
  • Overly permissive IAM — Wildcard policies (*:*), unused admin roles
  • Unencrypted data — Data at rest or in transit without encryption
  • Network exposure — Services directly exposed to the internet without WAF or DDoS protection
  • Supply chain attacks — Compromised dependencies, malicious Terraform modules
  • Lateral movement — Once inside, attackers pivot through overly connected networks

Defense in Depth: Multiple Security Layers
flowchart TB
    subgraph L1["Layer 1: Identity & Access"]
        A[IAM Policies] --> B[MFA]
        B --> C[Service Accounts]
    end
    subgraph L2["Layer 2: Network"]
        D[VPC / VNet] --> E[Security Groups]
        E --> F[WAF / DDoS]
    end
    subgraph L3["Layer 3: Data"]
        G[Encryption at Rest] --> H[Encryption in Transit]
        H --> I[Key Management]
    end
    subgraph L4["Layer 4: Application"]
        J[Input Validation] --> K[Dependency Scanning]
        K --> L[Runtime Protection]
    end
    subgraph L5["Layer 5: Monitoring"]
        M[Audit Logs] --> N[Alerting]
        N --> O[Incident Response]
    end
    L1 --> L2 --> L3 --> L4 --> L5
                            

Identity & Access Management (IAM)

IAM is the foundation of cloud security. It controls who (authentication) can do what (authorization) on which resources. Every cloud provider implements IAM differently, but the core concepts remain consistent.

Authentication vs Authorization

| Concept | Authentication (AuthN) | Authorization (AuthZ) |
| --- | --- | --- |
| Question | Who are you? | What can you do? |
| Mechanism | Passwords, MFA, certificates, tokens | Policies, roles, permissions |
| When | At login / request initiation | After identity is verified |
| Example | User logs in with OIDC token | Policy allows s3:GetObject on specific bucket |

AWS IAM

AWS IAM uses JSON policy documents attached to users, groups, or roles. The policy language evaluates Effect (Allow/Deny), Action (API calls), Resource (ARN), and optional Conditions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3ReadOnly",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-app-bucket",
        "arn:aws:s3:::my-app-bucket/*"
      ],
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "10.0.0.0/16"
        }
      }
    }
  ]
}
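
To see how Effect, Action, and Resource combine, here is a simplified sketch of statement matching. It assumes a single identity policy and ignores Conditions, resource policies, permission boundaries, and organization-level controls, all of which take part in real AWS evaluation. The helper names and the extra Deny statement are hypothetical. What it does preserve is the core rule: an explicit Deny wins, and anything not explicitly allowed is denied.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value)


def statement_matches(stmt: dict, action: str, resource: str) -> bool:
    # Action names are matched case-insensitively, resource ARNs case-sensitively
    action_ok = any(fnmatchcase(action.lower(), p.lower()) for p in _as_list(stmt["Action"]))
    resource_ok = any(fnmatchcase(resource, p) for p in _as_list(stmt["Resource"]))
    return action_ok and resource_ok


def evaluate(policy: dict, action: str, resource: str) -> str:
    """Return 'Allow', 'ExplicitDeny', or 'ImplicitDeny' for one request."""
    decision = "ImplicitDeny"  # default when no statement matches
    for stmt in policy["Statement"]:
        if not statement_matches(stmt, action, resource):
            continue
        if stmt["Effect"] == "Deny":
            return "ExplicitDeny"  # an explicit Deny overrides any Allow
        decision = "Allow"
    return decision


# The read-only statement from above (Condition omitted) plus a hypothetical Deny
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::my-app-bucket", "arn:aws:s3:::my-app-bucket/*"],
        },
        {
            "Effect": "Deny",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::my-app-bucket/secret/*",
        },
    ],
}

if __name__ == "__main__":
    for key in ("report.csv", "secret/key.pem"):
        print(key, evaluate(POLICY, "s3:GetObject", f"arn:aws:s3:::my-app-bucket/{key}"))
```

The early return on Deny is the important design choice: the order of statements in the document never changes the outcome.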

Key AWS IAM concepts:

  • Users: Identities with long-lived credentials, intended for human operators (avoid for applications)
  • Groups: Collections of users sharing the same policies
  • Roles: Identities without long-lived credentials; services, federated users, or principals in other accounts assume them to obtain temporary credentials
  • Instance Profiles: Containers that attach a role to an EC2 instance so its workloads receive temporary credentials
  • Trust Relationships: Policies that define which principals can assume a role

Terraform example — creating a least-privilege IAM role for a Lambda function:

# IAM role for Lambda with least-privilege access
resource "aws_iam_role" "lambda_role" {
  name = "my-lambda-execution-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "lambda.amazonaws.com"
        }
      }
    ]
  })
}

# Scoped policy — only access specific DynamoDB table
resource "aws_iam_role_policy" "lambda_dynamodb" {
  name = "lambda-dynamodb-access"
  role = aws_iam_role.lambda_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "dynamodb:GetItem",
          "dynamodb:PutItem",
          "dynamodb:Query"
        ]
        Resource = "arn:aws:dynamodb:us-east-1:123456789012:table/orders"
      },
      {
        Effect = "Allow"
        Action = [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
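        # Scope logging to Lambda log groups in this account and region, not every log resource
        Resource = "arn:aws:logs:us-east-1:123456789012:log-group:/aws/lambda/*"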
      }
    ]
  })
}

Azure Entra ID & RBAC

Azure uses Entra ID (formerly Azure AD) for identity and Role-Based Access Control (RBAC) for authorization. Key concepts include service principals, managed identities, and built-in roles:

# Azure — create a user-assigned managed identity
resource "azurerm_user_assigned_identity" "app_identity" {
  name                = "my-app-identity"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
}

# Assign "Storage Blob Data Reader" role to the managed identity
resource "azurerm_role_assignment" "blob_reader" {
  scope                = azurerm_storage_account.main.id
  role_definition_name = "Storage Blob Data Reader"
  principal_id         = azurerm_user_assigned_identity.app_identity.principal_id
}

# Assign "Key Vault Secrets User" role
resource "azurerm_role_assignment" "keyvault_reader" {
  scope                = azurerm_key_vault.main.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = azurerm_user_assigned_identity.app_identity.principal_id
}

Managed identities remove the need to store credentials in your code or configuration. Azure provisions and rotates the identity's credentials automatically, so there are no secrets to manage, no keys to rotate, and far less risk of credential leakage.

GCP IAM

GCP IAM uses service accounts with predefined or custom roles. Workload Identity Federation allows external workloads to access GCP without service account keys:

# GCP — create a service account with minimal permissions
resource "google_service_account" "app_sa" {
  account_id   = "my-app-service-account"
  display_name = "My App Service Account"
  project      = var.project_id
}

# Grant only BigQuery Data Viewer role
resource "google_project_iam_member" "bq_viewer" {
  project = var.project_id
  role    = "roles/bigquery.dataViewer"
  member  = "serviceAccount:${google_service_account.app_sa.email}"
}

# Grant Cloud Storage Object Viewer on specific bucket
resource "google_storage_bucket_iam_member" "bucket_viewer" {
  bucket = google_storage_bucket.data.name
  role   = "roles/storage.objectViewer"
  member = "serviceAccount:${google_service_account.app_sa.email}"
}

Federation & SSO

For organizations with existing identity providers (Okta, Microsoft Entra ID, Google Workspace), federation allows single sign-on (SSO) into cloud environments without creating cloud-native users:

OIDC Federation Flow
sequenceDiagram
    participant User
    participant IdP as Identity Provider<br/>(Okta / Azure AD)
    participant STS as Cloud STS<br/>(AWS STS / Azure Token)
    participant Cloud as Cloud Resources
    User->>IdP: Authenticate (MFA)
    IdP->>User: OIDC Token (JWT)
    User->>STS: AssumeRoleWithWebIdentity(token)
    STS->>STS: Validate token, check trust policy
    STS->>User: Temporary credentials (15min-12hr)
    User->>Cloud: API call with temporary credentials
    Cloud->>Cloud: Evaluate IAM policies
    Cloud->>User: Response (Allow/Deny)
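
The "validate token" step in this flow includes checking the token's claims. The sketch below shows only that part: it decodes the payload of a JWT and checks issuer, audience, and expiry. It deliberately does not verify the signature, which a real token service must do against the identity provider's published keys, and it assumes `aud` is a single string. The issuer URL, audience value, and `make_demo_token` helper are hypothetical.

```python
import base64
import json
import time


def _b64url_decode(segment: str) -> bytes:
    """Decode base64url text, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def decode_claims(token: str) -> dict:
    """Return the payload claims. Does NOT verify the signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(_b64url_decode(payload))


def claims_acceptable(claims: dict, issuer: str, audience: str, now=None) -> bool:
    """Check that the token came from the trusted issuer, targets us, and has not expired."""
    now = time.time() if now is None else now
    return (
        claims.get("iss") == issuer
        and claims.get("aud") == audience
        and claims.get("exp", 0) > now
    )


def make_demo_token(claims: dict) -> str:
    """Build an unsigned token for illustration only; never accept such tokens in practice."""
    header = {"alg": "none", "typ": "JWT"}
    parts = [_b64url_encode(json.dumps(part).encode()) for part in (header, claims)]
    return ".".join(parts + [""])


# Hypothetical claims for the demo
DEMO_CLAIMS = {"iss": "https://idp.example.com", "aud": "sts.example", "sub": "alice", "exp": 2000}

if __name__ == "__main__":
    claims = decode_claims(make_demo_token(DEMO_CLAIMS))
    print(claims_acceptable(claims, "https://idp.example.com", "sts.example", now=1000))
```

Passing `now` explicitly keeps the expiry check deterministic in tests; in production code the default of the current time applies.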

IAM Comparison Across Clouds

| Feature | AWS | Azure | GCP |
| --- | --- | --- | --- |
| Identity Provider | IAM Users / Identity Center | Entra ID (Azure AD) | Google Cloud Identity |
| Machine Identity | IAM Roles + Instance Profiles | Managed Identities | Service Accounts |
| Policy Language | JSON (Effect/Action/Resource) | JSON (RBAC role definitions) | IAM bindings (role + member) |
| Temporary Creds | STS AssumeRole (15 min-12 hr) | Token-based (configurable) | Short-lived tokens |
| Cross-Account | Cross-account roles + trust policies | Lighthouse / cross-tenant | Cross-project IAM bindings |
| Federation | SAML 2.0, OIDC | SAML, OIDC, WS-Fed | Workload Identity Federation |
| MFA | Virtual MFA, hardware tokens | Authenticator, FIDO2, SMS | 2-Step Verification |
| Best Practice | No root access keys; use roles | Use managed identities always | Never export SA keys |

Network Security

Network security creates boundaries around your resources, controlling which traffic can reach which services. Cloud networks provide multiple layers of filtering — from subnet-level ACLs to instance-level security groups to application-level WAFs.

Security Groups (Stateful Packet Filtering)

Security Groups (AWS) and Network Security Groups (Azure) act as virtual firewalls around individual resources. They are stateful — if you allow inbound traffic, the return traffic is automatically allowed:

# AWS Security Group — web server allowing HTTP/HTTPS + SSH from bastion only
resource "aws_security_group" "web_server" {
  name        = "web-server-sg"
  description = "Security group for web servers"
  vpc_id      = aws_vpc.main.id

  # Allow HTTP from anywhere
  ingress {
    description = "HTTP from internet"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow HTTPS from anywhere
  ingress {
    description = "HTTPS from internet"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow SSH only from bastion security group
  ingress {
    description     = "SSH from bastion only"
    from_port       = 22
    to_port         = 22
    protocol        = "tcp"
    security_groups = [aws_security_group.bastion.id]
  }

  # Allow all outbound
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "web-server-sg"
  }
}

Network ACLs (Stateless Subnet Rules)

Network ACLs operate at the subnet level and are stateless — you must explicitly define both inbound and outbound rules. They provide an additional layer of defense beyond security groups:

# AWS Network ACL — restrict public subnet traffic
resource "aws_network_acl" "public" {
  vpc_id     = aws_vpc.main.id
  subnet_ids = [aws_subnet.public.id]

  # Allow inbound HTTP
  ingress {
    protocol   = "tcp"
    rule_no    = 100
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 80
    to_port    = 80
  }

  # Allow inbound HTTPS
  ingress {
    protocol   = "tcp"
    rule_no    = 110
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 443
    to_port    = 443
  }

  # Allow inbound ephemeral ports (return traffic)
  ingress {
    protocol   = "tcp"
    rule_no    = 120
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1024
    to_port    = 65535
  }

  # Deny all other inbound
  ingress {
    protocol   = "-1"
    rule_no    = 200
    action     = "deny"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  # Allow all outbound
  egress {
    protocol   = "-1"
    rule_no    = 100
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  tags = { Name = "public-nacl" }
}
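
The practical consequence of stateless filtering is easiest to see by evaluating the rules by hand. This sketch models the inbound rules above under simplifying assumptions: every rule applies to `0.0.0.0/0`, so CIDR matching is left out, and the class and function names are hypothetical. Rules are checked in ascending rule number, the first match decides, and traffic that matches nothing is denied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    rule_no: int
    action: str     # "allow" or "deny"
    protocol: str   # "tcp", "udp", or "-1" for all protocols
    from_port: int
    to_port: int


def evaluate_nacl(rules, protocol: str, port: int) -> str:
    """Return the action of the lowest-numbered matching rule, or 'deny' if none match."""
    for rule in sorted(rules, key=lambda r: r.rule_no):
        protocol_matches = rule.protocol in ("-1", protocol)
        # Port ranges are ignored when the rule covers all protocols
        port_matches = rule.protocol == "-1" or rule.from_port <= port <= rule.to_port
        if protocol_matches and port_matches:
            return rule.action
    return "deny"


# Mirrors the inbound rules of the public NACL shown above
INBOUND_RULES = [
    NaclRule(100, "allow", "tcp", 80, 80),
    NaclRule(110, "allow", "tcp", 443, 443),
    NaclRule(120, "allow", "tcp", 1024, 65535),
    NaclRule(200, "deny", "-1", 0, 0),
]

if __name__ == "__main__":
    for port in (443, 22, 50000):
        print(port, evaluate_nacl(INBOUND_RULES, "tcp", port))
```

Removing rule 120 from the list makes port 50000 fall through to the deny rule, which is how replies to outbound connections get dropped when the ephemeral range is forgotten. A stateful security group would have let that reply through automatically.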

Web Application Firewalls & DDoS Protection

WAFs inspect HTTP/HTTPS traffic at Layer 7, blocking common attacks like SQL injection, XSS, and bot traffic before they reach your application:

| Service | AWS | Azure | GCP |
| --- | --- | --- | --- |
| WAF | AWS WAF (CloudFront/ALB) | Azure WAF (App Gateway/Front Door) | Cloud Armor |
| DDoS | AWS Shield Standard (free) / Advanced | Azure DDoS Protection Standard | Cloud Armor DDoS |
| Bot Protection | AWS WAF Bot Control | Azure WAF bot protection | reCAPTCHA Enterprise |
| Managed Rules | AWS Managed Rules (OWASP) | OWASP 3.2 Core Rule Set | Pre-configured WAF rules |

Private Endpoints & PrivateLink

Private endpoints allow you to access cloud services (storage, databases, APIs) over your private network rather than the public internet:

# Azure Private Endpoint for Storage Account
resource "azurerm_private_endpoint" "storage" {
  name                = "pe-storage"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  subnet_id           = azurerm_subnet.private.id

  private_service_connection {
    name                           = "psc-storage"
    private_connection_resource_id = azurerm_storage_account.main.id
    subresource_names              = ["blob"]
    is_manual_connection           = false
  }

  private_dns_zone_group {
    name                 = "dns-zone-group"
    private_dns_zone_ids = [azurerm_private_dns_zone.blob.id]
  }
}

Bastion Hosts & Secure Access

Never expose SSH or RDP ports directly to the internet. Use bastion hosts (jump boxes) or managed bastion services as a single, audited entry point:

Network Security Architecture
flowchart LR
    subgraph Internet["Public Internet"]
        U[Users / Admins]
    end
    subgraph Edge["Edge Layer"]
        WAF[WAF + DDoS]
        LB[Load Balancer]
    end
    subgraph Public["Public Subnet"]
        BAS[Bastion Host]
    end
    subgraph Private["Private Subnet"]
        APP[App Servers]
        DB[(Database)]
    end
    subgraph Isolated["Isolated Subnet"]
        SEC[Secrets / Keys]
    end

    U -->|HTTPS| WAF
    WAF --> LB
    LB -->|Port 443| APP
    U -->|SSH via Bastion| BAS
    BAS -->|Port 22| APP
    APP -->|Port 5432| DB
    APP -->|Port 443| SEC

    style Internet fill:#fee,stroke:#c00
    style Private fill:#efe,stroke:#090
    style Isolated fill:#eef,stroke:#009
                            
# AWS Systems Manager Session Manager — no SSH needed
resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

# Connect without opening port 22:
# aws ssm start-session --target i-0123456789abcdef0

Secrets Management

Secrets — API keys, database passwords, certificates, encryption keys — are the most valuable targets for attackers. Hard-coding secrets in source code, environment variables, or configuration files is one of the most common and dangerous security mistakes.

Never commit secrets to version control. GitGuardian reported detecting more than 12 million secrets leaked in public GitHub repositories in 2023. Once a secret is committed, it stays in Git history even after the file is deleted, so treat it as compromised and rotate it immediately. Always use a dedicated secrets management service.
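
A lightweight guard is to scan text for secret-shaped strings before it is committed. The sketch below is illustrative only: it checks two patterns (the `AKIA`/`ASIA` prefix format of AWS access key IDs and PEM private key headers), the names `SECRET_PATTERNS` and `scan_text` are hypothetical, and the sample uses the placeholder key from AWS documentation. Dedicated scanners cover far more formats and also inspect history.

```python
import re

# Two well-known secret shapes; real scanners ship hundreds of patterns
SECRET_PATTERNS = {
    "aws-access-key-id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for every line that looks like it holds a secret."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings


# Sample input using the non-functional example key from AWS documentation
SAMPLE = (
    'region = "us-east-1"\n'
    'aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
    "# nothing secret on this line\n"
)

if __name__ == "__main__":
    for line_no, name in scan_text(SAMPLE):
        print(f"line {line_no}: possible {name}")
```

Run as a pre-commit hook, a check like this stops the most obvious leaks locally, before the secret ever reaches a remote.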

Cloud-Native Secrets Services

| Feature | AWS Secrets Manager | Azure Key Vault | GCP Secret Manager |
| --- | --- | --- | --- |
| Secret Types | Any key-value, RDS credentials | Secrets, keys, certificates | Any binary/text blob |
| Auto-Rotation | Built-in for RDS, Lambda-based | Event Grid triggered | Pub/Sub triggered |
| Versioning | Automatic with staging labels | Version history | Automatic versioning |
| Access Control | IAM + resource policies | RBAC + access policies | IAM bindings |
| Encryption | AES-256 via KMS | HSM-backed keys | Google-managed or CMEK |
| Audit | CloudTrail logs all access | Diagnostic logs | Cloud Audit Logs |
| Pricing | $0.40/secret/month + API calls | $0.03/10K operations | $0.06/10K access operations |

# AWS — Store database credentials in Secrets Manager
resource "aws_secretsmanager_secret" "db_credentials" {
  name                    = "prod/database/credentials"
  description             = "Production database credentials"
  recovery_window_in_days = 7

  tags = {
    Environment = "production"
    ManagedBy   = "terraform"
  }
}

resource "aws_secretsmanager_secret_version" "db_credentials" {
  secret_id = aws_secretsmanager_secret.db_credentials.id
  secret_string = jsonencode({
    username = "app_user"
    password = var.db_password  # Pass via environment variable, never hardcode
    host     = aws_db_instance.main.address  # address is the hostname only; endpoint is "host:port"
    port     = 5432
    dbname   = "myapp"
  })
}

# Azure: store secrets in Key Vault
resource "azurerm_key_vault" "main" {
  name                = "myapp-keyvault-prod"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"

  purge_protection_enabled   = true
  soft_delete_retention_days = 90

  # Use RBAC for access control (recommended over access policies)
  enable_rbac_authorization = true
}

resource "azurerm_key_vault_secret" "db_password" {
  name         = "database-password"
  value        = var.db_password
  key_vault_id = azurerm_key_vault.main.id

  content_type    = "password"
  expiration_date = "2027-01-01T00:00:00Z"
}

HashiCorp Vault

For multi-cloud environments, HashiCorp Vault provides a unified secrets management platform with dynamic secrets, encryption as a service, and fine-grained access control:

# Initialize and unseal Vault
vault operator init -key-shares=5 -key-threshold=3
vault operator unseal <key-1>
vault operator unseal <key-2>
vault operator unseal <key-3>

# Enable the KV secrets engine
vault secrets enable -version=2 -path=secret kv

# Store a secret
vault kv put secret/myapp/database \
  username="app_user" \
  password="s3cur3P@ssw0rd" \
  host="db.example.com"

# Read a secret
vault kv get secret/myapp/database

# Enable dynamic database credentials
vault secrets enable database
vault write database/config/myapp \
  plugin_name=postgresql-database-plugin \
  connection_url="postgresql://{{username}}:{{password}}@db.example.com:5432/myapp" \
  allowed_roles="readonly" \
  username="vault_admin" \
  password="admin_password"

# Create a role that generates temporary credentials
vault write database/roles/readonly \
  db_name=myapp \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl="1h" \
  max_ttl="24h"

Secret Rotation Strategies

Secrets should be rotated regularly to limit the window of exposure if compromised:

# AWS Secrets Manager — automatic rotation every 30 days
resource "aws_secretsmanager_secret_rotation" "db_rotation" {
  secret_id           = aws_secretsmanager_secret.db_credentials.id
  rotation_lambda_arn = aws_lambda_function.secret_rotation.arn

  rotation_rules {
    automatically_after_days = 30
  }
}

Encryption

Encryption protects data confidentiality at two critical stages: at rest (stored data) and in transit (data moving between services). Modern cloud platforms provide encryption services that handle the complexity of key management.

Encryption at Rest

| Service | AWS | Azure | GCP |
| --- | --- | --- | --- |
| Key Management | AWS KMS | Azure Key Vault | Cloud KMS |
| Default Encryption | SSE-S3 (all S3 objects) | Storage Service Encryption | Google-managed keys |
| Customer Keys | SSE-KMS (CMK) | Customer-managed keys | CMEK |
| Hardware HSM | CloudHSM | Managed HSM | Cloud HSM |
| Disk Encryption | EBS encryption (AES-256) | Azure Disk Encryption | Persistent Disk encryption |
| Database | RDS encryption, DynamoDB | TDE for SQL, Cosmos DB | Cloud SQL encryption |

# AWS KMS — create a customer-managed encryption key
resource "aws_kms_key" "data_key" {
  description             = "KMS key for application data encryption"
  deletion_window_in_days = 30
  enable_key_rotation     = true  # Automatic annual rotation

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowKeyAdministration"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::123456789012:role/key-admin"
        }
        Action   = "kms:*"
        Resource = "*"
      },
      {
        Sid    = "AllowKeyUsage"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::123456789012:role/app-role"
        }
        Action = [
          "kms:Encrypt",
          "kms:Decrypt",
          "kms:GenerateDataKey"
        ]
        Resource = "*"
      }
    ]
  })
}

resource "aws_kms_alias" "data_key" {
  name          = "alias/app-data-key"
  target_key_id = aws_kms_key.data_key.key_id
}

# S3 bucket with CMK encryption
resource "aws_s3_bucket_server_side_encryption_configuration" "encrypted" {
  bucket = aws_s3_bucket.data.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.data_key.arn
    }
    bucket_key_enabled = true  # Reduces KMS API calls
  }
}

Encryption in Transit

All data moving between services should be encrypted with TLS 1.2 or later. Enforce HTTPS-only access and use certificate management services:

# AWS — enforce HTTPS-only on S3 bucket
resource "aws_s3_bucket_policy" "enforce_https" {
  bucket = aws_s3_bucket.data.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "EnforceHTTPS"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        Resource = [
          aws_s3_bucket.data.arn,
          "${aws_s3_bucket.data.arn}/*"
        ]
        Condition = {
          Bool = {
            "aws:SecureTransport" = "false"
          }
        }
      }
    ]
  })
}

# AWS ACM — provision a TLS certificate
resource "aws_acm_certificate" "main" {
  domain_name               = "app.example.com"
  subject_alternative_names = ["*.app.example.com"]
  validation_method         = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

Envelope Encryption

Envelope encryption is the standard pattern used by KMS services. Instead of encrypting data directly with a master key, a data key encrypts the data, and the master key encrypts the data key:

Envelope Encryption Flow
flowchart LR
    subgraph KMS["Key Management Service"]
        MK[Master Key<br/>Never leaves KMS]
    end
    subgraph App["Application"]
        A[Generate Data Key<br/>request to KMS] --> B[Plaintext Data Key +<br/>Encrypted Data Key]
        B --> C[Encrypt data with<br/>Plaintext Data Key]
        C --> D[Discard Plaintext<br/>Data Key from memory]
    end
    subgraph Storage["Encrypted Storage"]
        E[Encrypted Data +<br/>Encrypted Data Key]
    end
    subgraph Decrypt["Decryption"]
        F[Send Encrypted Data Key<br/>to KMS] --> G[KMS decrypts with<br/>Master Key]
        G --> H[Use Plaintext Data Key<br/>to decrypt data]
    end
    MK -.->|Encrypts/Decrypts<br/>Data Key| B
    D --> E
    E --> F

Why Envelope Encryption? A KMS can only encrypt small payloads directly (AWS KMS limits the Encrypt operation to 4 KB of data). By encrypting data locally with a separate data key, you can protect data of any size while the master key never leaves the KMS. The encrypted data key is stored alongside the encrypted data, and both are useless without access to the master key.
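
The key hierarchy can be sketched in a few lines. Everything here is a stand-in chosen so the example runs with the standard library alone: `ToyKms` is a hypothetical in-memory class, not a real KMS client, and `_keystream_xor` is a toy cipher that must never be used to protect real data (a real implementation would use an authenticated cipher such as AES-GCM). Only the flow is the point: generate a data key, encrypt locally, keep just the wrapped key, and ask the KMS to unwrap it on read.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with SHA-256(key + counter) blocks. NOT secure; demo only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


class ToyKms:
    """Hypothetical stand-in for a KMS: the master key never leaves this object."""

    def __init__(self):
        self._master_key = secrets.token_bytes(32)

    def generate_data_key(self):
        """Return (plaintext data key, data key encrypted under the master key)."""
        plaintext_key = secrets.token_bytes(32)
        return plaintext_key, _keystream_xor(self._master_key, plaintext_key)

    def decrypt_data_key(self, encrypted_key: bytes) -> bytes:
        return _keystream_xor(self._master_key, encrypted_key)


def envelope_encrypt(kms: ToyKms, plaintext: bytes) -> dict:
    data_key, encrypted_key = kms.generate_data_key()
    ciphertext = _keystream_xor(data_key, plaintext)
    del data_key  # drop our reference; only the wrapped key is stored
    return {"ciphertext": ciphertext, "encrypted_key": encrypted_key}


def envelope_decrypt(kms: ToyKms, envelope: dict) -> bytes:
    data_key = kms.decrypt_data_key(envelope["encrypted_key"])
    return _keystream_xor(data_key, envelope["ciphertext"])


if __name__ == "__main__":
    kms = ToyKms()
    envelope = envelope_encrypt(kms, b"customer record " * 1000)  # well over 4 KB
    print(len(envelope["ciphertext"]), len(envelope["encrypted_key"]))
    print(envelope_decrypt(kms, envelope)[:15])
```

Note that the KMS only ever handles 32 bytes per object regardless of payload size, which is what keeps the pattern within per-call limits.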

Certificate Management

# Verify TLS certificate on a domain
openssl s_client -connect app.example.com:443 -servername app.example.com </dev/null 2>/dev/null | \
  openssl x509 -noout -dates -subject -issuer

# Check certificate expiry
echo | openssl s_client -connect app.example.com:443 2>/dev/null | \
  openssl x509 -noout -enddate

# Generate a self-signed cert for development
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \
  -sha256 -days 365 -nodes \
  -subj "/C=US/ST=State/L=City/O=Org/CN=localhost"

Zero-Trust Architecture

Traditional perimeter-based security assumes everything inside the network is trusted. Zero-trust flips this model: never trust, always verify. Every request is authenticated, authorized, and encrypted regardless of where it originates.

Core Principles

  • Verify explicitly — Always authenticate and authorize based on all available data points (identity, location, device health, data classification)
  • Use least-privilege access — Limit access with just-in-time (JIT) and just-enough-access (JEA)
  • Assume breach — Minimize blast radius, segment access, verify end-to-end encryption, use analytics for threat detection

Microsegmentation

Instead of one large trusted network, microsegmentation divides the network into small, isolated zones. Each workload communicates only with explicitly allowed peers:

# Microsegmentation — each service has its own security group
# Only allow specific service-to-service communication

resource "aws_security_group" "api_service" {
  name   = "api-service-sg"
  vpc_id = aws_vpc.main.id

  # Only accept traffic from the load balancer
  ingress {
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }
}

resource "aws_security_group" "payment_service" {
  name   = "payment-service-sg"
  vpc_id = aws_vpc.main.id

  # Only accept traffic from the API service
  ingress {
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.api_service.id]
  }
}

resource "aws_security_group" "database" {
  name   = "database-sg"
  vpc_id = aws_vpc.main.id

  # Only accept traffic from the payment service on PostgreSQL port
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.payment_service.id]
  }
}
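
The security groups above amount to an allow-list of service-to-service flows, which can be modeled and queried directly. This is a sketch under assumptions: the service names mirror the Terraform resources, while `ALLOWED_FLOWS`, `is_allowed`, and `direct_peers` are hypothetical names introduced for illustration.

```python
# (source, destination) -> permitted port, mirroring the security group rules above
ALLOWED_FLOWS = {
    ("alb", "api_service"): 8080,
    ("api_service", "payment_service"): 8080,
    ("payment_service", "database"): 5432,
}


def is_allowed(source: str, destination: str, port: int) -> bool:
    """A flow is permitted only if it is listed explicitly with the same port."""
    return ALLOWED_FLOWS.get((source, destination)) == port


def direct_peers(source: str) -> set[str]:
    """Services that a compromised source could connect to directly."""
    return {dst for (src, dst) in ALLOWED_FLOWS if src == source}


if __name__ == "__main__":
    print(is_allowed("api_service", "payment_service", 8080))
    print(is_allowed("api_service", "database", 5432))
    print(direct_peers("api_service"))
```

The second query is the one that matters: a compromised API service cannot reach the database directly, so an attacker has to compromise the payment service as well before getting near the data.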

BeyondCorp Model

Google's BeyondCorp is one of the best-known implementations of zero-trust. Access decisions are based on the user's identity, device state, and context, not on network location:

Zero-Trust vs Traditional Perimeter Security
flowchart TB
    subgraph Traditional["Traditional Perimeter Model"]
        direction TB
        T1[Firewall] --> T2[Trusted Internal Network]
        T2 --> T3[All resources accessible<br/>once inside VPN]
        T3 --> T4[Lateral movement possible]
    end
    subgraph ZeroTrust["Zero-Trust Model"]
        direction TB
        Z1[Identity Verification<br/>Every Request] --> Z2[Device Health Check]
        Z2 --> Z3[Context Evaluation<br/>Location, Time, Risk]
        Z3 --> Z4[Microsegmented Access<br/>Only specific resources]
        Z4 --> Z5[Continuous Monitoring<br/>& Re-evaluation]
    end
    style Traditional fill:#fee,stroke:#c00
    style ZeroTrust fill:#efe,stroke:#090

Zero-Trust Implementation Steps: (1) Identify your protect surface (critical data, assets, applications, services). (2) Map transaction flows. (3) Build a zero-trust architecture around the protect surface. (4) Create zero-trust policies. (5) Monitor and maintain continuously.

Compliance & Governance

Compliance frameworks provide standardized security controls that organizations must implement based on their industry, geography, and data types. Infrastructure as code makes compliance auditable, repeatable, and enforceable.

Common Compliance Frameworks

| Framework | Focus | Who Needs It | Key Requirements |
| --- | --- | --- | --- |
| SOC 2 | Service organization controls | SaaS companies, cloud services | Security, availability, confidentiality, privacy, processing integrity |
| ISO 27001 | Information security management | Any organization globally | Risk assessment, access control, cryptography, operations security |
| PCI-DSS | Payment card data protection | Anyone processing card payments | Network segmentation, encryption, access control, monitoring |
| HIPAA | Health information protection | US healthcare organizations | PHI encryption, access controls, audit trails, breach notification |
| GDPR | Personal data protection | Organizations handling EU data | Data minimization, consent, right to erasure, breach notification |
| FedRAMP | US federal cloud services | Cloud providers to US government | NIST 800-53 controls, continuous monitoring, POA&M |

Policy as Code

Policy as code tools enforce security rules before infrastructure is deployed, preventing non-compliant resources from being created:

# Sentinel Policy (Terraform Enterprise/Cloud) — enforce encryption
import "tfplan"

# Ensure all S3 buckets have encryption enabled
main = rule {
  all tfplan.resources.aws_s3_bucket as _, instances {
    all instances as _, r {
      r.applied.server_side_encryption_configuration is not null
    }
  }
}

# OPA/Rego Policy: deny resources without required tags
package terraform.analysis

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_instance"
  not resource.change.after.tags.Environment
  msg := sprintf("EC2 instance '%s' must have an 'Environment' tag", [resource.name])
}

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_instance"
  not resource.change.after.tags.Owner
  msg := sprintf("EC2 instance '%s' must have an 'Owner' tag", [resource.name])
}
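
For teams without a policy engine in place, the same tag rule can be prototyped as a plain script over the plan JSON. This sketch assumes the input has the `resource_changes` / `change.after.tags` shape that the Rego policy above reads; `REQUIRED_TAGS`, `missing_tag_violations`, and the sample plan are hypothetical.

```python
REQUIRED_TAGS = ("Environment", "Owner")


def missing_tag_violations(plan: dict) -> list[str]:
    """Return one message per required tag that is absent on a planned EC2 instance."""
    violations = []
    for resource in plan.get("resource_changes", []):
        if resource.get("type") != "aws_instance":
            continue
        after = (resource.get("change") or {}).get("after") or {}  # null when destroying
        tags = after.get("tags") or {}
        for tag in REQUIRED_TAGS:
            if not tags.get(tag):
                violations.append(
                    f"EC2 instance '{resource['name']}' must have an '{tag}' tag"
                )
    return violations


# Hypothetical plan fragment for illustration
SAMPLE_PLAN = {
    "resource_changes": [
        {"type": "aws_instance", "name": "web",
         "change": {"after": {"tags": {"Environment": "prod"}}}},
        {"type": "aws_instance", "name": "worker",
         "change": {"after": {"tags": {"Environment": "prod", "Owner": "platform"}}}},
        {"type": "aws_s3_bucket", "name": "logs", "change": {"after": {"tags": None}}},
    ]
}

if __name__ == "__main__":
    problems = missing_tag_violations(SAMPLE_PLAN)
    for message in problems:
        print(message)
    raise SystemExit(1 if problems else 0)  # non-zero exit fails a CI step
```

Exiting non-zero on any violation lets the script act as a gate in the same way a failing policy check would.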

Audit Logging

Audit logging is expected in some form by every framework listed above. Enable and centralize logs for all API calls, data access, and authentication events:

# AWS CloudTrail — log all API calls across the organization
resource "aws_cloudtrail" "org_trail" {
  name                          = "organization-trail"
  s3_bucket_name                = aws_s3_bucket.cloudtrail.id
  include_global_service_events = true
  is_multi_region_trail         = true
  is_organization_trail         = true
  enable_log_file_validation    = true

  cloud_watch_logs_group_arn = "${aws_cloudwatch_log_group.cloudtrail.arn}:*"
  cloud_watch_logs_role_arn  = aws_iam_role.cloudtrail_cw.arn

  event_selector {
    read_write_type           = "All"
    include_management_events = true

    data_resource {
      type   = "AWS::S3::Object"
      values = ["arn:aws:s3"]
    }
  }

  tags = {
    Compliance = "SOC2"
    ManagedBy  = "terraform"
  }
}

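# Retain CloudTrail logs: archive to Glacier after 90 days, expire after about 7 years.
# A lifecycle rule only manages retention; it does not stop someone from deleting logs.
# Preventing deletion needs additional controls such as S3 Object Lock or a restrictive bucket policy.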
resource "aws_s3_bucket_lifecycle_configuration" "cloudtrail_retention" {
  bucket = aws_s3_bucket.cloudtrail.id

  rule {
    id     = "retain-logs"
    status = "Enabled"

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    expiration {
      days = 2555  # 7 years for compliance
    }
  }
}

# Azure: enable diagnostic settings for activity logs
resource "azurerm_monitor_diagnostic_setting" "activity_logs" {
  name                       = "send-to-log-analytics"
  target_resource_id         = data.azurerm_subscription.current.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id

  enabled_log {
    category = "Administrative"
  }
  enabled_log {
    category = "Security"
  }
  enabled_log {
    category = "Alert"
  }
  enabled_log {
    category = "Policy"
  }
}

Security Automation

Security must be integrated into the CI/CD pipeline — not bolted on after deployment. "Shift left" security catches vulnerabilities during development, not in production.

Static Analysis & Scanning Tools

| Tool | Type | What It Scans | Implementation |
|---|---|---|---|
| tfsec | Static analysis | Terraform code for misconfigurations | Go |
| Checkov | Policy scanner | Terraform, CloudFormation, Kubernetes, Dockerfiles | Python |
| Trivy | Vulnerability scanner | Container images, IaC, SBOM, file systems | Go |
| KICS | IaC scanner | Terraform, Ansible, Docker, K8s, CloudFormation | Go |
| Prowler | Cloud security audit | AWS/Azure/GCP running infrastructure | Python |
| ScoutSuite | Multi-cloud audit | Running cloud configurations | Python |
| Snyk IaC | Commercial scanner | Terraform, K8s, ARM templates | SaaS |

# Install and run tfsec against Terraform code
# tfsec scans for security misconfigurations in .tf files
# (tfsec development has moved into Trivy; `trivy config` is its successor)
brew install tfsec  # macOS
tfsec .             # Scan current directory
tfsec . --format json --out results.json  # Output as JSON

# Install and run Checkov
pip install checkov
checkov -d .                          # Scan current directory
checkov -d . --framework terraform    # Terraform only
checkov -d . --check CKV_AWS_18       # Specific check only (S3 access logging)

# Run Trivy on Terraform files
brew install trivy
trivy config .                        # Scan IaC configurations
trivy config . --severity HIGH,CRITICAL  # Only high/critical
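
Scanner check IDs map to concrete configuration requirements. As an illustration, `CKV_AWS_18` (the check used in the Checkov command above) verifies that S3 buckets have access logging enabled. The following is a minimal sketch of a configuration intended to satisfy it; the bucket names and resource labels are hypothetical, and the bucket policy that lets the S3 logging service write to the target bucket is omitted for brevity:

```hcl
resource "aws_s3_bucket" "app_data" {
  bucket = "example-app-data" # hypothetical bucket name
}

resource "aws_s3_bucket" "access_logs" {
  # The log destination itself has no access logging, so Checkov would flag it.
  # An inline skip with a recorded reason documents the exception.
  #checkov:skip=CKV_AWS_18:Destination bucket for access logs
  bucket = "example-access-logs" # hypothetical bucket name
}

# Send access logs for app_data to the dedicated log bucket
resource "aws_s3_bucket_logging" "app_data" {
  bucket        = aws_s3_bucket.app_data.id
  target_bucket = aws_s3_bucket.access_logs.id
  target_prefix = "s3-access/app-data/"
}
```

Inline skips are best kept rare and always paired with a reason, so that reviewers can tell an accepted risk from an overlooked finding.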

Security in CI/CD Pipelines

# GitHub Actions — security scanning pipeline
name: Infrastructure Security Scan

on:
  pull_request:
    paths:
      - 'terraform/**'
      - '.github/workflows/security.yml'

jobs:
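  security-scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read          # Needed by actions/checkout
      security-events: write  # Needed to upload SARIF results to code scanning
    steps: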
      - uses: actions/checkout@v4

      - name: Run tfsec
        uses: aquasecurity/tfsec-action@v1.0.0
        with:
          working_directory: terraform/
          soft_fail: false  # Fail the pipeline on findings

      - name: Run Checkov
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: terraform/
          framework: terraform
          output_format: sarif
          output_file_path: checkov-results.sarif
          soft_fail: false

      - name: Upload SARIF results
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: checkov-results.sarif

      - name: Run Trivy IaC scan
        # Tracks a branch here for brevity; pin to a release tag or commit SHA in real pipelines
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'config'
          scan-ref: 'terraform/'
          severity: 'HIGH,CRITICAL'
          exit-code: '1'

  terraform-plan:
    needs: security-scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

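      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false  # Keep stdout clean so the JSON plan can be redirected to a file

      - name: Terraform Init
        run: terraform init
        working-directory: terraform/

      - name: Terraform Plan
        run: terraform plan -out=tfplan
        working-directory: terraform/

      - name: Convert plan to JSON for policy evaluation
        run: terraform show -json tfplan > tfplan.json
        working-directory: terraform/

      - name: Setup OPA
        uses: open-policy-agent/setup-opa@v2

      - name: Validate plan with OPA
        # Assumption: policies/ defines deny rules under a package named "terraform"
        run: opa eval --fail-defined --input terraform/tfplan.json --data policies/ "data.terraform.deny[msg]"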

Automated Remediation

Automated remediation closes the gap between detection and resolution. When a non-compliant resource is detected, automation can fix it immediately:

# AWS Config Rule + Auto Remediation
# Automatically encrypt unencrypted S3 buckets

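resource "aws_config_config_rule" "s3_encryption" {
  name = "s3-bucket-server-side-encryption-enabled"

  source {
    owner             = "AWS"
    source_identifier = "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED"
  }

  scope {
    compliance_resource_types = ["AWS::S3::Bucket"]
  }
}

resource "aws_config_remediation_configuration" "s3_encryption" {
  config_rule_name = aws_config_config_rule.s3_encryption.name
  target_type      = "SSM_DOCUMENT"
  target_id        = "AWS-EnableS3BucketEncryption"

  # Role that SSM Automation assumes to run the remediation.
  # Assumption: aws_iam_role.config_remediation is defined elsewhere in your configuration.
  parameter {
    name         = "AutomationAssumeRole"
    static_value = aws_iam_role.config_remediation.arn
  }

  # Pass the non-compliant bucket reported by AWS Config to the SSM document
  parameter {
    name           = "BucketName"
    resource_value = "RESOURCE_ID"
  }

  parameter {
    name         = "SSEAlgorithm"
    static_value = "aws:kms"
  }

  automatic                  = true
  maximum_automatic_attempts = 3
  retry_attempt_seconds      = 60
}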
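
AWS Config runs the remediation document through SSM Automation, which acts under an IAM role supplied via the document's `AutomationAssumeRole` parameter. The following is a minimal sketch of such a role, scoped to the single action this remediation needs; the role and policy names are hypothetical, and you may want to narrow `resources` to specific buckets:

```hcl
# Trust policy: only SSM Automation may assume the remediation role
data "aws_iam_policy_document" "remediation_trust" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ssm.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "config_remediation" {
  name               = "config-s3-encryption-remediation" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.remediation_trust.json
}

# Least privilege: the role can only set default encryption on buckets
data "aws_iam_policy_document" "remediation_permissions" {
  statement {
    actions   = ["s3:PutEncryptionConfiguration"]
    resources = ["arn:aws:s3:::*"]
  }
}

resource "aws_iam_role_policy" "config_remediation" {
  name   = "allow-put-bucket-encryption"
  role   = aws_iam_role.config_remediation.id
  policy = data.aws_iam_policy_document.remediation_permissions.json
}
```

Keeping the remediation role this narrow matters: an automation role with broad S3 permissions would itself become an attractive target.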

Hands-On Exercises

Exercise 1 30 minutes

Write Least-Privilege IAM Policies

Create a Terraform configuration that defines IAM roles following the principle of least privilege:

  1. Create an IAM role for a Lambda function that can only read from a specific DynamoDB table and write logs
  2. Create an IAM role for a CI/CD pipeline that can deploy to a specific S3 bucket and invalidate a specific CloudFront distribution
  3. Add conditions to restrict access by source IP or time of day
  4. Use terraform plan to verify the resources

Validation: Run tfsec against your code — it should report zero IAM-related findings.
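
If you want a starting point for step 1, the sketch below shows one possible shape for the Lambda role. It assumes a DynamoDB table `aws_dynamodb_table.orders` and a log group `aws_cloudwatch_log_group.orders_reader` already exist in your configuration; those names, and the role name, are hypothetical:

```hcl
# Trust policy: only the Lambda service may assume this role
data "aws_iam_policy_document" "lambda_trust" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "orders_reader" {
  name               = "orders-reader-lambda" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.lambda_trust.json
}

data "aws_iam_policy_document" "orders_reader" {
  # Read-only access to one specific table, no wildcards
  statement {
    sid       = "ReadOrdersTable"
    actions   = ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:BatchGetItem"]
    resources = [aws_dynamodb_table.orders.arn]
  }

  # Write logs only to this function's own log group
  statement {
    sid       = "WriteFunctionLogs"
    actions   = ["logs:CreateLogStream", "logs:PutLogEvents"]
    resources = ["${aws_cloudwatch_log_group.orders_reader.arn}:*"]
  }
}

resource "aws_iam_role_policy" "orders_reader" {
  name   = "orders-reader"
  role   = aws_iam_role.orders_reader.id
  policy = data.aws_iam_policy_document.orders_reader.json
}
```

The CI/CD role in step 2 can follow the same pattern, with the trust policy and resource ARNs swapped for your pipeline's identity, bucket, and distribution.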

Tags: IAM, Least Privilege, Terraform

Exercise 2 25 minutes

Configure Network Security with Terraform

Build a multi-tier network security architecture:

  1. Create a VPC with public, private, and isolated subnets
  2. Define security groups for web servers (HTTP/HTTPS from internet), app servers (traffic only from web tier), and databases (traffic only from app tier)
  3. Add Network ACLs as an additional defense layer on the public subnet
  4. Configure VPC Flow Logs to an S3 bucket for monitoring

Validation: Verify that no security group allows unrestricted inbound access (0.0.0.0/0) on sensitive ports (22, 3306, 5432).

Tags: Network Security, Security Groups, VPC

Exercise 3 35 minutes

Set Up Secrets Management

Implement secrets management using cloud-native services or HashiCorp Vault:

  1. Create an AWS Secrets Manager secret (or Azure Key Vault secret) with Terraform
  2. Configure automatic rotation with a 30-day schedule
  3. Write a script that retrieves the secret at runtime (no hardcoded values)
  4. Verify the secret is encrypted at rest and access is logged in CloudTrail/Azure Monitor

Bonus: Install HashiCorp Vault in dev mode and configure dynamic database credentials.

Tags: Secrets Management, Vault, Rotation

Exercise 4 20 minutes

Run Security Scanning Tools

Scan existing Terraform code with multiple security tools:

  1. Install tfsec, checkov, and trivy on your machine
  2. Run all three tools against the Terraform code from previous exercises
  3. Compare findings — which tools catch which issues?
  4. Fix all HIGH and CRITICAL findings, then re-run to confirm clean output

Validation: All three tools should report zero HIGH/CRITICAL findings after remediation.

Tags: tfsec, Checkov, Security Scanning

Conclusion & Next Steps

Infrastructure security is not a one-time setup — it's a continuous practice that evolves with your architecture. In this article, we covered the essential layers:

  • Identity & Access Management — Authentication, authorization, least privilege across AWS, Azure, and GCP
  • Network Security — Security groups, NACLs, WAFs, private endpoints, and bastion hosts
  • Secrets Management — Cloud-native vaults, HashiCorp Vault, and rotation strategies
  • Encryption — At rest, in transit, envelope encryption, and certificate management
  • Zero-Trust Architecture — Moving beyond perimeter security to identity-based access
  • Compliance & Governance — Frameworks, policy as code, and audit logging
  • Security Automation — Shift-left scanning, CI/CD integration, and automated remediation

The key takeaway: security must be codified. When security controls are defined as code (Terraform, OPA policies, CI/CD pipelines), they become auditable, repeatable, and enforceable across every environment.

Next in the Series

In Part 11: Containers & Orchestration, we'll explore Docker containerization, container networking, Kubernetes architecture, pod scheduling, services and ingress, and managed container services (ECS, AKS, GKE) — building on the security foundations established here.