Introduction
Cloud computing has fundamentally transformed how organizations build, deploy, and operate technology. Yet despite being the backbone of modern infrastructure, it remains widely misunderstood. "The cloud is just someone else's computer" is a popular quip — but it's dangerously oversimplified. Cloud computing isn't just about where your servers live; it's about an entirely different operational model for consuming technology.
In this part, we'll build a complete understanding of cloud computing from the ground up: what it truly means, how service models differ, how the major providers organize their offerings, and how to make sound architectural and economic decisions in the cloud era.
The 5 NIST Essential Characteristics
The National Institute of Standards and Technology (NIST) defines cloud computing through five essential characteristics that any true cloud service must exhibit. If a service lacks any of these, it's not really cloud — it's just hosted infrastructure:
mindmap
  root((Cloud Computing))
    On-Demand Self-Service
      Provision resources without human interaction
      API-driven automation
      Instant availability
    Broad Network Access
      Available over standard networks
      Accessible from any device
      Platform-independent
    Resource Pooling
      Multi-tenant model
      Location independence
      Dynamic resource assignment
    Rapid Elasticity
      Scale up and down automatically
      Appear unlimited to consumer
      Pay only for what you use
    Measured Service
      Usage is metered
      Pay-per-use billing
      Transparent monitoring
| Characteristic | What It Means | Real-World Example |
|---|---|---|
| On-Demand Self-Service | Provision resources (servers, storage, networks) without needing to contact a human | Spin up 100 VMs via API at 2 AM on a Saturday |
| Broad Network Access | Services available over standard networks, accessible from any device or platform | Access your cloud console from a phone, laptop, or tablet |
| Resource Pooling | Provider's resources are pooled across multiple tenants with dynamic assignment | Your VM shares physical hardware with other customers (isolated) |
| Rapid Elasticity | Resources can be scaled up or down automatically, appearing unlimited | Auto-scale from 2 to 200 instances during Black Friday traffic |
| Measured Service | Usage is metered, reported, and billed transparently (pay-per-use) | Billed $0.023 per GB-month of storage actually consumed |
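To make rapid elasticity concrete, here is a minimal Python sketch of a proportional scaling rule, similar in spirit to target-tracking autoscaling. The function name, the load metric, and the 2-to-200 bounds (borrowed from the Black Friday example in the table) are illustrative assumptions, not any provider's actual algorithm.

```python
import math


def desired_instances(current: int, observed_load: float, target_load: float,
                      min_instances: int = 2, max_instances: int = 200) -> int:
    """Size the fleet so that per-instance load moves toward the target.

    observed_load and target_load are per-instance values of the same metric
    (for example, average CPU percent). All names here are hypothetical.
    """
    if current < 1 or target_load <= 0:
        raise ValueError("current must be >= 1 and target_load must be positive")
    raw = math.ceil(current * observed_load / target_load)
    # Clamp to the configured floor and ceiling of the scaling group.
    return max(min_instances, min(max_instances, raw))


print(desired_instances(2, 90.0, 60.0))    # 3
print(desired_instances(50, 300.0, 60.0))  # 200 (raw value 250 is clamped)
print(desired_instances(40, 5.0, 60.0))    # 4
```

The clamp is what keeps elasticity from turning into an unbounded bill: the fleet follows demand, but only within limits you set.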
Service Models
Cloud service models define the boundary of responsibility between you (the customer) and the cloud provider. Think of it as a spectrum: at one end you manage everything; at the other, the provider manages nearly everything. NIST defines three service models (IaaS, PaaS, and SaaS); this part also covers FaaS, a newer model that sits between PaaS and SaaS on that spectrum.
graph TB
subgraph "On-Premises (You Manage Everything)"
A1[Application]
A2[Data]
A3[Runtime]
A4[Middleware]
A5[Operating System]
A6[Virtualization]
A7[Servers]
A8[Storage]
A9[Networking]
end
subgraph "IaaS (You Manage OS and Up)"
B1[Application]
B2[Data]
B3[Runtime]
B4[Middleware]
B5[Operating System]
B6[Virtualization — Provider]
B7[Servers — Provider]
B8[Storage — Provider]
B9[Networking — Provider]
end
subgraph "PaaS (You Manage Code and Data)"
C1[Application]
C2[Data]
C3[Runtime — Provider]
C4[Middleware — Provider]
C5[OS — Provider]
C6[Virtualization — Provider]
C7[Servers — Provider]
C8[Storage — Provider]
C9[Networking — Provider]
end
subgraph "SaaS (Provider Manages Everything)"
D1[Application — Provider]
D2[Data — Provider manages infra]
D3[Runtime — Provider]
D4[Middleware — Provider]
D5[OS — Provider]
D6[Virtualization — Provider]
D7[Servers — Provider]
D8[Storage — Provider]
D9[Networking — Provider]
end
IaaS — Infrastructure as a Service
IaaS provides the fundamental building blocks of cloud IT: virtual machines, storage volumes, and networks. You rent raw infrastructure and manage everything from the operating system up. It's the most flexible model but also the most operationally demanding.
You manage: OS, middleware, runtime, application, data, patching, security configuration
Provider manages: Physical hardware, hypervisor, networking fabric, physical security, power/cooling
Examples: AWS EC2, Azure Virtual Machines, GCP Compute Engine, DigitalOcean Droplets
Best for: Lift-and-shift migrations, custom OS requirements, full control over the stack, legacy applications
# Launch an IaaS VM on AWS
aws ec2 run-instances \
--image-id ami-0abcdef1234567890 \
--instance-type t3.medium \
--key-name my-key \
--subnet-id subnet-0123456789abcdef0 \
--security-group-ids sg-0123456789abcdef0 \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=web-server-01}]'
# Launch an IaaS VM on Azure
az vm create \
--resource-group my-rg \
--name web-server-01 \
--image Ubuntu2204 \
--size Standard_B2s \
--admin-username azureuser \
--generate-ssh-keys
# Launch an IaaS VM on GCP
gcloud compute instances create web-server-01 \
--zone=us-central1-a \
--machine-type=e2-medium \
--image-family=ubuntu-2204-lts \
--image-project=ubuntu-os-cloud
PaaS — Platform as a Service
PaaS abstracts away the operating system and runtime environment. You deploy your code and data; the platform handles everything else — OS patching, load balancing, scaling, and runtime updates. This dramatically reduces operational overhead but limits customization.
You manage: Application code, data, application-level configuration
Provider manages: Runtime, middleware, OS, hardware, networking, scaling, patching
Examples: AWS Elastic Beanstalk, Azure App Service, Google App Engine, Heroku, Railway
Best for: Web applications, APIs, microservices, developer productivity, rapid prototyping
# Deploy to Azure App Service (PaaS)
az webapp up \
--resource-group my-rg \
--name my-web-app \
--runtime "PYTHON:3.11" \
--sku B1
# Deploy to Google App Engine (PaaS)
# First create app.yaml in your project root:
# runtime: python311
# instance_class: F2
# automatic_scaling:
# min_instances: 1
# max_instances: 10
gcloud app deploy app.yaml --project=my-project
# Deploy to AWS Elastic Beanstalk (PaaS)
eb init my-app --platform python-3.11 --region us-east-1
eb create production --instance_type t3.small
SaaS — Software as a Service
SaaS is fully managed software delivered over the internet. You don't manage any part of the technology stack — you simply use the application. The provider handles everything: the application code, data infrastructure, scaling, updates, and security.
You manage: User configuration, access control (who can use it), your data within the application
Provider manages: Literally everything else — code, infrastructure, updates, availability
Examples: Microsoft 365, Salesforce, Slack, Zoom, GitHub, Datadog, Snowflake
Best for: End-user productivity, business applications, collaboration, when you want to consume not build
FaaS — Function as a Service
FaaS (often called "serverless compute") takes abstraction further than PaaS. You write individual functions that execute in response to events. There are no servers to provision, no runtime to configure — you pay only for the milliseconds your code actually runs.
You manage: Function code, event triggers, function-level configuration
Provider manages: Execution environment, scaling (including to zero), infrastructure, container lifecycle
Examples: AWS Lambda, Azure Functions, Google Cloud Functions, Cloudflare Workers
Best for: Event-driven architectures, webhooks, data processing pipelines, scheduled tasks, APIs with variable traffic
# Deploy an AWS Lambda function
aws lambda create-function \
--function-name process-order \
--runtime python3.11 \
--handler lambda_function.lambda_handler \
--role arn:aws:iam::123456789012:role/lambda-exec-role \
--zip-file fileb://function.zip \
--timeout 30 \
--memory-size 256
# Deploy an Azure Function
func azure functionapp publish my-function-app
# Deploy a Google Cloud Function
gcloud functions deploy process-order \
--runtime python311 \
--trigger-http \
--allow-unauthenticated \
--region us-central1 \
--memory 256MB
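For reference, the function code behind these deployments can be very small. The following is a hedged sketch of a Python handler that would match the --handler lambda_function.lambda_handler setting in the AWS example if saved as lambda_function.py. The event shape (order_id, items with price and qty) is a hypothetical example, not a format defined by AWS.

```python
import json


def lambda_handler(event, context):
    """Entry point: <file name>.<function name> = lambda_function.lambda_handler."""
    # Hypothetical event: {"order_id": "A-1001", "items": [{"price": 9.5, "qty": 2}]}
    items = event.get("items", [])
    total = sum(item["price"] * item["qty"] for item in items)
    # Return an HTTP-style response with a JSON string body.
    return {
        "statusCode": 200,
        "body": json.dumps({"order_id": event.get("order_id"), "total": total}),
    }


if __name__ == "__main__":
    # Local smoke test; the platform supplies event and context in production.
    sample = {"order_id": "A-1001",
              "items": [{"price": 9.5, "qty": 2}, {"price": 1.0, "qty": 3}]}
    print(lambda_handler(sample, None))
```

Everything outside this function (the execution environment, scaling, and the container lifecycle) is the provider's side of the FaaS boundary.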
Service Model Comparison
| Layer | On-Premises | IaaS | PaaS | FaaS | SaaS |
|---|---|---|---|---|---|
| Application | You | You | You | You | Provider |
| Data | You | You | You | You | Shared |
| Runtime | You | You | Provider | Provider | Provider |
| Middleware | You | You | Provider | Provider | Provider |
| Operating System | You | You | Provider | Provider | Provider |
| Virtualization | You | Provider | Provider | Provider | Provider |
| Servers | You | Provider | Provider | Provider | Provider |
| Storage | You | Provider | Provider | Provider | Provider |
| Networking | You | Provider | Provider | Provider | Provider |
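The comparison table can also be expressed as a small lookup, which is handy when you want to script a responsibility checklist. This is a minimal sketch that encodes the table as written above; the function and constant names are illustrative.

```python
# Stack layers from top to bottom, in the same order as the table.
LAYERS = ["Application", "Data", "Runtime", "Middleware", "Operating System",
          "Virtualization", "Servers", "Storage", "Networking"]

# How many layers, counted from the top, the customer manages in each model.
CUSTOMER_LAYER_COUNT = {"On-Premises": 9, "IaaS": 5, "PaaS": 2, "FaaS": 2, "SaaS": 0}


def owner(model: str, layer: str) -> str:
    """Return "You", "Provider", or "Shared" for a model/layer pair."""
    if model == "SaaS" and layer == "Data":
        return "Shared"  # the provider runs the data infrastructure; the data is yours
    return "You" if LAYERS.index(layer) < CUSTOMER_LAYER_COUNT[model] else "Provider"


def customer_managed(model: str) -> list[str]:
    """List the layers the customer fully manages under a given model."""
    return LAYERS[:CUSTOMER_LAYER_COUNT[model]]


print(customer_managed("IaaS"))
print(owner("PaaS", "Runtime"))  # Provider
print(owner("SaaS", "Data"))     # Shared
```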
Deployment Models
Deployment models describe where and how cloud infrastructure is hosted, owned, and shared. The choice of deployment model is driven by requirements around data sovereignty, compliance, latency, and cost.
Public Cloud
Infrastructure owned and operated by a third-party provider (AWS, Azure, GCP), delivered over the public internet. Resources are shared across multiple organizations (multi-tenant), though isolated at the hypervisor level. This is the most common deployment model.
Advantages: No upfront CapEx, global scale, broad service catalog, instant provisioning, elasticity
Disadvantages: Data leaves your premises, shared infrastructure (perception issue), vendor lock-in risk, egress costs
Private Cloud
Infrastructure dedicated to a single organization, either hosted on-premises or by a third party. Provides cloud-like self-service and elasticity but within a controlled environment. Often required for strict regulatory compliance (healthcare, government, finance).
Advantages: Full control, data sovereignty, compliance, customizable security, predictable performance
Disadvantages: High CapEx, limited scale, slow provisioning vs public cloud, requires specialized staff
Examples: VMware vSphere + vRealize, OpenStack, Azure Stack HCI, AWS Outposts
Hybrid Cloud
A combination of public and private cloud environments connected by networking, allowing data and applications to move between them. The key requirement is orchestration — the two environments must work together as a single architecture, not just coexist.
Advantages: Flexibility, keep sensitive workloads private while bursting to public, gradual migration path
Disadvantages: Complexity (networking, identity, security), skills gap, potential latency between environments
Use Cases: Cloud bursting (handle peaks in public cloud), data residency (keep EU data on-premises), gradual migration
Multi-Cloud
Using services from multiple public cloud providers simultaneously. This is distinct from hybrid (which is public + private). Multi-cloud might mean running compute on AWS, databases on GCP, and AI/ML on Azure — or distributing the same workload across providers for resilience.
Advantages: Avoid vendor lock-in, best-of-breed services, geographic reach, negotiating leverage, redundancy
Disadvantages: Massive operational complexity, skill dilution, inconsistent APIs, networking challenges, cost visibility
| Factor | Public Cloud | Private Cloud | Hybrid Cloud | Multi-Cloud |
|---|---|---|---|---|
| CapEx Required | None | Very High | High | None |
| Scalability | Near-infinite | Limited | High (burst to public) | Near-infinite |
| Data Control | Limited (provider region) | Full | Split | Distributed |
| Complexity | Low-Medium | High | Very High | Extreme |
| Vendor Lock-in Risk | High | Low (if open-source) | Medium | Low |
| Best For | Startups, SaaS, variable workloads | Regulated industries, sensitive data | Enterprises migrating, compliance | Large enterprises, best-of-breed |
The Shared Responsibility Model
The shared responsibility model is the single most important concept in cloud security. It defines a clear boundary: the cloud provider secures the infrastructure of the cloud, while you secure everything in the cloud. Misunderstanding this boundary, typically in the form of customer-side misconfiguration, is one of the most common causes of cloud security incidents.
graph TB
subgraph "Customer Responsibility (Security IN the Cloud)"
C1[Customer Data]
C2[Platform & Application Management]
C3[Identity & Access Management]
C4[Operating System & Network Configuration]
C5[Client-Side Encryption]
C6[Network Traffic Protection]
end
subgraph "Provider Responsibility (Security OF the Cloud)"
P1[Physical Security — Data Centers]
P2[Hardware — Servers, Storage, Networking]
P3[Hypervisor & Host OS]
P4[Global Network Infrastructure]
P5[Managed Service Infrastructure]
P6[Compliance Certifications]
end
C6 --> P1
What the Cloud Provider Is Responsible For
- Physical security: Guards, biometrics, surveillance, locked cages
- Hardware: Server procurement, maintenance, disposal, firmware updates
- Hypervisor/host OS: Patching and securing the virtualization layer
- Network infrastructure: Backbone connectivity, DDoS protection at the network edge
- Compliance: Achieving and maintaining SOC 2, ISO 27001, PCI DSS certifications for their infrastructure
What the Customer Is Responsible For
- Data classification and encryption: Encrypting sensitive data at rest and in transit
- Identity and access management: MFA, least privilege, role-based access control
- Network security: Security groups, NACLs, WAF rules, VPN configuration
- Application security: Code vulnerabilities, patching your dependencies
- OS patching (IaaS): Keeping guest OS updated and hardened
- Compliance: Ensuring your usage of the cloud meets YOUR regulatory requirements
How Responsibility Shifts by Service Model
| Responsibility | IaaS | PaaS | SaaS |
|---|---|---|---|
| Data Classification & Encryption | Customer | Customer | Customer |
| Identity & Access Management | Customer | Customer | Customer |
| Application Security | Customer | Customer | Provider |
| Network Controls | Customer | Shared | Provider |
| OS Patching | Customer | Provider | Provider |
| Runtime & Middleware | Customer | Provider | Provider |
| Physical Infrastructure | Provider | Provider | Provider |
Cloud Economics
Understanding cloud economics is critical for making sound infrastructure decisions. The shift from on-premises to cloud isn't simply "servers become subscription fees" — it fundamentally changes how organizations think about technology investment.
CapEx vs OpEx
| Aspect | CapEx (On-Premises) | OpEx (Cloud) |
|---|---|---|
| Cost Type | Large upfront investment | Pay-as-you-go, monthly billing |
| Accounting | Depreciated over 3-5 years | Expensed in current period |
| Capacity Planning | Must predict 3-5 years ahead | Adjust monthly or hourly |
| Risk | Over-provision or under-provision | Right-size continuously |
| Time to Deploy | Weeks to months (procurement) | Minutes (API call) |
| Staffing | Need hardware engineers, facility staff | Cloud architects, DevOps engineers |
| Hidden Costs | Power, cooling, floor space, insurance | Egress, cross-region transfer, API calls |
Cloud Pricing Models
Cloud providers offer multiple pricing tiers designed to reward commitment with discounts:
| Pricing Model | Discount | Commitment | Best For | Risk |
|---|---|---|---|---|
| On-Demand | 0% (baseline) | None | Variable workloads, testing, short-term | None — pay only for what you use |
| Reserved (1yr) | ~30-40% | 1-year term | Steady-state production workloads | Committed even if unused |
| Reserved (3yr) | ~50-72% | 3-year term | Long-running databases, core services | Significant lock-in |
| Spot/Preemptible | ~60-90% | None (can be reclaimed) | Batch processing, CI/CD, fault-tolerant | Instances reclaimed on short notice (2 minutes on AWS, about 30 seconds on Azure and GCP) |
| Savings Plans | ~30-60% | $/hr spend commitment | Flexible workloads across instance types | Must spend minimum per hour |
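A quick way to reason about the commitment risk in this table is the break-even utilization: a reservation is billed for every hour of the term, so it only pays off if the workload runs for more than (1 - discount) of that time. The sketch below assumes a simple model with no upfront payment and a flat hourly rate; the $0.10 rate is a hypothetical placeholder, not a real price.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760


def break_even_utilization(discount: float) -> float:
    """Fraction of the term a workload must run for a reservation to pay off."""
    return 1.0 - discount


def cheaper_option(on_demand_hourly: float, discount: float, utilization: float) -> str:
    """Compare one year of on-demand usage with a one-year reservation."""
    on_demand_cost = on_demand_hourly * HOURS_PER_YEAR * utilization
    reserved_cost = on_demand_hourly * (1.0 - discount) * HOURS_PER_YEAR  # billed even when idle
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


print(cheaper_option(0.10, 0.40, 0.50))  # on-demand
print(cheaper_option(0.10, 0.40, 0.90))  # reserved
```

With a 40% discount the break-even point is 60% utilization, which is why reservations suit steady-state production workloads and not environments that are idle most of the week.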
Total Cost of Ownership (TCO)
A fair cloud vs on-premises comparison must account for all costs, not just the sticker price of a server. TCO includes:
- Hardware costs: Servers, storage, networking equipment, spare parts
- Facility costs: Data center space, power, cooling, physical security, fire suppression
- Personnel costs: Hardware engineers, network engineers, facility managers, 24/7 NOC
- Software licenses: Hypervisor licenses, OS licenses, management tools
- Lifecycle costs: Hardware refresh every 3-5 years, decommissioning, e-waste
- Opportunity cost: Money tied up in depreciating assets vs invested elsewhere
Cost Optimization Strategies
| Strategy | Typical Savings | Implementation Effort | Description |
|---|---|---|---|
| Right-Sizing | 20-40% | Low | Match instance size to actual usage (most VMs are over-provisioned) |
| Reserved Capacity | 30-72% | Low | Commit to 1-3 year terms for steady workloads |
| Spot Instances | 60-90% | Medium | Use interruptible capacity for fault-tolerant workloads |
| Auto-Scaling | 20-50% | Medium | Scale down during off-peak, scale up during peak |
| Scheduled Shutdowns | 40-70% | Low | Turn off dev/test environments nights and weekends |
| Storage Tiering | 50-80% | Low | Move cold data to cheaper tiers (Glacier, Cool, Archive) |
| Architecture Optimization | 30-60% | High | Move from IaaS to PaaS/serverless where appropriate |
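The savings range for scheduled shutdowns follows directly from the number of hours an environment actually needs to run. A minimal sketch, assuming compute is billed only while instances are running and ignoring storage charges that continue while they are stopped:

```python
HOURS_PER_WEEK = 7 * 24  # 168


def shutdown_savings(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of always-on compute cost saved by running only on a schedule."""
    running_hours = hours_per_day * days_per_week
    return 1.0 - running_hours / HOURS_PER_WEEK


print(f"{shutdown_savings(12, 5):.0%}")  # 64%
print(f"{shutdown_savings(10, 5):.0%}")  # 70%
```

A dev environment that runs 12 hours a day on weekdays is up for 60 of 168 hours, which lands inside the 40-70% range quoted in the table.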
The Big Three: AWS, Azure, GCP
The public cloud market is dominated by three hyperscale providers that together control approximately 67% of global cloud spending. Each has distinct strengths, histories, and philosophical approaches to cloud services.
Amazon Web Services (AWS)
Founded: 2006 (first mover) | Market Share: ~31% | Regions: 34+ | Services: 200+
AWS was first to market and has the broadest and deepest service catalog. Its philosophy is "build primitives and let customers compose them." This gives maximum flexibility but can feel overwhelming — AWS often has 3-5 ways to accomplish the same task.
Strengths: Broadest service catalog, largest ecosystem, most mature managed services, strongest serverless platform (Lambda), deepest marketplace
Considerations: Complex pricing, naming conventions can be confusing (SQS, SNS, SES, etc.), console UX is functional but dense
Microsoft Azure
Founded: 2010 | Market Share: ~25% | Regions: 60+ | Services: 200+
Azure's strength is enterprise integration. If your organization runs Microsoft 365, Active Directory, SQL Server, or .NET, Azure provides the tightest integration. Its hybrid story (Azure Arc, Azure Stack) is the strongest in the industry.
Strengths: Enterprise/Microsoft ecosystem integration, strongest hybrid cloud (Arc, Stack HCI), Azure AD/Entra ID for identity, compliance certifications for government, excellent developer experience with VS Code + GitHub
Considerations: Service naming changes frequently, documentation quality varies, some services less mature than AWS equivalents
Google Cloud Platform (GCP)
Founded: 2008 (public 2011) | Market Share: ~11% | Regions: 40+ | Services: 150+
GCP is built on Google's internal infrastructure (Borg → Kubernetes, Spanner, BigQuery). Its strengths are data analytics, machine learning, global networking, and developer experience. GCP's philosophy favors opinionated, well-designed services over breadth.
Strengths: Superior data/analytics (BigQuery), best Kubernetes experience (GKE), global network (private backbone), strong AI/ML (Vertex AI, TPUs), clean API design
Considerations: Smaller service catalog, enterprise features maturing, perception of product deprecation risk
Service Mapping Across Providers
| Category | AWS | Azure | GCP |
|---|---|---|---|
| Virtual Machines | EC2 | Virtual Machines | Compute Engine |
| Serverless Compute | Lambda | Functions | Cloud Functions |
| Containers (Managed K8s) | EKS | AKS | GKE |
| Container Service | ECS / Fargate | Container Apps | Cloud Run |
| Object Storage | S3 | Blob Storage | Cloud Storage |
| Block Storage | EBS | Managed Disks | Persistent Disk |
| File Storage | EFS | Azure Files | Filestore |
| Relational Database | RDS / Aurora | SQL Database / Azure Database for PostgreSQL | Cloud SQL / AlloyDB |
| NoSQL Database | DynamoDB | Cosmos DB | Firestore / Bigtable |
| Data Warehouse | Redshift | Synapse Analytics | BigQuery |
| VPC / Networking | VPC | Virtual Network (VNet) | VPC |
| Load Balancer | ALB / NLB / ELB | Load Balancer / App Gateway | Cloud Load Balancing |
| DNS | Route 53 | Azure DNS | Cloud DNS |
| CDN | CloudFront | Azure CDN / Front Door | Cloud CDN |
| IAM | IAM | Entra ID (Azure AD) + RBAC | Cloud IAM |
| Monitoring | CloudWatch | Monitor / App Insights | Cloud Monitoring |
| IaC Service | CloudFormation | ARM / Bicep | Deployment Manager / Config Connector |
| Message Queue | SQS | Service Bus / Queue Storage | Pub/Sub |
| AI/ML Platform | SageMaker | Azure AI / ML Studio | Vertex AI |
Cloud Architecture Patterns
Regions and Availability Zones
Cloud providers organize their infrastructure into Regions (geographic areas like us-east-1, westeurope, asia-east1) and Availability Zones (AZs — isolated data centers within a region connected by low-latency links). Understanding this hierarchy is fundamental to designing resilient architectures.
graph TB
subgraph "AWS Region: us-east-1 (N. Virginia)"
subgraph "AZ: us-east-1a"
DC1[Data Center 1]
DC2[Data Center 2]
end
subgraph "AZ: us-east-1b"
DC3[Data Center 3]
DC4[Data Center 4]
end
subgraph "AZ: us-east-1c"
DC5[Data Center 5]
DC6[Data Center 6]
end
end
DC1 ---|"< 2ms latency"| DC3
DC3 ---|"< 2ms latency"| DC5
DC1 ---|"< 2ms latency"| DC5
| Concept | Description | Failure Domain | Example |
|---|---|---|---|
| Region | Geographic area with 2-6 AZs | Natural disaster, country-level outage | us-east-1, eu-west-1, asia-southeast1 |
| Availability Zone | 1+ data centers with independent power/cooling/networking | Single facility failure (fire, flood, power) | us-east-1a, us-east-1b |
| Edge Location | CDN point of presence for content caching | Local connectivity | CloudFront PoP in Chicago |
| Local Zone | Extension of a region closer to users | Local infrastructure | us-west-2-lax-1a (Los Angeles) |
High Availability Patterns
High Availability (HA) means designing systems that continue operating even when individual components fail. In cloud, this is primarily achieved by distributing resources across multiple AZs or regions.
graph TB
Users[Users / Internet] --> LB[Load Balancer — Multi-AZ]
subgraph "Availability Zone A"
LB --> WebA[Web Server A]
WebA --> AppA[App Server A]
AppA --> DB_Primary[Database Primary]
end
subgraph "Availability Zone B"
LB --> WebB[Web Server B]
WebB --> AppB[App Server B]
AppB --> DB_Standby[Database Standby — Sync Replication]
end
DB_Primary ---|"Synchronous Replication"| DB_Standby
Key HA principles:
- Eliminate single points of failure: Every component should have a redundant pair
- Use managed services: Managed databases (RDS Multi-AZ) handle failover automatically
- Design for failure: Assume any component can fail at any time
- Test failover regularly: Chaos engineering (Netflix Chaos Monkey approach)
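The first principle can be checked with simple arithmetic: spread instances evenly across AZs, then ask how much capacity survives the loss of one zone. The sketch below is illustrative only; the AZ names follow the us-east-1 example used earlier, and the round-robin placement is an assumption, not how any particular scheduler works.

```python
from collections import Counter


def spread_across_azs(instance_count: int, azs: list[str]) -> dict[str, int]:
    """Place instances round-robin so no AZ holds more than one extra instance."""
    placement = Counter()
    for i in range(instance_count):
        placement[azs[i % len(azs)]] += 1
    return dict(placement)


def surviving_capacity(placement: dict[str, int], failed_az: str) -> float:
    """Fraction of total capacity still running after one AZ is lost."""
    total = sum(placement.values())
    return (total - placement.get(failed_az, 0)) / total


three_az = spread_across_azs(6, ["us-east-1a", "us-east-1b", "us-east-1c"])
two_az = spread_across_azs(6, ["us-east-1a", "us-east-1b"])
print(f"{surviving_capacity(three_az, 'us-east-1a'):.0%}")  # 67%
print(f"{surviving_capacity(two_az, 'us-east-1a'):.0%}")    # 50%
```

The practical consequence is headroom: with two AZs each zone must be able to carry the full load alone, while with three AZs each zone needs to carry only half of it after a failure.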
Disaster Recovery Strategies
Disaster Recovery (DR) protects against region-level failures. The four standard DR strategies trade cost against recovery speed:
| Strategy | RTO | RPO | Cost | Description |
|---|---|---|---|---|
| Backup & Restore | Hours | Hours | $ | Regular backups to another region; restore from backup on failure |
| Pilot Light | 10-30 min | Minutes | $$ | Core services running (DB replication); scale up on failure |
| Warm Standby | Minutes | Seconds | $$$ | Scaled-down copy of production running; scale up on failure |
| Active-Active (Multi-Region) | ~0 (automatic) | ~0 | $$$$ | Full production in multiple regions; traffic routes around failures |
RTO = Recovery Time Objective (how long until you're back online)
RPO = Recovery Point Objective (how much data you can afford to lose)
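One way to use the table is to start from the business requirement and pick the cheapest strategy that satisfies it. The sketch below does that in Python. The numeric RTO and RPO ceilings are assumptions chosen to stand in for the qualitative entries in the table ("Hours", "Minutes", "Seconds"); replace them with figures measured in your own recovery tests.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    name: str
    rto_minutes: float  # assumed worst-case time to recover
    rpo_minutes: float  # assumed worst-case data loss window
    cost_rank: int      # 1 = cheapest ($), 4 = most expensive ($$$$)


# Illustrative ceilings only; the table gives ranges, not exact numbers.
STRATEGIES = [
    Strategy("Backup & Restore", 24 * 60, 24 * 60, 1),
    Strategy("Pilot Light", 30, 15, 2),
    Strategy("Warm Standby", 10, 1, 3),
    Strategy("Active-Active (Multi-Region)", 0, 0, 4),
]


def cheapest_strategy(max_rto_minutes: float, max_rpo_minutes: float) -> str:
    """Return the lowest-cost strategy that meets both recovery objectives."""
    if max_rto_minutes < 0 or max_rpo_minutes < 0:
        raise ValueError("objectives must be non-negative")
    candidates = [s for s in STRATEGIES
                  if s.rto_minutes <= max_rto_minutes and s.rpo_minutes <= max_rpo_minutes]
    return min(candidates, key=lambda s: s.cost_rank).name


print(cheapest_strategy(60, 30))  # Pilot Light
print(cheapest_strategy(15, 5))   # Warm Standby
```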
The Well-Architected Framework
All three major providers publish Well-Architected Frameworks. The AWS framework is organized into the six pillars below; Azure and Google Cloud use closely overlapping sets, so the details differ but the themes are largely consistent:
| Pillar | Focus | Key Questions |
|---|---|---|
| Operational Excellence | Run and monitor systems, improve processes | How do you respond to unplanned events? How do you evolve? |
| Security | Protect information, systems, and assets | How do you manage identities? How do you detect threats? |
| Reliability | Recover from failures, meet demand | How do you handle component failures? How do you test recovery? |
| Performance Efficiency | Use resources efficiently as demand changes | How do you select the right instance type? How do you monitor? |
| Cost Optimization | Avoid unnecessary costs | How do you govern usage? How do you decommission unused resources? |
| Sustainability | Minimize environmental impact | How do you select efficient regions? How do you right-size? |
Getting Started with Cloud
Account Setup and Security
Before deploying your first resource, secure your cloud account. Many cloud security incidents trace back to misconfigured accounts rather than sophisticated attacks.
# AWS — Initial account security setup
# 1. Create an IAM admin user (don't use root)
aws iam create-user --user-name admin-user
aws iam attach-user-policy \
--user-name admin-user \
--policy-arn arn:aws:iam::aws:policy/AdministratorAccess
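# 2. Set up billing alarm (alerts at $50)
# Billing metrics are published only in us-east-1 and need the Currency dimension.
# "Receive Billing Alerts" must also be enabled in the account's billing preferences.
aws cloudwatch put-metric-alarm \
--region us-east-1 \
--alarm-name "billing-alarm-50" \
--metric-name EstimatedCharges \
--namespace AWS/Billing \
--dimensions Name=Currency,Value=USD \
--statistic Maximum \
--period 21600 \
--threshold 50 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 1 \
--alarm-actions arn:aws:sns:us-east-1:123456789012:billing-alerts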
# 3. Enable CloudTrail (audit logging)
aws cloudtrail create-trail \
--name management-trail \
--s3-bucket-name my-cloudtrail-bucket \
--is-multi-region-trail
aws cloudtrail start-logging --name management-trail
# Azure — Initial account security setup
# 1. Create a resource group for organization
az group create --name core-infrastructure --location eastus
# 2. Set up budget alert
az consumption budget create \
--budget-name monthly-budget \
--amount 100 \
--category Cost \
--time-grain Monthly \
--start-date 2026-05-01 \
--end-date 2027-05-01
# 3. Enable diagnostic logging
az monitor diagnostic-settings create \
--name audit-logs \
--resource "/subscriptions/{sub-id}" \
--logs '[{"category":"Administrative","enabled":true}]' \
--storage-account "/subscriptions/{sub-id}/resourceGroups/core-infrastructure/providers/Microsoft.Storage/storageAccounts/auditlogs"
# GCP — Initial account security setup
# 1. Create a project
gcloud projects create my-first-project --name="My First Project"
gcloud config set project my-first-project
# 2. Enable billing budget alerts
gcloud billing budgets create \
--billing-account=BILLING_ACCOUNT_ID \
--display-name="Monthly Budget" \
--budget-amount=100 \
--threshold-rule=percent=50 \
--threshold-rule=percent=90 \
--threshold-rule=percent=100
# 3. Enable audit logging
gcloud projects get-iam-policy my-first-project --format=json > policy.json
# Edit policy.json to add audit logging configuration
gcloud projects set-iam-policy my-first-project policy.json
CLI Tools Overview
Every cloud provider offers a command-line interface that enables infrastructure automation. These are essential tools for any cloud engineer:
| Provider | CLI Tool | Install Command | Auth Command |
|---|---|---|---|
| AWS | aws | pip install awscli or MSI installer | aws configure |
| Azure | az | curl -sL https://aka.ms/InstallAzureCLIDeb \| sudo bash | az login |
| GCP | gcloud | curl https://sdk.cloud.google.com \| bash | gcloud auth login |
# Quick verification commands after installation
# AWS — Check identity
aws sts get-caller-identity
# Output: UserId, Account, Arn
# Azure — Check account
az account show --output table
# Output: Name, SubscriptionId, TenantId, State
# GCP — Check project
gcloud config list
# Output: account, project, region, zone
Free Tier Overview
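All three providers offer free tiers for learning and experimentation. These are invaluable for getting hands-on experience at little or no cost, provided you set billing alerts and stay within the limits. Some of the allowances below last only for the first 12 months while others are always free, and the terms change over time, so check each provider's current free tier page:
| Provider | Free Credits | Free Tier Highlights | Notes and Gotchas |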
|---|---|---|---|
| AWS | 12 months of free tier | 750 hrs/month t2.micro, 5GB S3, 25GB DynamoDB, 1M Lambda requests/month | Some services auto-scale beyond free tier limits |
| Azure | $200 credit (30 days) + 12 months | 750 hrs B1s VM, 5GB Blob Storage, 250GB SQL Database, 1M Functions requests | $200 credit expires in 30 days regardless of usage |
| GCP | $300 credit (90 days) + always-free | 1 e2-micro VM, 5GB Cloud Storage, 1TB BigQuery queries/month, 2M Cloud Functions | Most generous always-free tier for compute |
Hands-On Exercises
Service Model Classification Challenge
Classify each of the following services into the correct service model (IaaS, PaaS, SaaS, or FaaS). For each, explain why it belongs to that category by identifying what the customer manages vs what the provider manages:
- AWS EC2 with a custom AMI
- Google Sheets
- Azure Functions triggered by a queue
- Heroku with a Git-push deploy
- DigitalOcean Droplet running Ubuntu
- Salesforce CRM
- AWS Lambda processing S3 events
- Google App Engine (standard environment)
- Microsoft 365 Exchange Online
- Azure Virtual Machines running Windows Server
- Cloudflare Workers
- AWS RDS (managed PostgreSQL)
- Snowflake Data Warehouse
- GitHub Codespaces
- GCP Compute Engine with custom image
Bonus: For services that blur the line (like managed databases), argue which model they most closely fit and why.
Design a High-Availability Architecture
You're designing a web application for an e-commerce company that requires 99.99% availability (less than 53 minutes downtime per year). Design the architecture on paper (or whiteboard) addressing:
- Compute layer: How many AZs? What happens when one AZ fails?
- Database layer: Primary/standby? Read replicas? Multi-region?
- Load balancing: Where? What type? Health checks?
- Static assets: CDN? Which regions?
- DNS: Failover routing? Latency-based?
- Disaster recovery: Which strategy? What's the RTO/RPO?
Draw a diagram showing the complete architecture. Label each component with the AWS/Azure/GCP service you'd use. Calculate the theoretical availability using the formula: Availability = 1 - (1 - AZ_availability)^num_AZs. The formula assumes that AZ failures are independent and that the surviving AZs can carry the full load.
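If you want to check your arithmetic for this exercise, the formula and the downtime budget translate directly into a few lines of Python. This is a minimal sketch; the 99% per-AZ availability used in the example is a hypothetical input, not a published figure.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def multi_az_availability(az_availability: float, num_azs: int) -> float:
    """Availability = 1 - (1 - AZ_availability)^num_AZs, assuming independent failures."""
    return 1 - (1 - az_availability) ** num_azs


def downtime_minutes_per_year(availability: float) -> float:
    """Convert an availability fraction into allowed downtime per year."""
    return (1 - availability) * MINUTES_PER_YEAR


print(round(downtime_minutes_per_year(0.9999), 2))   # 52.56
print(f"{multi_az_availability(0.99, 2):.4%}")       # 99.9900%
print(f"{multi_az_availability(0.99, 3):.4%}")       # 99.9999%
```

The first line confirms the "less than 53 minutes" budget for 99.99%. The other two show why adding AZs helps so quickly, as long as their failures really are independent.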
Deploy Your First Cloud Resource
Sign up for a free tier account on any cloud provider and deploy a basic resource using the CLI. Follow these steps:
- Create account: Sign up at aws.amazon.com/free, azure.microsoft.com/free, or cloud.google.com/free
- Secure the account: Enable MFA, set up billing alerts at $5 and $10
- Install CLI: Install the provider's CLI tool and authenticate
- Deploy a resource: Create a small VM or storage bucket using the CLI
- Verify: Confirm the resource exists via both CLI and web console
- Clean up: Delete the resource to avoid charges
- Review billing: Check the billing dashboard to confirm $0 charges
Document: Take screenshots of each step. Note what surprised you about the process — what was easier or harder than expected?
Conclusion & Next Steps
Cloud computing is not merely a technology shift — it's an operational paradigm change. In this article, we've covered the essential foundations:
- Service models (IaaS, PaaS, SaaS, FaaS) and their responsibility boundaries
- Deployment models (public, private, hybrid, multi-cloud) and when to use each
- The shared responsibility model — the most critical concept in cloud security
- Cloud economics — CapEx vs OpEx, pricing models, and cost optimization
- The Big Three providers and how their services map to each other
- Architecture patterns — regions, AZs, HA, and DR strategies
With these fundamentals in place, you now have the vocabulary and mental models needed to understand how infrastructure is provisioned, managed, and automated in the cloud era.
Next in the Series
In Part 8: Infrastructure as Code, we'll learn how to define cloud infrastructure declaratively using tools like Terraform, Pulumi, and CloudFormation. You'll go from clicking buttons in a console to expressing your entire infrastructure as version-controlled code that can be reviewed, tested, and deployed automatically.