Part 6: Infrastructure Storage

Introduction

If compute is the brain of your infrastructure, storage is its memory — and arguably its most critical component. Servers can be replaced, networks rerouted, and applications redeployed, but data loss is permanent. Every database record, every customer transaction, every machine learning model, and every compliance log ultimately lives on a storage system.

Understanding storage is essential for every infrastructure engineer because virtually every architectural decision — from choosing a database to designing a disaster recovery plan — depends on the underlying storage characteristics: how fast it is, how reliable it is, and how much it costs.

                            
Key Insight: Storage is the one domain where mistakes are truly unforgivable. A misconfigured server wastes money. A misconfigured network causes downtime. But a misconfigured storage system (missing backups, wrong RAID level, no replication) can destroy an entire business.
                        

The Modern Storage Landscape

Modern infrastructure offers three fundamental storage paradigms, each designed for different access patterns:

The Three Storage Paradigms

flowchart TD
    A[Application] --> B{What access<br/>pattern?}
    B -->|Random read/write<br/>at byte level| C[Block Storage]
    B -->|HTTP API<br/>whole objects| D[Object Storage]
    B -->|Shared filesystem<br/>hierarchical| E[File Storage]
    C --> C1[Databases<br/>Boot volumes<br/>Transactions]
    D --> D1[Backups<br/>Media files<br/>Data lakes]
    E --> E1[Shared data<br/>Home dirs<br/>CMS content]

In this article, we'll start at the hardware level — how data is physically stored on magnetic platters and flash cells — then work our way up through block, object, and file abstractions, storage protocols, cloud architectures, and finally data lifecycle management. By the end, you'll be able to design, provision, and optimize storage for any workload.

Storage Fundamentals

How Data Is Stored at the Hardware Level

Before discussing cloud abstractions, it's important to understand what physically stores your data. The three dominant storage technologies each have radically different performance characteristics:

HDD (Hard Disk Drives) — Magnetic Storage

HDDs store data on spinning magnetic platters read by a mechanical arm. They've been the backbone of data storage since the 1950s and still dominate for high-capacity, cost-sensitive workloads like backups and archives.

Capacity: Up to 30+ TB per drive (2026)
Sequential throughput: 150–250 MB/s
Random IOPS: 75–200 (limited by mechanical seek time)
Latency: 2–10 ms (seek + rotational delay)
Cost: ~$0.015/GB — cheapest per gigabyte

SSD (Solid State Drives) — NAND Flash via SATA/SAS

SSDs use NAND flash memory with no moving parts. Connected via SATA or SAS interfaces, they offer a massive performance leap over HDDs while remaining compatible with existing server infrastructure.

Capacity: Up to 30+ TB per drive
Sequential throughput: 500–600 MB/s (SATA III limit)
Random IOPS: 50,000–100,000
Latency: 25–100 µs
Cost: ~$0.05–0.10/GB

NVMe (Non-Volatile Memory Express) — Flash via PCIe

NVMe eliminates the SATA/SAS bottleneck by connecting flash storage directly to the PCIe bus. This parallel, low-latency protocol was designed from the ground up for flash, unlocking performance that traditional interfaces cannot match.

Capacity: Up to 30+ TB per drive
Sequential throughput: 3,000–14,000 MB/s (PCIe Gen4/Gen5)
Random IOPS: 500,000–2,000,000
Latency: 10–20 µs
Cost: ~$0.08–0.15/GB
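
To make the throughput gap concrete, the sketch below estimates how long a full sequential read of 1 TB would take at the upper-bound throughput figures listed above. The 1 TB size and the helper name `seconds_to_read` are illustrative assumptions, and real drives rarely sustain their peak rate for a whole transfer.

```python
# Rough sequential-read time for 1 TB at the peak throughputs quoted above.
# Assumes decimal units (1 TB = 1,000,000 MB) and sustained peak throughput.

PEAK_THROUGHPUT_MB_S = {
    "HDD": 250,
    "SATA SSD": 600,
    "NVMe (PCIe Gen5)": 14_000,
}


def seconds_to_read(size_mb: float, throughput_mb_s: float) -> float:
    """Return the time in seconds needed to stream size_mb at throughput_mb_s."""
    return size_mb / throughput_mb_s


one_tb_mb = 1_000_000
for device, mb_s in PEAK_THROUGHPUT_MB_S.items():
    minutes = seconds_to_read(one_tb_mb, mb_s) / 60
    print(f"{device}: {minutes:.1f} minutes")
```

The same function works for sizing backup or restore windows: plug in the dataset size and the throughput you actually measure rather than the datasheet peak.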

                            
Why NVMe Is a Game-Changer: SATA (AHCI) uses a single command queue that is 32 commands deep. NVMe supports up to 65,535 I/O queues, each up to 65,536 commands deep, which is roughly a 134-million-fold increase in theoretical parallelism. This is a large part of why NVMe drives can sustain millions of IOPS while SATA SSDs plateau at ~100K.
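
The 134-million figure follows directly from the queue counts and depths quoted above. A minimal check, treating those numbers as theoretical protocol limits (real drives typically expose far fewer queues); the helper name `max_outstanding` is made up for this sketch:

```python
# Compare the maximum number of in-flight commands for SATA and NVMe,
# using the queue counts and depths quoted in the text.

SATA_QUEUES, SATA_DEPTH = 1, 32
NVME_QUEUES, NVME_DEPTH = 65_535, 65_536


def max_outstanding(queues: int, depth: int) -> int:
    """Upper bound on commands in flight: number of queues times queue depth."""
    return queues * depth


sata = max_outstanding(SATA_QUEUES, SATA_DEPTH)
nvme = max_outstanding(NVME_QUEUES, NVME_DEPTH)
print(f"SATA: {sata:,} commands in flight")
print(f"NVMe: {nvme:,} commands in flight")
print(f"Ratio: {nvme // sata:,}x")
```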
                        

Block vs Object vs File Storage

These three paradigms differ in how data is organized, accessed, and managed:

Characteristic	Block Storage	Object Storage	File Storage
Data unit	Fixed-size blocks (512B–4KB)	Variable-size objects (bytes to TBs)	Files in hierarchical directories
Access method	Device path (e.g., /dev/sda1)	HTTP REST API (GET/PUT/DELETE)	Mount point + POSIX path
Metadata	Minimal (managed by filesystem)	Rich custom metadata per object	Standard file attributes (owner, perms)
Modification	In-place block overwrites	Replace entire object (immutable)	In-place byte-level edits
Scalability	TB scale (single volume)	Exabyte scale (flat namespace)	TB–PB scale (shared mount)
Performance	Lowest latency (µs)	Higher latency (ms, HTTP overhead)	Moderate latency (network dependent)
Best for	Databases, OS boot, transactions	Backups, media, data lakes, archives	Shared files, home dirs, CMS

Storage Performance Metrics

Three metrics define storage performance. Understanding their interplay is critical for capacity planning:

IOPS (Input/Output Operations Per Second): How many read/write operations the device can handle per second. Critical for databases and transactional workloads with many small random reads/writes.
Throughput (MB/s or GB/s): How much data can be transferred per second. Critical for streaming workloads like video processing, backups, and data analytics that read/write large sequential blocks.
Latency (ms or µs): The time delay between requesting data and receiving it. Critical for real-time applications, interactive databases, and anything user-facing.

                            
The Relationship: Throughput = IOPS × Block Size. A drive doing 10,000 IOPS with a 4KB block size delivers 40 MB/s throughput. The same drive doing 10,000 IOPS with a 256KB block size delivers about 2.5 GB/s (2,560 MB/s). Always consider the block size your application uses.
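
The relationship can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 MB = 1,000 KB), and the function name `throughput_mb_s` is only illustrative:

```python
def throughput_mb_s(iops: int, block_size_kb: int) -> float:
    """Throughput in MB/s for a given IOPS rate and block size (decimal units)."""
    return iops * block_size_kb / 1_000


# Same IOPS, different block sizes: throughput scales with the block size.
for block_kb in (4, 64, 256):
    mb_s = throughput_mb_s(10_000, block_kb)
    print(f"10,000 IOPS x {block_kb} KB = {mb_s:,.0f} MB/s")
```

In practice a device rarely holds the same IOPS at every block size, so treat the result as an upper bound for a given IOPS measurement.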
                        

RAID Levels

RAID (Redundant Array of Independent Disks) combines multiple physical drives into a single logical unit for improved performance, redundancy, or both. Understanding RAID is useful even in the cloud, because managed storage services apply the same ideas (striping, mirroring, and parity-style redundancy) behind the scenes, usually combined with network replication.

RAID Level	Description	Min Disks	Usable Capacity	Fault Tolerance	Read Perf	Write Perf	Best For
RAID 0	Striping (no redundancy)	2	100%	None	Excellent	Excellent	Scratch/temp data
RAID 1	Mirroring	2	50%	1 disk failure	Good	Moderate	OS/boot volumes
RAID 5	Striping + distributed parity	3	(N−1)/N	1 disk failure	Good	Moderate	General purpose
RAID 6	Striping + double parity	4	(N−2)/N	2 disk failures	Good	Slow	Critical data archives
RAID 10	Mirrored stripes	4	50%	1 per mirror pair	Excellent	Good	Databases
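
The usable-capacity column can be turned into a small calculator. This is a sketch under simple assumptions: all disks are the same size, filesystem and controller overhead are ignored, and the function name `raid_usable_tb` is hypothetical.

```python
MIN_DISKS = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def raid_usable_tb(level: str, disks: int, disk_tb: float) -> float:
    """Usable capacity in TB for a RAID set built from identical disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == "10" and disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks")

    if level == "0":
        return disks * disk_tb        # striping only, nothing reserved
    if level == "1":
        return disk_tb                # every disk holds a full copy
    if level == "5":
        return (disks - 1) * disk_tb  # one disk's worth of parity
    if level == "6":
        return (disks - 2) * disk_tb  # two disks' worth of parity
    return disks / 2 * disk_tb        # RAID 10: half the disks are mirrors


for level, n in (("0", 4), ("1", 2), ("5", 4), ("6", 6), ("10", 4)):
    print(f"RAID {level} with {n} x 10 TB disks: {raid_usable_tb(level, n, 10):.0f} TB usable")
```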

RAID 10 Architecture (Mirror + Stripe)

flowchart TB
    subgraph RAID10["RAID 10 (4 Disks)"]
        direction TB
        STRIPE["Stripe Layer (RAID 0)"]
        subgraph M1["Mirror Pair 1 (RAID 1)"]
            D1["Disk 1<br/>A1, A2, A3"]
            D2["Disk 2<br/>A1, A2, A3"]
        end
        subgraph M2["Mirror Pair 2 (RAID 1)"]
            D3["Disk 3<br/>B1, B2, B3"]
            D4["Disk 4<br/>B1, B2, B3"]
        end
        STRIPE --> M1
        STRIPE --> M2
    end

                            
Warning: RAID Is Not a Backup. RAID protects against disk failure, not against accidental deletion, ransomware, corruption, or site-level disasters. You always need separate backups in addition to RAID.
                        

Block Storage

What Is Block Storage?

Block storage presents raw storage volumes to compute instances as virtual hard drives. The operating system sees a block device (e.g., /dev/xvdf) and can format it with any filesystem (ext4, XFS, NTFS), partition it, and use it exactly like a local physical disk.

Block storage is the performance king: it offers the lowest latency and highest IOPS because there's no HTTP overhead or object-metadata layer between the application and the data. This makes it the default choice for:

Databases: PostgreSQL, MySQL, MongoDB, and most other database engines depend on low-latency random I/O, which block storage is built to provide
Boot volumes: Operating systems must be installed on block devices
Transactional workloads: Payment processing, order management, or any system requiring consistent sub-millisecond I/O
High-performance computing: Scientific simulations, financial modeling

Cloud Block Storage Services

Every cloud provider offers managed block storage with multiple performance tiers. Understanding these tiers is critical for cost optimization: choosing io2 when gp3 would suffice can multiply the bill, because io2 adds a charge per provisioned IOPS on top of a higher per-GB price.

Provider / Type	IOPS	Throughput	Max Size	Use Case	~Cost/GB/mo
AWS EBS gp3	3,000 baseline (up to 16,000)	125 MB/s (up to 1,000)	16 TB	General purpose	$0.08
AWS EBS io2	Up to 256,000	Up to 4,000 MB/s	64 TB	Mission-critical databases	$0.125
AWS EBS st1	500 (baseline)	500 MB/s	16 TB	Throughput-intensive (logs, big data)	$0.045
Azure Premium SSD v2	Up to 80,000	Up to 1,200 MB/s	64 TB	Production databases	$0.082
Azure Standard SSD	Up to 6,000	Up to 750 MB/s	32 TB	Web servers, dev/test	$0.048
GCP pd-ssd	Up to 100,000	Up to 1,200 MB/s	64 TB	Enterprise databases	$0.170
GCP pd-balanced	Up to 80,000	Up to 1,200 MB/s	64 TB	General purpose	$0.100

Provisioning Block Storage with Terraform

Infrastructure as Code is the standard way to provision cloud block storage. Here's a complete Terraform example that creates an EBS volume and attaches it to an EC2 instance:

# Terraform: Create and attach an EBS volume to an EC2 instance
# Provider: AWS | Run: terraform init && terraform apply

provider "aws" {
  region = "us-east-1"
}

# Data source: get the latest Amazon Linux 2023 AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]
  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }
}

# EC2 instance
resource "aws_instance" "app_server" {
  ami               = data.aws_ami.amazon_linux.id
  instance_type     = "t3.medium"
  availability_zone = "us-east-1a"

  tags = {
    Name = "storage-demo-server"
  }
}

# GP3 volume for general workloads
resource "aws_ebs_volume" "app_data" {
  availability_zone = "us-east-1a"
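  size              = 100    # GiB
  type              = "gp3"
  iops              = 5000   # Above the 3,000 baseline; gp3 sets IOPS independently of size
  throughput        = 250    # MiB/s (gp3 baseline is 125)

  encrypted  = true
  # Assumes a KMS key with this alias already exists. The provider documents
  # this argument as a key ARN, so prefer the full ARN in real configurations.
  kms_key_id = "alias/ebs-key"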

  tags = {
    Name        = "app-data-volume"
    Environment = "production"
    Backup      = "daily"
  }
}

# Attach the volume to the instance
resource "aws_volume_attachment" "app_data_attach" {
  device_name = "/dev/xvdf"
  volume_id   = aws_ebs_volume.app_data.id
  instance_id = aws_instance.app_server.id
}

# Output the volume ID for reference
output "volume_id" {
  value = aws_ebs_volume.app_data.id
}

After Terraform provisions the volume, you'd SSH into the instance and format/mount it:

# Format and mount a newly attached EBS volume on Amazon Linux
# These commands run on the EC2 instance after the volume is attached

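# Check that the volume is visible.
# On Nitro-based instances such as t3, the volume shows up as an NVMe device
# (for example /dev/nvme1n1); Amazon Linux normally adds a /dev/xvdf symlink.
lsblk

# Create an XFS filesystem (a common choice for Linux production workloads)
# WARNING: mkfs erases any existing data on the device
sudo mkfs.xfs /dev/xvdf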

# Create mount point and mount
sudo mkdir -p /data
sudo mount /dev/xvdf /data

# Verify the mount
df -h /data

# Add to /etc/fstab for persistence across reboots
# Use UUID instead of device name (device names can change)
UUID=$(sudo blkid -s UUID -o value /dev/xvdf)
echo "UUID=$UUID /data xfs defaults,nofail 0 2" | sudo tee -a /etc/fstab

# Verify fstab entry works
sudo umount /data
sudo mount -a
df -h /data

Object Storage

What Is Object Storage?

Object storage stores data as discrete objects in a flat namespace, each identified by a unique key. Unlike block storage (which reads and writes fixed-size blocks) or file storage (which uses hierarchical directories), object storage treats each piece of data as a self-contained unit with three components:

Data: The actual content — a file, image, backup, log archive, or any binary blob
Metadata: Rich, custom key-value pairs (content-type, owner, retention-class, custom tags)
Unique identifier: A key that is unique within its bucket; combined with the globally unique bucket name, it forms the full address (e.g., s3://my-bucket/backups/2026/05/14/db-snapshot.tar.gz)
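
The three components and the flat namespace can be modeled in a few lines. This is a toy in-memory sketch, not how any real object store is implemented; `StoredObject` and `ToyBucket` are names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: payload bytes plus free-form metadata."""
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)


class ToyBucket:
    """In-memory stand-in for a bucket: a flat dict keyed by object key."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        # A PUT replaces the whole object; there is no partial update.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list[str]:
        # "Directories" are only a naming convention: listing filters by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


bucket = ToyBucket()
bucket.put("backups/2026/05/14/db-snapshot.tar.gz", b"...", retention_class="7yr")
bucket.put("logs/app.log", b"started", env="prod")
print(bucket.list_keys("backups/"))
print(bucket.get("logs/app.log").metadata)
```

Note that the slashes in the keys carry no structure of their own; the prefix filter is what makes them look like folders.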

Object Storage Architecture

flowchart LR
    CLIENT[Client App] -->|PUT /object| API[HTTP REST API]
    API --> BUCKET[Bucket / Container]
    BUCKET --> O1["Object 1<br/>Key: logs/app.log<br/>Size: 2MB<br/>Tags: env=prod"]
    BUCKET --> O2["Object 2<br/>Key: images/hero.jpg<br/>Size: 500KB<br/>Tags: public=true"]
    BUCKET --> O3["Object 3<br/>Key: backups/db.sql.gz<br/>Size: 50GB<br/>Tags: retain=7yr"]

Object storage excels at scale. While a single block volume tops out at 16–64 TB, object storage services like S3 can hold virtually unlimited data — individual objects up to 5 TB, with no limit on the total number of objects per bucket.

Cloud Object Storage Services

All major cloud providers offer object storage with multiple storage classes optimized for different access frequencies:

Storage Class	AWS S3	Azure Blob	GCP Cloud Storage	~Cost/GB/mo	Use Case
Hot / Standard	S3 Standard	Hot tier	Standard	$0.023	Frequently accessed data
Infrequent Access	S3 Standard-IA	Cool tier	Nearline	$0.0125	Monthly access, 30-day min
Cold	S3 Glacier Instant	Cold tier	Coldline	$0.004	Quarterly access, 90-day min
Archive	S3 Glacier Deep Archive	Archive tier	Archive	$0.00099	Yearly access, hours to retrieve
Intelligent	S3 Intelligent-Tiering	Lifecycle mgmt	Autoclass	Varies	Unknown/changing access patterns
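
To see what the per-GB differences mean at scale, the sketch below prices 10 TB in each class using the approximate figures from the table above. It covers storage only: retrieval, request, and minimum-duration charges are deliberately left out, and the 10 TB size and function name are assumptions for illustration.

```python
# Approximate per-GB monthly prices copied from the table above (USD).
PRICE_PER_GB_MONTH = {
    "Standard": 0.023,
    "Infrequent Access": 0.0125,
    "Cold": 0.004,
    "Archive": 0.00099,
}


def monthly_storage_cost(size_gb: float, storage_class: str) -> float:
    """Storage-only monthly cost in USD for size_gb in the given class."""
    return size_gb * PRICE_PER_GB_MONTH[storage_class]


for name in PRICE_PER_GB_MONTH:
    cost = monthly_storage_cost(10_000, name)
    print(f"{name}: ${cost:,.2f} per month for 10 TB")
```

A colder class only saves money if the data is rarely read, since the retrieval fees ignored here grow as the storage price drops.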

S3 Lifecycle Policies

Lifecycle policies automate data movement between storage tiers, reducing costs without manual intervention. Here's how to configure them using the AWS CLI:

# Create a lifecycle policy for an S3 bucket
# This moves objects through tiers and eventually deletes them

# First, create the lifecycle configuration JSON
cat > /tmp/lifecycle-policy.json <<'EOF'
{
  "Rules": [
    {
      "ID": "LogRetentionPolicy",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "logs/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER_IR"
        },
        {
          "Days": 365,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}
EOF

# Apply the lifecycle policy to the bucket
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-application-logs \
  --lifecycle-configuration file:///tmp/lifecycle-policy.json

# Verify the policy was applied
aws s3api get-bucket-lifecycle-configuration \
  --bucket my-application-logs
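
To reason about what the rule above does to an object over time, it can help to model the thresholds directly. The sketch below mirrors the day counts in the JSON; `storage_class_at` is a hypothetical helper, and it ignores the fact that S3 applies lifecycle actions asynchronously, so real transitions may lag the nominal day.

```python
# Transition thresholds mirror the lifecycle JSON: (minimum age in days, class).
# Ordered from oldest to newest so the first match wins.
TRANSITIONS = [
    (365, "DEEP_ARCHIVE"),
    (90, "GLACIER_IR"),
    (30, "STANDARD_IA"),
]
EXPIRATION_DAYS = 2555  # roughly 7 years


def storage_class_at(age_days: int) -> str:
    """Return the class an object under logs/ would be in at a given age."""
    if age_days >= EXPIRATION_DAYS:
        return "EXPIRED"
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            return storage_class
    return "STANDARD"


for age in (0, 30, 200, 400, 2555):
    print(f"day {age}: {storage_class_at(age)}")
```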

S3 Lifecycle Policy Flow

flowchart LR
    A["S3 Standard<br/>$0.023/GB/mo"] -->|30 days| B["Standard-IA<br/>$0.0125/GB/mo"]
    B -->|90 days| C["Glacier Instant<br/>$0.004/GB/mo"]
    C -->|365 days| D["Deep Archive<br/>$0.00099/GB/mo"]
    D -->|7 years| E["Deleted"]

Consistency Models

Understanding consistency is critical when building distributed systems on object storage:

Strong consistency: After a successful PUT, all subsequent GETs return the latest version. AWS S3 provides strong read-after-write consistency for all operations as of December 2020.
Eventual consistency: After a write, there's a brief window where reads may return stale data. Some object stores still behave this way, and even in S3 certain bucket-level configuration changes (for example, enabling versioning) can take a short time to propagate.

                            
Key Insight: AWS S3's move to strong consistency in 2020 was a landmark change. Previously, overwrites, deletes, and list operations were only eventually consistent, so applications often needed retry or coordination logic to avoid stale reads. With strong consistency, S3 now behaves predictably for object reads and listings: no more stale reads after PUT or DELETE.
                        

File Storage

What Is File Storage?

File storage provides a shared filesystem that multiple compute instances can mount simultaneously over a network. Data is organized in a familiar hierarchical directory structure (/home/user/documents/report.pdf) and accessed via standard POSIX operations (open, read, write, close).

File storage fills the gap between block storage (single-instance, high performance) and object storage (unlimited scale, HTTP access). It's ideal when:

Multiple servers need shared access: A web server fleet serving the same static content
Applications expect a filesystem: Legacy apps that read/write files can't be rewritten for S3 APIs
Home directories: User home directories shared across login servers
Content management: WordPress, Drupal, or other CMS platforms storing uploaded media

Cloud File Storage Services

Feature	AWS EFS	Azure Files	GCP Filestore
Protocol	NFSv4.1	SMB 3.0, NFS 4.1	NFSv3
Max size	Petabyte scale (elastic)	100 TB per share	63.9 TB per instance
Performance modes	Bursting, Provisioned, Elastic	Standard, Premium	Basic HDD/SSD, Enterprise
Multi-AZ	Yes (Standard class)	Yes (ZRS, GRS)	Yes (Enterprise tier)
Auto-scaling	Yes (grows/shrinks)	No (pre-provisioned)	No (pre-provisioned)
Best for	Linux workloads, containers	Windows/hybrid workloads	HPC, media rendering
~Cost/GB/mo	$0.30 (Standard), $0.016 (IA)	$0.06 (Hot), $0.015 (Cool)	$0.20 (Basic SSD)

# Mount an AWS EFS filesystem on an EC2 instance
# Prerequisites: the security group must allow NFS (TCP port 2049) between the instance and the EFS mount target

# Install the EFS mount helper
sudo yum install -y amazon-efs-utils

# Create mount point
sudo mkdir -p /mnt/shared

# Mount using the EFS mount helper (TLS encryption in transit)
sudo mount -t efs -o tls fs-0123456789abcdef0:/ /mnt/shared

# Verify the mount
df -h /mnt/shared
mount | grep efs

# Add to /etc/fstab for persistence
echo "fs-0123456789abcdef0:/ /mnt/shared efs _netdev,tls 0 0" | sudo tee -a /etc/fstab

Storage Protocols

Storage protocols define how data travels between compute instances and storage systems. Choosing the right protocol depends on your workload type, network infrastructure, and performance requirements.

iSCSI (Internet Small Computer Systems Interface)

iSCSI carries SCSI block commands over TCP/IP networks, making remote storage appear as locally attached disks. It's the standard for enterprise SAN (Storage Area Network) access over existing Ethernet infrastructure.

# Configure an iSCSI initiator on Linux to connect to a SAN target
# This makes remote block storage appear as a local disk

# Install iSCSI initiator tools
sudo apt install -y open-iscsi

# Discover available targets on the SAN
sudo iscsiadm --mode discovery --type sendtargets --portal 10.0.1.100

# Log in to a specific target
sudo iscsiadm --mode node \
  --targetname iqn.2026-01.com.example:storage.lun0 \
  --portal 10.0.1.100 --login

# Verify the new block device appeared
lsblk
# You should see a new disk like /dev/sdb

# Format and mount (same as any block device)
sudo mkfs.xfs /dev/sdb
sudo mkdir -p /mnt/san-volume
sudo mount /dev/sdb /mnt/san-volume

NFS (Network File System)

NFS is the standard protocol for sharing filesystems across Unix/Linux machines. Version 4.1 introduced pNFS (parallel NFS), which adds parallel data access for improved performance. NFS is also the protocol used by AWS EFS and GCP Filestore.

# Set up an NFS server and client on Linux
# Server side: share the /exports/shared directory

# Install NFS server
sudo apt install -y nfs-kernel-server

# Create the shared directory
sudo mkdir -p /exports/shared
sudo chown nobody:nogroup /exports/shared

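# Configure the export (allow the 10.0.0.0/24 subnet to access).
# root_squash (the default) maps remote root to an unprivileged user;
# avoid no_root_squash unless every client is fully trusted.
echo "/exports/shared 10.0.0.0/24(rw,sync,no_subtree_check,root_squash)" \
  | sudo tee -a /etc/exports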

# Apply the export configuration
sudo exportfs -ra

# Verify exports
sudo exportfs -v

# NFS Client: mount the shared filesystem
# Run this on any machine in the 10.0.0.0/24 subnet

# Install NFS client
sudo apt install -y nfs-common

# Create mount point and mount
sudo mkdir -p /mnt/nfs-share
sudo mount -t nfs4 10.0.0.50:/exports/shared /mnt/nfs-share

# Verify
df -h /mnt/nfs-share
ls -la /mnt/nfs-share

# Persist across reboots
echo "10.0.0.50:/exports/shared /mnt/nfs-share nfs4 defaults,_netdev 0 0" \
  | sudo tee -a /etc/fstab

SMB/CIFS (Server Message Block)

SMB is the native Windows file-sharing protocol (also called CIFS in older versions). It's the protocol used by Azure Files and is essential for Windows-heavy environments and hybrid cloud scenarios.

# Mount an Azure Files SMB share on a Linux server
# Prerequisites: storage account name and key from Azure portal

# Install CIFS utilities
sudo apt install -y cifs-utils

# Create credentials file (avoid putting passwords in fstab)
# The directory does not exist by default, so create it first
sudo mkdir -p /etc/smbcredentials
sudo bash -c 'cat > /etc/smbcredentials/azurefiles.cred <<EOF
username=mystorageaccount
password=YOUR_STORAGE_ACCOUNT_KEY_HERE
EOF'
sudo chmod 600 /etc/smbcredentials/azurefiles.cred

# Create mount point and mount
sudo mkdir -p /mnt/azure-share
sudo mount -t cifs \
  //mystorageaccount.file.core.windows.net/myshare \
  /mnt/azure-share \
  -o credentials=/etc/smbcredentials/azurefiles.cred,serverino,nosharesock,actimeo=30

# Verify
df -h /mnt/azure-share

S3 API — The De Facto Object Storage Standard

The S3 API has become the de facto standard for object storage. Originally proprietary to AWS, it's now implemented by GCP (through its interoperability API), MinIO, Ceph, and dozens of other storage systems, and Azure Blob Storage can be reached through third-party compatibility layers. If you learn one storage API, make it S3.

# Common S3 operations using the AWS CLI
# These commands also work with S3-compatible storage if you pass --endpoint-url for that service

# Create a bucket
aws s3 mb s3://my-data-lake-2026

# Upload a single file
aws s3 cp backup.tar.gz s3://my-data-lake-2026/backups/

# Upload an entire directory recursively
aws s3 sync ./local-data/ s3://my-data-lake-2026/datasets/ --exclude "*.tmp"

# List objects with human-readable sizes
aws s3 ls s3://my-data-lake-2026/backups/ --human-readable --summarize

# Download a file
aws s3 cp s3://my-data-lake-2026/backups/backup.tar.gz ./restored/

# Generate a pre-signed URL (temporary access, expires in 1 hour)
aws s3 presign s3://my-data-lake-2026/reports/quarterly.pdf --expires-in 3600

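# Delete objects older than 30 days.
# Assumes GNU date and keys without spaces; for routine cleanup, a lifecycle
# expiration rule is the safer choice.
CUTOFF=$(date -d '30 days ago' +%Y-%m-%d)
aws s3 ls s3://my-data-lake-2026/logs/ --recursive \
  | awk -v cutoff="$CUTOFF" '$1 < cutoff {print $4}' \
  | while read -r key; do
      aws s3 rm "s3://my-data-lake-2026/$key"
    done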

Storage Protocol Comparison

Protocol	Type	Transport	Use Case	Performance
iSCSI	Block	TCP/IP	SAN, databases, VMs	High (network-dependent)
NFS	File	TCP/IP	Linux shared filesystems	Moderate to high
SMB/CIFS	File	TCP/IP	Windows file shares, hybrid	Moderate
S3 API	Object	HTTPS	Data lakes, backups, media	High throughput, higher latency
FC (Fibre Channel)	Block	Dedicated fabric	Enterprise SAN, databases	Highest (dedicated network)

Cloud Storage Architecture

Durability vs Availability

These two terms are often confused but represent fundamentally different guarantees:

Durability: The probability that data will not be lost over a given period. AWS S3 offers 99.999999999% (11 nines) durability, meaning if you store 10 million objects, you can expect to lose a single object once every 10,000 years.
Availability: The percentage of time the storage service is operational and accessible. S3 Standard is designed for 99.99% availability, which corresponds to about 53 minutes of downtime per year.
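
Both figures above are simple arithmetic on the stated percentages. The sketch below reproduces them; the function names are illustrative, and the durability calculation assumes the percentage applies per object per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def expected_annual_losses(object_count: int, durability: float) -> float:
    """Expected number of objects lost per year at a given annual durability."""
    return object_count * (1 - durability)


def annual_downtime_minutes(availability: float) -> float:
    """Minutes per year a service may be unavailable at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability)


losses = expected_annual_losses(10_000_000, 0.99999999999)
print(f"Expected losses per year: {losses:.6f} objects")
print(f"Years per lost object: {1 / losses:,.0f}")
print(f"Downtime at 99.99%: {annual_downtime_minutes(0.9999):.1f} minutes per year")
```

The two numbers are independent: a service can be temporarily unreachable (an availability event) without any data being lost (a durability event).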

                            
11 Nines Explained: S3 achieves 99.999999999% durability by automatically replicating each object across a minimum of 3 physically separated Availability Zones within a region. Each AZ has independent power, cooling, and networking. For data to be lost, three separate facilities would need to simultaneously experience unrecoverable failures, which is an astronomically unlikely event.
                        

Replication Strategies

Storage Replication Strategies

                                flowchart TB
                                    subgraph SRR["Same-Region Replication"]
                                        A1[AZ-1 Copy] --- A2[AZ-2 Copy] --- A3[AZ-3 Copy]
                                    end
                                    subgraph CRR["Cross-Region Replication"]
                                        B1["US-East (Primary)"] -->|Async replicate| B2["EU-West (Replica)"]
                                    end
                                    subgraph MRR["Multi-Region"]
                                        C1["US-East"] --- C2["EU-West"] --- C3["AP-Southeast"]
                                    end

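Same-Region Replication (SRR): Data is kept in multiple AZs within one region, which protects against facility-level failures. S3 Standard provides this multi-AZ redundancy by default; the S3 feature named SRR goes a step further and copies objects to a second bucket in the same region. Lowest cost, lowest latency for regional access.
Cross-Region Replication (CRR): Asynchronous replication to a different AWS region. Commonly used for disaster recovery and for compliance rules that require geographically separated copies (check data-sovereignty constraints before replicating to another jurisdiction). Adds storage and data transfer costs.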
Multi-Region / Global: Active-active replication across multiple regions for globally distributed applications. Highest availability and lowest latency worldwide, but most expensive.

Encryption at Rest and in Transit

Modern cloud storage provides multiple encryption layers:

Encryption at rest: Data is encrypted on disk. Options include service-managed keys (SSE-S3), AWS KMS keys (SSE-KMS), or customer-provided keys (SSE-C). Azure offers similar options with platform-managed or customer-managed keys.
Encryption in transit: Data is encrypted while traveling over the network using TLS 1.2+. Enforced by bucket policies requiring aws:SecureTransport.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my-secure-bucket",
        "arn:aws:s3:::my-secure-bucket/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    },
    {
      "Sid": "DenyUnencryptedUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-secure-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms"
        }
      }
    }
  ]
}

Access Control

Cloud storage offers layered access control mechanisms:

IAM policies: Identity-based policies attached to users, groups, or roles. Best practice for controlling who can access which buckets.
Bucket/container policies: Resource-based policies attached to the storage resource itself. Used for cross-account access and condition-based restrictions.
ACLs (Access Control Lists): Legacy per-object permissions. AWS recommends disabling ACLs on new buckets in favor of IAM + bucket policies.
SAS tokens (Azure): Shared Access Signatures provide time-limited, permission-scoped access to Azure Blob Storage without sharing account keys.
Pre-signed URLs (AWS): Temporary URLs that grant time-limited access to private S3 objects without requiring AWS credentials.

Cost Optimization

Storage costs can spiral quickly at scale. Key optimization strategies:

Use lifecycle policies: Automatically transition data to cheaper tiers as it ages (covered in the Object Storage section).
Enable Intelligent-Tiering: For unpredictable access patterns, let the cloud provider automatically move objects between tiers based on actual usage.
Compress before storing: Use gzip/zstd compression for logs and text data before uploading. Ratios around 5:1 are common for text-heavy logs, though results vary with the data.
Delete what you don't need: Set expiration rules for temporary data (build artifacts, test outputs, old logs).
Right-size block volumes: EBS bills for provisioned capacity, not used capacity. Volumes cannot be shrunk in place, so monitor utilization and migrate data to a smaller volume when one is heavily over-provisioned.
Use S3 Storage Lens: AWS provides analytics dashboards showing storage usage patterns, helping identify optimization opportunities.
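
The "compress before storing" advice is easy to measure before committing to it. This sketch compresses synthetic log lines with gzip from the standard library; the log format is made up and unusually repetitive, so the ratio it prints is not a prediction for real data.

```python
import gzip

# Synthetic, highly repetitive log lines; real logs will compress differently.
log_lines = [
    f"2026-05-14T12:00:{i % 60:02d}Z INFO request_id={i} path=/api/items status=200\n"
    for i in range(10_000)
]
raw = "".join(log_lines).encode("utf-8")

# Compress in memory and compare sizes.
compressed = gzip.compress(raw)
ratio = len(raw) / len(compressed)

print(f"raw={len(raw):,} bytes  gzip={len(compressed):,} bytes  ratio={ratio:.1f}:1")
```

Running the same measurement on a sample of your own logs gives a realistic estimate of the storage and transfer savings.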

Data Lifecycle Management

Hot / Warm / Cold / Archive Tiers

Data lifecycle management is the practice of automatically moving data between storage tiers based on access frequency, age, and business value. The principle is simple: store data on the cheapest tier that meets its access requirements.

Data Lifecycle Tiers

flowchart LR
    subgraph HOT["Hot Tier"]
        H1["Active data<br/>Frequent access<br/>$$$"]
    end
    subgraph WARM["Warm Tier"]
        W1["Recent data<br/>Monthly access<br/>$$"]
    end
    subgraph COLD["Cold Tier"]
        C1["Aging data<br/>Quarterly access<br/>$"]
    end
    subgraph ARCHIVE["Archive Tier"]
        A1["Compliance data<br/>Yearly or never<br/>¢"]
    end
    HOT -->|"30 days"| WARM
    WARM -->|"90 days"| COLD
    COLD -->|"365 days"| ARCHIVE
    ARCHIVE -->|"Retention met"| DEL["Delete"]

Tier	Access Frequency	Retrieval Time	Typical Data	Storage Cost	Retrieval Cost
Hot	Multiple times daily	Milliseconds	Active databases, APIs, sessions	Highest	None/lowest
Warm	Weekly to monthly	Milliseconds	Recent logs, reports, backups	Medium	Low per-GB fee
Cold	Quarterly	Milliseconds to minutes	Compliance data, old backups	Low	Medium per-GB fee
Archive	Yearly or never	Hours (1–12h typical)	Legal holds, tax records, raw data	Lowest	Highest per-GB fee
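
The principle of choosing the cheapest tier that still meets access requirements can be written as a small decision function. The thresholds below are assumptions that loosely translate the table's wording (yearly, quarterly, weekly) into numbers; `pick_tier` is a hypothetical helper, not a provider API.

```python
def pick_tier(accesses_per_year: float, max_wait_hours: float) -> str:
    """Suggest the cheapest tier whose access pattern and retrieval time fit.

    accesses_per_year: how often the data is expected to be read.
    max_wait_hours: the longest retrieval delay the consumer can tolerate.
    """
    # Archive is cheapest, but only works if a multi-hour restore is acceptable.
    if accesses_per_year <= 1 and max_wait_hours >= 12:
        return "Archive"
    if accesses_per_year <= 4:   # roughly quarterly or less
        return "Cold"
    if accesses_per_year <= 52:  # roughly weekly to monthly
        return "Warm"
    return "Hot"


print(pick_tier(accesses_per_year=0, max_wait_hours=24))    # tax records
print(pick_tier(accesses_per_year=1, max_wait_hours=0.1))   # rare, but needed quickly
print(pick_tier(accesses_per_year=12, max_wait_hours=0))    # monthly reports
print(pick_tier(accesses_per_year=1000, max_wait_hours=0))  # active data
```

A fuller version would also weigh retrieval fees and minimum storage durations, which can outweigh the storage savings for data that is read more often than expected.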

Retention Policies & Compliance

Many industries have strict data retention requirements enforced by law:

WORM (Write Once Read Many): Data cannot be modified or deleted for a specified retention period. S3 Object Lock and Azure Immutable Blob Storage implement WORM compliance.
Legal hold: Data is preserved indefinitely regardless of retention policies, typically in response to litigation or regulatory investigation.
GDPR: European regulation requiring data deletion upon request — tension with retention requirements that must be carefully managed.
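HIPAA: Required documentation (policies, procedures, and related records) must be retained for 6 years; retention periods for the medical records themselves are typically set by state law.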
SOX: Financial records must be retained for 7 years.
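
The interaction between a retention period and a legal hold can be sketched as a simple delete guard. This is a toy model of the WORM idea, not the behavior of S3 Object Lock or any other product; `WormRecord` and `RetentionError` are names invented for the example.

```python
from datetime import datetime, timedelta, timezone


class RetentionError(Exception):
    """Raised when a delete is attempted on protected data."""


class WormRecord:
    """A record that cannot be deleted until retention ends and no hold is set."""

    def __init__(self, created: datetime, retention_days: int) -> None:
        self.retain_until = created + timedelta(days=retention_days)
        self.legal_hold = False

    def check_delete(self, now: datetime) -> None:
        """Raise RetentionError unless the record may be deleted at `now`."""
        if self.legal_hold:
            raise RetentionError("legal hold is active")
        if now < self.retain_until:
            raise RetentionError(f"retained until {self.retain_until:%Y-%m-%d}")


created = datetime(2026, 1, 1, tzinfo=timezone.utc)
record = WormRecord(created, retention_days=7 * 365)

try:
    record.check_delete(datetime(2027, 1, 1, tzinfo=timezone.utc))
except RetentionError as err:
    print(f"Delete refused: {err}")
```

The legal hold is checked first because it overrides the retention clock: a record past its retention date still cannot be deleted while a hold is in place.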

# Enable S3 Object Lock (WORM compliance) on a bucket
# Note: enabling Object Lock at bucket creation is the simplest path; it also enables versioning, which Object Lock requires

# Create a bucket with Object Lock enabled
aws s3api create-bucket \
  --bucket compliance-records-2026 \
  --region us-east-1 \
  --object-lock-enabled-for-bucket

# Set a default retention policy (7-year compliance mode)
aws s3api put-object-lock-configuration \
  --bucket compliance-records-2026 \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "COMPLIANCE",
        "Years": 7
      }
    }
  }'

# Upload an object (automatically gets the default retention)
aws s3 cp financial-report-2026.pdf \
  s3://compliance-records-2026/reports/

# Apply a legal hold to a specific object
aws s3api put-object-legal-hold \
  --bucket compliance-records-2026 \
  --key reports/financial-report-2026.pdf \
  --legal-hold '{"Status": "ON"}'

Backup Strategies: RPO and RTO

Two metrics define your backup and disaster recovery requirements:

RPO (Recovery Point Objective): The maximum acceptable amount of data loss measured in time. An RPO of 1 hour means you can afford to lose up to 1 hour of data — so you need backups at least every hour.
RTO (Recovery Time Objective): The maximum acceptable downtime. An RTO of 4 hours means the system must be back online within 4 hours of a failure.
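Both metrics can be checked with simple arithmetic. The sketch below assumes that worst-case data loss equals the interval between backups and that restore time is dominated by copying the data back. The function names and the sample figures (2000 GB, 250 MB/s) are hypothetical.

```shell
#!/bin/sh
# Succeeds when the backup interval is short enough for the RPO.
# Worst-case data loss equals the gap between two consecutive backups.
meets_rpo() {
  backup_interval_min=$1
  rpo_min=$2
  [ "$backup_interval_min" -le "$rpo_min" ]
}

# Estimated restore time in minutes for a given data size and throughput
# (decimal units: 1 GB = 1000 MB). Integer division rounds down.
restore_minutes() {
  size_gb=$1
  throughput_mb_s=$2
  echo $(( size_gb * 1000 / throughput_mb_s / 60 ))
}

if meets_rpo 60 60; then echo "hourly backups meet a 1-hour RPO"; fi
if ! meets_rpo 1440 60; then echo "daily backups miss a 1-hour RPO"; fi

minutes=$(restore_minutes 2000 250)
echo "restoring 2000 GB at 250 MB/s takes about $minutes minutes"
```

With these sample figures the restore takes roughly 133 minutes, which fits a 4-hour RTO only if detection, provisioning, and validation fit in the remaining time.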

                            
                            The Cost Relationship: Lower RPO and RTO = higher cost. Zero RPO (no data loss) requires synchronous replication to a standby. Zero RTO (instant failover) requires a hot standby running at all times. Most systems balance cost against acceptable risk.
                        

Strategy	RPO	RTO	Cost	Implementation
Backup & restore	Hours	Hours to days	$	Periodic snapshots to S3/Blob
Pilot light	Minutes	Hours	$$	Replicated data + minimal infra
Warm standby	Seconds to minutes	Minutes	$$$	Scaled-down replica running
Multi-site active-active	Zero (sync repl)	Zero (instant failover)	$$$$	Full infra in 2+ regions

Disaster Recovery with Storage

Storage is the foundation of any DR strategy. Key patterns:

Cross-region snapshots: EBS snapshots are automatically stored in S3 within the same region. Copy them cross-region for DR.
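S3 Cross-Region Replication: Asynchronous replication of new objects to a bucket in another region. It requires versioning on both buckets, and objects that existed before replication was enabled need a separate batch replication job.
Database-native replication: Most managed database services (RDS, Aurora, Cloud SQL) offer read replicas in other regions that can be promoted during failover.
Infrastructure as Code: With Terraform/CloudFormation definitions stored in Git, you can recreate entire infrastructure stacks in a new region quickly, often within minutes for small stacks, provided the data has already been replicated there.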

# Copy an EBS snapshot to another region for disaster recovery
# This creates a cross-region copy of a point-in-time snapshot

SOURCE_REGION="us-east-1"
DR_REGION="eu-west-1"

# Create a snapshot of the source volume (request goes to the source region)
SNAPSHOT_ID=$(aws ec2 create-snapshot \
  --region "$SOURCE_REGION" \
  --volume-id vol-0123456789abcdef0 \
  --description "DR backup $(date +%Y-%m-%d)" \
  --query 'SnapshotId' \
  --output text)

echo "Created snapshot: $SNAPSHOT_ID"

# Wait for the snapshot to complete
# (the waiter eventually times out; re-run it for large volumes)
aws ec2 wait snapshot-completed \
  --region "$SOURCE_REGION" \
  --snapshot-ids "$SNAPSHOT_ID"
echo "Snapshot completed"

# Copy the snapshot to the DR region
# The copy request is sent to the destination region (--region).
# --encrypted without a key ID uses that region's default KMS key for EBS.
DR_SNAPSHOT_ID=$(aws ec2 copy-snapshot \
  --region "$DR_REGION" \
  --source-region "$SOURCE_REGION" \
  --source-snapshot-id "$SNAPSHOT_ID" \
  --description "DR copy of $SNAPSHOT_ID" \
  --encrypted \
  --query 'SnapshotId' \
  --output text)

echo "DR snapshot copy initiated: $DR_SNAPSHOT_ID ($DR_REGION)"

Hands-On Exercises

Exercise 1 20 min

Benchmark Local Disk Performance with fio

fio (Flexible I/O Tester) is the industry-standard tool for benchmarking storage performance. This exercise teaches you to measure IOPS, throughput, and latency for any block device.

# Install fio on Ubuntu/Debian
sudo apt update && sudo apt install -y fio

# Create the test directory (fio does not create it for you)
# If /tmp is RAM-backed (tmpfs) on your system, use a directory on the
# disk you actually want to measure. The three tests need about 6 GB free.
mkdir -p /tmp/fio-test

# Test 1: Random Read IOPS (simulates database workload)
# 4K block size, 64 parallel jobs sharing one 1G file, 30 seconds
# --direct=1 bypasses the page cache so the device is measured, not RAM
fio --name=random-read-iops \
  --ioengine=libaio \
  --direct=1 \
  --rw=randread \
  --bs=4k \
  --numjobs=64 \
  --iodepth=64 \
  --size=1G \
  --runtime=30 \
  --time_based \
  --filename=/tmp/fio-test/randread.dat \
  --group_reporting

# Test 2: Sequential Write Throughput (simulates backup/log writing)
# 1M block size, single job, 30 seconds
fio --name=seq-write-throughput \
  --ioengine=libaio \
  --direct=1 \
  --rw=write \
  --bs=1M \
  --numjobs=1 \
  --iodepth=32 \
  --size=4G \
  --runtime=30 \
  --time_based \
  --filename=/tmp/fio-test/seqwrite.dat \
  --group_reporting

# Test 3: Mixed Random Read/Write (simulates real application I/O)
# 70% read / 30% write mix, 8K block size, 16 jobs sharing one 1G file
fio --name=mixed-rw \
  --ioengine=libaio \
  --direct=1 \
  --rw=randrw \
  --rwmixread=70 \
  --bs=8k \
  --numjobs=16 \
  --iodepth=32 \
  --size=1G \
  --runtime=30 \
  --time_based \
  --filename=/tmp/fio-test/mixed.dat \
  --group_reporting

# Clean up test files
rm -rf /tmp/fio-test

What to observe: Compare IOPS, bandwidth (BW), and latency (clat) across the three tests. Note how random 4K reads have high IOPS but low throughput, while sequential 1M writes have low IOPS but high throughput.
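The relationship between the two numbers is throughput = IOPS × block size. A minimal sketch of that arithmetic follows; the IOPS figures are hypothetical sample values rather than measurements, and the function name throughput_mib_s is made up for illustration.

```shell
#!/bin/sh
# Throughput in MiB/s for a given IOPS figure and block size in KiB.
# Integer division rounds down.
throughput_mib_s() {
  iops=$1
  block_kib=$2
  echo $(( iops * block_kib / 1024 ))
}

# Hypothetical results: many small I/Os versus few large ones
echo "random 4K reads at 50000 IOPS: $(throughput_mib_s 50000 4) MiB/s"
echo "sequential 1M writes at 500 IOPS: $(throughput_mib_s 500 1024) MiB/s"
```

In this example, a hundred times fewer operations per second still move more than twice the data, because each operation carries 256 times as many bytes.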


Exercise 2 25 min

Create and Manage S3 Buckets with Lifecycle Policies

Practice creating S3 buckets, uploading objects, and configuring lifecycle policies to automatically transition data between storage tiers.

# Prerequisites: AWS CLI configured with valid credentials
# Cost: the objects total a few kilobytes, so charges are negligible
# even if your account has no free-tier coverage

# Step 1: Create a bucket with versioning enabled
BUCKET_NAME="storage-lab-$(date +%s)"
aws s3 mb "s3://$BUCKET_NAME" --region us-east-1
aws s3api put-bucket-versioning \
  --bucket "$BUCKET_NAME" \
  --versioning-configuration Status=Enabled

# Step 2: Enable default encryption (SSE-S3)
aws s3api put-bucket-encryption \
  --bucket "$BUCKET_NAME" \
  --server-side-encryption-configuration '{
    "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
  }'

# Step 3: Block all public access (security best practice)
aws s3api put-public-access-block \
  --bucket "$BUCKET_NAME" \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

# Step 4: Upload sample files to different prefixes
echo "Application log entry $(date)" > /tmp/sample-log.txt
echo '{"user": "demo", "action": "test"}' > /tmp/sample-data.json

aws s3 cp /tmp/sample-log.txt "s3://$BUCKET_NAME/logs/app.log"
aws s3 cp /tmp/sample-data.json "s3://$BUCKET_NAME/data/events.json"

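# Step 5: Configure lifecycle policy
# Versioning is on, so "Expiration" only places a delete marker on the current
# version; "NoncurrentVersionExpiration" is what eventually removes old versions.
# Note: S3 does not transition objects smaller than 128 KB to STANDARD_IA by
# default, so the tiny sample files in this lab will stay in STANDARD.
aws s3api put-bucket-lifecycle-configuration \
  --bucket "$BUCKET_NAME" \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "LogsLifecycle",
        "Status": "Enabled",
        "Filter": {"Prefix": "logs/"},
        "Transitions": [
          {"Days": 30, "StorageClass": "STANDARD_IA"},
          {"Days": 90, "StorageClass": "GLACIER_IR"}
        ],
        "Expiration": {"Days": 365},
        "NoncurrentVersionExpiration": {"NoncurrentDays": 30}
      }
    ]
  }'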

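# Step 6: Verify everything
echo "Bucket: $BUCKET_NAME"
aws s3 ls "s3://$BUCKET_NAME/" --recursive --human-readable
aws s3api get-bucket-versioning --bucket "$BUCKET_NAME"
aws s3api get-bucket-encryption --bucket "$BUCKET_NAME"
aws s3api get-public-access-block --bucket "$BUCKET_NAME"
aws s3api get-bucket-lifecycle-configuration --bucket "$BUCKET_NAME"

# Cleanup (uncomment when done)
# "aws s3 rb --force" removes only current objects. On a versioned bucket the
# stored versions must be deleted first, otherwise removing the bucket fails.
# aws s3api delete-objects --bucket "$BUCKET_NAME" \
#   --delete "$(aws s3api list-object-versions --bucket "$BUCKET_NAME" \
#     --query '{Objects: Versions[].{Key: Key, VersionId: VersionId}}' --output json)"
# aws s3 rb "s3://$BUCKET_NAME"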

What to observe: Check that versioning, encryption, public access block, and lifecycle policies are all correctly configured. Together these form a sensible security and cost baseline for a production bucket.


Exercise 3 15 min

Storage Decision Matrix — Choose the Right Type

For each scenario below, determine the optimal storage type (block, object, or file), the cloud service, and the performance tier. Write your answers, then check against the solutions.

#	Scenario	Your Answer	Solution
1	PostgreSQL database for an e-commerce platform processing 5,000 transactions/second	Think first...	Block — AWS EBS io2 (or Azure Ultra Disk). Databases need low-latency random I/O with consistent IOPS.
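2	Storing 500TB of security camera footage with 90-day retention	Think first...	Object: S3 Standard-IA with a lifecycle expiration at 90 days. Write-once, rare-read pattern. Avoid moving the footage to a Glacier class partway through: those classes bill a 90-day minimum storage duration, so objects deleted sooner incur early-deletion charges.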
3	WordPress media library shared across 20 web servers	Think first...	File — AWS EFS or Azure Files (SMB). Multiple servers need concurrent read/write access to the same files.
4	Machine learning training data — 10TB of CSV files processed weekly	Think first...	Object — S3 Standard or Intelligent-Tiering. Large sequential reads, HTTP API integrates with ML frameworks.
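The first step of each answer, picking the storage type from the access pattern, can be sketched as a lookup. The pattern labels (random-io, whole-object, shared-fs) and the function name storage_type_for are hypothetical names chosen for this sketch; the mapping itself follows the three paradigms described earlier in the article.

```shell
#!/bin/sh
# Map an access pattern to a storage paradigm.
# Pattern labels are illustrative, not an established vocabulary.
storage_type_for() {
  case "$1" in
    random-io)    echo "block" ;;   # byte-level random read/write
    whole-object) echo "object" ;;  # whole objects over an HTTP API
    shared-fs)    echo "file" ;;    # shared hierarchical filesystem
    *)            echo "unknown" ;;
  esac
}

echo "PostgreSQL database -> $(storage_type_for random-io)"
echo "Camera footage      -> $(storage_type_for whole-object)"
echo "Shared media dir    -> $(storage_type_for shared-fs)"
```

Choosing the cloud service and performance tier still requires the workload details (IOPS, capacity, retention) given in each scenario.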


Conclusion & Next Steps

Storage is the bedrock of infrastructure. In this article, we covered the full spectrum:

Hardware foundations: HDD, SSD, and NVMe — understanding the physical layer that everything else builds upon
Three paradigms: Block storage for databases, object storage for scale, and file storage for shared access
RAID: Redundancy and performance at the disk level — and why RAID is not a backup
Cloud services: EBS, S3, EFS across AWS, Azure, and GCP with their performance tiers and pricing
Protocols: iSCSI, NFS, SMB, and the S3 API that has become the universal standard
Architecture: Durability (11 nines), replication, encryption, and access control
Lifecycle management: Hot/warm/cold/archive tiers, RPO/RTO, and disaster recovery strategies

The key takeaway: choose storage based on your access pattern, not your vendor preference. Block for databases. Object for everything at scale. File for shared access. And always — always — have backups.

Next in the Series

In Part 7: Cloud Computing Fundamentals, we shift from physical and managed infrastructure to the cloud computing paradigm itself — IaaS, PaaS, SaaS, the shared responsibility model, and how AWS, Azure, and GCP organize their services.
