Introduction — What Are Artifacts?
In Part 12, we covered how build systems transform source code into deployable software. But the build is only half the story. Once you have a compiled binary, a Docker image, or a packaged library — where does it go? How do you store it, version it, promote it through environments, and prove it hasn't been tampered with?
An artifact is any output of a build process that is intended for deployment or consumption by other systems. This includes Docker images, JAR files, npm packages, Go binaries, Helm charts, machine learning models, and documentation bundles.
Why Artifact Management Matters
Without proper artifact management, organisations face:
- Traceability gaps — "Which version is running in production?" becomes unanswerable
- Supply chain attacks — Unverified artifacts could contain malicious code
- Wasted compute — Rebuilding the same artifact for every environment
- Rollback failures — Can't revert to a known-good version if old artifacts are deleted
- Compliance violations — Auditors require proof of what was deployed and when
Types of Artifacts
| Artifact Type | Format | Registry/Repository | Example |
|---|---|---|---|
| Container Image | OCI image | Docker Hub, ECR, GCR, ACR | myapp:v2.1.0 |
| Java Package | .jar / .war | Maven Central, Nexus | mylib-1.3.0.jar |
| npm Package | .tgz | npmjs.com, GitHub Packages | @org/utils-2.0.0.tgz |
| Python Package | .whl / .tar.gz | PyPI, Artifactory | mylib-1.0.0-py3-none-any.whl |
| Go Binary | Static binary | GitHub Releases, GCS | server-linux-amd64 |
| Helm Chart | .tgz | ChartMuseum, OCI registries | myapp-chart-1.2.0.tgz |
The Immutability Principle
Once an artifact is published with a version tag, it must never be modified. If you need to fix a bug, you publish a new version — you never overwrite an existing one. This principle ensures that:
- Rollbacks are always possible (the previous version still exists)
- Audits are meaningful (version 2.1.0 always means the same thing)
- Caching works correctly (same tag = same content, always)
- Deployments are repeatable (you redeploy the stored artifact instead of rebuilding it, so you do not depend on the build being bit-for-bit reproducible)
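A minimal sketch of how a publish step can enforce this rule, using a local directory as a stand-in for a real repository. The `publish_artifact` function and the directory layout are hypothetical and chosen only for illustration; many registries offer a tag-immutability setting that enforces the same rule on the server side.

```bash
# Hypothetical helper: publish a file into a directory-backed "repository"
# and refuse to overwrite a version that already exists.
publish_artifact() {
  repo_dir=$1   # directory standing in for the artifact repository
  name=$2       # artifact name, e.g. "mylib"
  version=$3    # version being published, e.g. "2.1.0"
  src=$4        # path to the built file

  dest="$repo_dir/$name-$version"
  if [ -e "$dest" ]; then
    echo "refusing to overwrite $name $version: publish a new version instead" >&2
    return 1
  fi
  mkdir -p "$repo_dir" && cp "$src" "$dest"
}
```

Calling `publish_artifact ./repo mylib 2.1.0 build/mylib.jar` twice succeeds the first time and fails the second; a fix has to go out as `2.1.1`.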
Artifact Repositories
An artifact repository is a server that stores, indexes, and serves build artifacts. It acts as the single source of truth for all deployable software in your organisation.
Repository Comparison
| Repository | Type | Formats Supported | Best For |
|---|---|---|---|
| JFrog Artifactory | Universal | All (Docker, Maven, npm, PyPI, Go, Helm, etc.) | Enterprise, multi-format |
| Sonatype Nexus | Universal | Docker, Maven, npm, PyPI, NuGet, Go | Self-hosted, OSS option |
| GitHub Packages | Cloud | Docker, npm, Maven, NuGet, RubyGems | GitHub-native workflows |
| AWS ECR | Cloud | OCI images and OCI artifacts (e.g. Helm charts) | AWS-native deployments |
| Google Artifact Registry | Cloud | Docker, Maven, npm, Python, Go, Apt | GCP-native, multi-format |
| Azure Container Registry | Cloud | OCI images, Helm charts | Azure-native deployments |
Container Registries
Container registries are specialised artifact repositories for OCI (Docker) images. They handle layer deduplication, manifest management, and multi-architecture image support.
# Building and pushing a Docker image to a registry
# Build the image with proper tagging
docker build -t mycompany/api-server:2.1.0 .
# Tag for a specific registry
docker tag mycompany/api-server:2.1.0 \
ghcr.io/mycompany/api-server:2.1.0
# Authenticate to the registry
echo "$GITHUB_TOKEN" | docker login ghcr.io -u USERNAME --password-stdin
# Push to the registry
docker push ghcr.io/mycompany/api-server:2.1.0
# Pull from the registry (on another machine)
docker pull ghcr.io/mycompany/api-server:2.1.0
# Inspect image without pulling (check manifest)
docker manifest inspect ghcr.io/mycompany/api-server:2.1.0
# Multi-architecture builds (ARM64 + AMD64)
# Create a buildx builder
docker buildx create --name multiarch --use
# Build for multiple platforms simultaneously
docker buildx build \
--platform linux/amd64,linux/arm64 \
--tag ghcr.io/mycompany/api-server:2.1.0 \
--push \
.
# Verify the manifest includes both architectures
docker manifest inspect ghcr.io/mycompany/api-server:2.1.0
# Shows: linux/amd64 and linux/arm64 digests
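The digests printed by `docker manifest inspect` are content addresses: a SHA-256 hash of the bytes they refer to. The sketch below illustrates the idea on an ordinary file rather than a real image manifest. `content_digest` is a hypothetical helper, and it assumes `sha256sum` from GNU coreutils is installed.

```bash
# Hypothetical helper: print a registry-style digest ("sha256:<hex>") for a file.
# Assumes sha256sum (GNU coreutils) is on PATH.
content_digest() {
  # sha256sum prints "<hex>  <filename>"; keep only the hex part
  printf 'sha256:%s\n' "$(sha256sum "$1" | cut -d ' ' -f 1)"
}
```

Two files with identical bytes produce the same digest, which is what lets a registry store a shared layer once; changing a single byte yields a different digest.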
Image Tagging Strategies
Why "latest" is Dangerous
Do not use :latest in production deployments. The "latest" tag is mutable: it points to whatever image was most recently pushed under that name. This means: (1) you can't tell what version is running, (2) different pods might pull different versions, (3) rollbacks are unreliable because the image "latest" previously pointed to has already been replaced.
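One way to act on this is a pre-deploy check that rejects unpinned image references. The following is a rough sketch, not a complete image-reference parser: `is_pinned_ref` is a hypothetical name, and an explicit tag only counts as "pinned" here on the assumption that your registry treats tags as immutable.

```bash
# Hypothetical pre-deploy check: succeed only if the image reference is pinned
# to a digest or to an explicit tag other than "latest".
is_pinned_ref() {
  ref=$1
  case "$ref" in
    *@sha256:*) return 0 ;;   # pinned by digest
  esac
  last=${ref##*/}             # final path component, e.g. "api:2.1.0"
  case "$last" in
    *:latest) return 1 ;;     # explicit mutable tag
    *:*)      return 0 ;;     # explicit tag
    *)        return 1 ;;     # no tag: Docker defaults to "latest"
  esac
}
```

For example, `is_pinned_ref ghcr.io/mycompany/api:2.1.0` succeeds, while `is_pinned_ref ghcr.io/mycompany/api:latest` and `is_pinned_ref localhost:5000/api` both fail. Stripping everything up to the last `/` keeps a registry port from being mistaken for a tag.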
Best Practices for Image Tags
flowchart TD
A[Git Commit] --> B[CI Pipeline]
B --> C[Build Image]
C --> D[Tag: git SHA]
C --> E[Tag: SemVer]
C --> F[Tag: branch-buildnum]
D --> G[Push to Registry]
E --> G
F --> G
G --> H{Environment}
H --> I["Dev: branch-buildnum"]
H --> J["Staging: git SHA"]
H --> K["Production: SemVer"]
style K fill:#BF092F,color:#fff
style J fill:#3B9797,color:#fff
style I fill:#16476A,color:#fff
| Strategy | Example | Pros | Cons |
|---|---|---|---|
| Git SHA | myapp:a1b2c3d | Unique, traceable to exact commit | Not human-readable |
| SemVer | myapp:2.1.0 | Clear version communication | Requires manual bump |
| Build number | myapp:build-4521 | Auto-incrementing, unique | No semantic meaning |
| Combined | myapp:2.1.0-a1b2c3d | Best of both worlds | Longer tag names |
# Recommended: Tag with both SemVer AND git SHA
VERSION="2.1.0"
GIT_SHA=$(git rev-parse --short HEAD)
BUILD_DATE=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
docker build \
--label "org.opencontainers.image.version=${VERSION}" \
--label "org.opencontainers.image.revision=${GIT_SHA}" \
--label "org.opencontainers.image.created=${BUILD_DATE}" \
-t "mycompany/api:${VERSION}" \
-t "mycompany/api:${VERSION}-${GIT_SHA}" \
-t "mycompany/api:${GIT_SHA}" \
.
# All three tags point to the same image digest
# Use SemVer for human reference, SHA for exact traceability
echo "Tagged: ${VERSION}, ${VERSION}-${GIT_SHA}, ${GIT_SHA}"
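A CI job usually wants to validate its inputs before tagging, so that a typo does not end up as a permanent tag. The sketch below derives the same three tags from a version and a commit SHA. `release_tags` is a hypothetical helper, and its version check is deliberately simplified: it accepts only plain MAJOR.MINOR.PATCH and not SemVer pre-release or build suffixes.

```bash
# Hypothetical helper: print the three tags for a release, one per line,
# after checking that the inputs look like a version and a git SHA.
release_tags() {
  version=$1
  sha=$2
  if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
    echo "not a MAJOR.MINOR.PATCH version: $version" >&2
    return 1
  fi
  if ! printf '%s\n' "$sha" | grep -Eq '^[0-9a-f]{7,40}$'; then
    echo "not an abbreviated or full git SHA: $sha" >&2
    return 1
  fi
  printf '%s\n' "${version}" "${version}-${sha}" "${sha}"
}
```

Running `release_tags 2.1.0 a1b2c3d` prints `2.1.0`, `2.1.0-a1b2c3d` and `a1b2c3d` on separate lines, ready to be turned into `-t` arguments.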
SBOM — Software Bill of Materials
A Software Bill of Materials (SBOM) is a complete, machine-readable inventory of all components in a software artifact — every library, every dependency, every version. Think of it as a "nutrition label" for software.
Why SBOMs Are Becoming Mandatory
US Executive Order 14028 (May 2021) directed federal agencies to require SBOMs from their software suppliers, and the EU Cyber Resilience Act introduces comparable obligations for products with digital elements sold in the EU. For vendors in those markets, SBOMs are becoming a compliance requirement rather than an optional extra.
SBOM Formats
| Format | Organisation | Strengths | Use Case |
|---|---|---|---|
| SPDX | Linux Foundation | ISO standard, license focus | License compliance, legal |
| CycloneDX | OWASP | Security focus, VEX support | Vulnerability management |
# Generate SBOM using Syft (by Anchore)
# Install Syft
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s
# Generate SBOM for a Docker image (CycloneDX format)
syft ghcr.io/mycompany/api:2.1.0 -o cyclonedx-json > sbom.json
# Generate SBOM for a local directory (SPDX format)
syft dir:./my-project -o spdx-json > sbom-spdx.json
# Generate SBOM using Trivy (combines SBOM + vulnerability scan)
trivy image --format cyclonedx \
--output sbom.json \
ghcr.io/mycompany/api:2.1.0
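# Attach SBOM to image using Cosign
# Note: recent Cosign releases mark `attach sbom` as deprecated in favour of
# signed attestations (cosign attest --type cyclonedx --predicate sbom.json <image>)
cosign attach sbom --sbom sbom.json \
  ghcr.io/mycompany/api:2.1.0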
echo "SBOM generated and attached to image"
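To get a feel for what these files contain, the sketch below prints a heavily reduced CycloneDX-shaped document by hand. It is an illustration of the format's top-level shape only: `minimal_sbom` is a hypothetical helper, it assumes unscoped npm package names with no characters that need JSON escaping, and real generators such as Syft emit many more fields (hashes, licences, dependency relationships).

```bash
# Hypothetical helper: print a minimal CycloneDX-shaped JSON document for a
# list of npm dependencies given as "name@version" arguments.
minimal_sbom() {
  printf '{\n  "bomFormat": "CycloneDX",\n  "specVersion": "1.5",\n  "version": 1,\n  "components": [\n'
  sep=''
  for dep in "$@"; do
    name=${dep%@*}      # text before the last "@"
    version=${dep##*@}  # text after the last "@"
    printf '%s    {"type": "library", "name": "%s", "version": "%s", "purl": "pkg:npm/%s@%s"}' \
      "$sep" "$name" "$version" "$name" "$version"
    # every component after the first is preceded by a comma and a newline
    sep=',
'
  done
  printf '\n  ]\n}\n'
}
```

Running `minimal_sbom express@4.18.2 lodash@4.17.21` prints a document with two entries under `components`, each identified by a package URL (`purl`). That identifier is what vulnerability scanners match against advisory databases.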
Build Provenance & SLSA Framework
SLSA (Supply-chain Levels for Software Artifacts, pronounced "salsa") is a security framework that defines levels of supply chain integrity. Each level adds stronger guarantees about how an artifact was produced.
flowchart TD
A["SLSA Level 0<br/>No guarantees"] --> B["SLSA Level 1<br/>Build provenance exists"]
B --> C["SLSA Level 2<br/>Hosted build service"]
C --> D["SLSA Level 3<br/>Hardened build platform"]
A --- E["Anyone could have built this"]
B --- F["We know WHO built it and HOW"]
C --- G["Build ran on a trusted platform"]
D --- H["Tamper-proof, isolated builds"]
style A fill:#666,color:#fff
style B fill:#16476A,color:#fff
style C fill:#3B9797,color:#fff
style D fill:#BF092F,color:#fff
| SLSA Level | Requirements | What It Proves |
|---|---|---|
| Level 0 | None | Nothing (no provenance) |
| Level 1 | Provenance generated, any build system | Package has provenance showing how it was built |
| Level 2 | Hosted build, signed provenance | Provenance was generated by a trusted build service |
| Level 3 | Hardened platform, isolated build | Build was tamper-proof; no one could have modified it |
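Provenance only helps if consumers check it. A provenance statement identifies the artifact it describes by digest, so one basic verification step is to recompute the digest of the file you hold and compare the two. The sketch below shows only that step. `verify_subject` is a hypothetical helper that assumes `sha256sum` is available, and it omits the checks a real verifier also performs, such as validating the signature on the provenance and the identity of the builder.

```bash
# Hypothetical check: does the file on disk match the SHA-256 digest that a
# provenance statement recorded for it? Assumes sha256sum (GNU coreutils).
verify_subject() {
  artifact=$1          # path to the downloaded artifact
  expected_sha256=$2   # hex digest copied from the provenance subject
  actual_sha256=$(sha256sum "$artifact" | cut -d ' ' -f 1)
  if [ "$actual_sha256" = "$expected_sha256" ]; then
    echo "OK: $artifact matches provenance"
  else
    echo "MISMATCH: expected $expected_sha256, got $actual_sha256" >&2
    return 1
  fi
}
```

A non-zero exit status means the file is not the one the provenance describes, and a pipeline should stop there rather than deploy it.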
Signing Artifacts with Sigstore/Cosign
# Cosign: Keyless signing for container images (uses Sigstore)
# Install cosign
go install github.com/sigstore/cosign/v2/cmd/cosign@latest
# Sign an image (keyless: uses an OIDC identity from a provider such as GitHub or Google)
cosign sign ghcr.io/mycompany/api:2.1.0
# The signature is stored in the registry next to the image,
# and the signing event is recorded in the Rekor transparency log
# Verify a signed image
cosign verify ghcr.io/mycompany/api:2.1.0 \
--certificate-identity="https://github.com/mycompany/api/.github/workflows/build.yml@refs/tags/v2.1.0" \
--certificate-oidc-issuer="https://token.actions.githubusercontent.com"
# Attach provenance attestation (SLSA)
cosign attest --predicate provenance.json \
--type slsaprovenance \
ghcr.io/mycompany/api:2.1.0
echo "Image signed and attested with SLSA provenance"
# GitHub Actions: build, push, generate an SBOM, and sign on every version tag
name: Build and Attest
on:
  push:
    tags: ['v*']
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      id-token: write # Required for keyless signing
      attestations: write
    steps:
      - uses: actions/checkout@v4
      - name: Build Docker image
        run: |
          docker build -t ghcr.io/${{ github.repository }}:${{ github.ref_name }} .
      - name: Push to GHCR
        run: |
          echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
          docker push ghcr.io/${{ github.repository }}:${{ github.ref_name }}
      - name: Generate SBOM
        uses: anchore/sbom-action@v0
        with:
          image: ghcr.io/${{ github.repository }}:${{ github.ref_name }}
          format: cyclonedx-json
          output-file: sbom.json
      - name: Install Cosign
        uses: sigstore/cosign-installer@v3
      - name: Sign image and attach SBOM
        run: |
          # --yes skips the interactive confirmation prompt, which would block in CI
          cosign sign --yes ghcr.io/${{ github.repository }}:${{ github.ref_name }}
          cosign attach sbom --sbom sbom.json ghcr.io/${{ github.repository }}:${{ github.ref_name }}
SolarWinds — Why Build Provenance Matters
In December 2020, it was revealed that attackers, later attributed to a Russian state-sponsored group, had compromised the SolarWinds build system. They injected malicious code into the Orion software update, which was then signed with SolarWinds' legitimate certificate and distributed to roughly 18,000 organisations, including US government agencies.
How SLSA Level 3 controls could have helped:
- Isolated, hermetic builds would have made it much harder to inject code that wasn't in source control
- Build provenance attestation could have shown that the artifact didn't match the expected source
- Tamper-resistant build logs could have recorded the modification
- Verifiable provenance would have allowed consumers to validate the build chain
This attack was a major catalyst for SLSA's creation and for Executive Order 14028's requirements.
Artifact Promotion
Artifact promotion is the practice of moving a single immutable artifact through environments without rebuilding. The same Docker image that passed tests in staging is the same image that runs in production — byte for byte.
flowchart LR
A[Build] --> B[Dev Registry]
B --> C{Tests Pass?}
C -->|Yes| D[Staging Registry]
D --> E{QA Approval?}
E -->|Yes| F[Production Registry]
C -->|No| G[Rejected]
E -->|No| G
style A fill:#132440,color:#fff
style B fill:#16476A,color:#fff
style D fill:#3B9797,color:#fff
style F fill:#BF092F,color:#fff
style G fill:#666,color:#fff
# Artifact promotion: Copy image between registries (never rebuild!)
# Promote from dev to staging
crane copy \
dev-registry.company.com/api:2.1.0-a1b2c3d \
staging-registry.company.com/api:2.1.0-a1b2c3d
# Promote from staging to production (after QA approval)
crane copy \
staging-registry.company.com/api:2.1.0-a1b2c3d \
prod-registry.company.com/api:2.1.0
# Verify the digests match (proof of immutability)
crane digest dev-registry.company.com/api:2.1.0-a1b2c3d
crane digest prod-registry.company.com/api:2.1.0
# Both should output: sha256:abc123...
echo "Same artifact, promoted through environments without rebuild"
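The gates in the promotion diagram can be sketched as a small function that decides where an artifact may go next. `next_stage` and the stage names are hypothetical; in practice the gate result would come from your test runner or approval system.

```bash
# Hypothetical gate check: print the next environment for an artifact,
# or fail if the gate for its current stage has not passed.
next_stage() {
  current=$1   # "dev" or "staging"
  gate=$2      # "pass" or anything else
  if [ "$gate" != "pass" ]; then
    echo "rejected at $current: gate did not pass" >&2
    return 1
  fi
  case "$current" in
    dev)     echo staging ;;
    staging) echo production ;;
    *)       echo "unknown or final stage: $current" >&2; return 1 ;;
  esac
}
```

A pipeline would run the copy step only when this succeeds, for example `target=$(next_stage dev "$test_result") && crane copy ...`, so a failed gate can never result in a promoted image.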
Netflix's Artifact Promotion Model
Netflix builds each service artifact once, tags it with a unique identifier, and promotes it through their pipeline: Build → Test → Canary → Regional → Global. The same AMI (Amazon Machine Image) that passes integration tests is the same one deployed to production regions worldwide.
Their "Spinnaker" deployment platform (now open source) manages promotion gates: automated test results, canary analysis scores, and manual approvals. An artifact can only advance if all gates pass — and it's the same binary at every stage.
Cleanup & Retention Policies
Container registries and artifact repositories accumulate storage rapidly. Without cleanup policies, costs grow unbounded and registries become difficult to navigate.
Retention Policy Guidelines
| Artifact Category | Retention Period | Rationale |
|---|---|---|
| Production releases | Indefinite (or 2+ years) | Rollback capability, audit compliance |
| Staging artifacts | 90 days | Debugging recent issues |
| Dev/feature branch builds | 14-30 days | Temporary development builds |
| Untagged images | 7 days | Intermediate build layers |
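The table above translates directly into a rule a cleanup job can evaluate per artifact. The sketch below is one possible encoding. `is_expired` and the category names are hypothetical, and it uses the upper end (30 days) of the range given for development builds.

```bash
# Hypothetical helper mirroring the retention table: succeed (exit 0) when an
# artifact of the given category and age in days is past its retention period.
is_expired() {
  category=$1   # production | staging | dev | untagged
  age_days=$2
  case "$category" in
    production) return 1 ;;        # kept indefinitely
    staging)    limit=90 ;;
    dev)        limit=30 ;;        # upper end of the 14-30 day range
    untagged)   limit=7 ;;
    *) echo "unknown category: $category" >&2; return 2 ;;
  esac
  [ "$age_days" -gt "$limit" ]
}
```

For instance, `is_expired staging 91` succeeds and `is_expired production 10000` does not. Keeping the rule in one function makes it easy to dry-run against a registry listing before deleting anything.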
# ECR lifecycle policy (AWS): automatically delete old images
# Write the policy to a file, then apply it to a repository
cat > lifecycle-policy.json <<'EOF'
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep last 10 production images",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["v"],
        "countType": "imageCountMoreThan",
        "countNumber": 10
      },
      "action": { "type": "expire" }
    },
    {
      "rulePriority": 2,
      "description": "Delete untagged images after 7 days",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 7
      },
      "action": { "type": "expire" }
    },
    {
      "rulePriority": 3,
      "description": "Delete dev images after 30 days",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["dev-", "feature-"],
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 30
      },
      "action": { "type": "expire" }
    }
  ]
}
EOF
# Apply the policy ("api" is a placeholder repository name)
aws ecr put-lifecycle-policy \
  --repository-name api \
  --lifecycle-policy-text file://lifecycle-policy.json
echo "ECR lifecycle policy configured"
Exercises
Pick a public base image (for example, node:20-alpine). Generate an SBOM using Syft or Trivy. How many packages does it contain? Are there any known vulnerabilities? Would you be comfortable deploying this image to a system processing financial data? Why or why not?
Conclusion & Next Steps
Artifact management is the bridge between "code that builds" and "software that runs in production." The key principles are deceptively simple: build once, tag immutably, promote through environments, sign everything, and know exactly what's inside your artifacts (SBOMs). But implementing them properly requires discipline and tooling.
The supply chain security landscape is evolving rapidly. SLSA, Sigstore, and mandatory SBOMs are moving from "nice to have" to "required for business." Organisations that invest in build provenance now will have a significant compliance advantage as regulations tighten.
Next in the Series
In Part 14: Continuous Integration — Pipelines & Automation, we'll explore how to design CI pipelines that catch bugs in minutes — covering GitHub Actions, GitLab CI, Jenkins, pipeline-as-code, parallelisation, caching, and the practices that make CI fast and reliable.