Overview & History
Argo CD is a declarative, GitOps-based continuous delivery tool for Kubernetes. It monitors Git repositories containing application definitions and automatically synchronizes the desired state in Git with the live state in target clusters — making Git the single source of truth for infrastructure and application configuration.
Origins & CNCF Journey
Argo CD was originally developed at Intuit (the company behind TurboTax and QuickBooks) to manage thousands of microservices across multiple Kubernetes clusters. The timeline of its evolution:
- 2017 — Internal development begins at Intuit for managing Kubernetes deployments
- 2018 — Open-sourced under the Apache 2.0 license on GitHub
- 2020 — Accepted into the CNCF as an Incubating project (part of the Argo project family)
- 2022 — Graduated to CNCF top-level project status, joining Kubernetes, Prometheus, and Envoy
- 2026 — Over 17,000 GitHub stars, 500+ contributors, deployed at thousands of organizations
The Argo Ecosystem
Argo CD is part of a broader family of Kubernetes-native workflow tools:
- Argo CD — Continuous delivery and GitOps operator
- Argo Workflows — Kubernetes-native workflow engine for DAG-based job orchestration
- Argo Events — Event-driven automation (triggers from webhooks, queues, schedules)
- Argo Rollouts — Progressive delivery with canary and blue-green deployment strategies
Together, these tools form a complete GitOps platform — Argo CD handles deployment, Rollouts manages progressive delivery, Workflows orchestrates CI pipelines, and Events connects external triggers.
Architecture
Argo CD follows a microservices architecture deployed as a set of Kubernetes controllers and services. Understanding the internal components is essential for production tuning and troubleshooting.
flowchart TB
    subgraph External["External Systems"]
        Git["Git Repository"]
        IdP["Identity Provider (SSO)"]
        Webhook["Webhook Events"]
    end
    subgraph ArgoCD["Argo CD Components"]
        API["API Server<br/>(gRPC + REST)"]
        Repo["Repo Server<br/>(Manifest Generation)"]
        Controller["Application Controller<br/>(Reconciliation Loop)"]
        Redis["Redis<br/>(Cache)"]
        Dex["Dex<br/>(OIDC Connector)"]
    end
    subgraph Targets["Target Clusters"]
        K8s1["Production Cluster"]
        K8s2["Staging Cluster"]
        K8s3["Dev Cluster"]
    end
    Git --> Repo
    Webhook --> API
    IdP --> Dex
    Dex --> API
    API --> Redis
    API --> Repo
    Controller --> Repo
    Controller --> Redis
    Controller --> K8s1
    Controller --> K8s2
    Controller --> K8s3
Component Responsibilities
| Component | Role | Scaling |
|---|---|---|
| API Server | Exposes gRPC/REST API, serves UI, handles authentication, enforces RBAC | Horizontal (stateless) |
| Repo Server | Clones Git repos, renders Helm/Kustomize/Jsonnet manifests, caches results | Horizontal (CPU-bound) |
| Application Controller | Watches desired state vs live state, triggers syncs, reports health | Sharding (by cluster) |
| Redis | Caches Git repository state, manifest render results, app state | Sentinel or HA mode |
| Dex | OIDC connector for SSO integration (LDAP, SAML, GitHub, Azure AD) | Single replica (light) |
Installation Methods
Plain Manifests (Quick Start)
The fastest way to install Argo CD for evaluation or development:
# Create namespace and install Argo CD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Wait for pods to be ready
kubectl wait --for=condition=Ready pods --all -n argocd --timeout=300s
# Get initial admin password
argocd admin initial-password -n argocd
# Port-forward to access the UI
kubectl port-forward svc/argocd-server -n argocd 8080:443
Helm Chart (Production)
For production deployments with customizable values:
# Add the Argo Helm repository
helm repo add argo https://argoproj.github.io/argo-helm
helm repo update
# Install with HA configuration
helm install argocd argo/argo-cd \
--namespace argocd \
--create-namespace \
--version 7.3.0 \
--values values-production.yaml
Example production values file:
# values-production.yaml
global:
  image:
    tag: "v2.12.0"
controller:
  replicas: 2
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: "2"
      memory: 2Gi
  env:
    - name: ARGOCD_CONTROLLER_REPLICAS
      value: "2"
server:
  replicas: 3
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 10
  # Ingress key names differ between chart major versions; verify against the pinned chart
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - argocd.example.com
    tls:
      - secretName: argocd-tls
        hosts:
          - argocd.example.com
repoServer:
  replicas: 3
  autoscaling:
    enabled: true
    minReplicas: 3
    maxReplicas: 8
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      cpu: "2"
      memory: 1Gi
redis-ha:
  enabled: true
  haproxy:
    enabled: true
dex:
  enabled: true
Core Concepts
Argo CD introduces several abstractions for managing Kubernetes deployments declaratively:
flowchart LR
    Project["AppProject<br/>(Security Boundary)"]
    App["Application<br/>(Deployment Unit)"]
    Repo["Repository<br/>(Git Source)"]
    Cluster["Cluster<br/>(Target)"]
    Project -->|"scopes"| App
    App -->|"source"| Repo
    App -->|"destination"| Cluster
    Project -->|"allows repos"| Repo
    Project -->|"allows clusters"| Cluster
Application
The fundamental unit of deployment in Argo CD. An Application defines a source (Git repo + path + revision) and a destination (cluster + namespace). It tracks the relationship between desired state in Git and live state in the cluster.
AppProject
A logical grouping mechanism that provides security boundaries — restricting which repositories, clusters, namespaces, and resource kinds an Application can use. The default project allows everything; production environments should use restrictive projects.
Sync & Health Status
| Status | Meaning |
|---|---|
| Synced | Live state matches desired state in Git |
| OutOfSync | Live state differs from desired state (drift detected) |
| Healthy | All resources are running and passing health checks |
| Degraded | Some resources are failing health checks |
| Progressing | Resources are being updated (rollout in progress) |
| Missing | Resources exist in Git but not in the cluster |
Prune & Self-Heal
Prune removes resources from the cluster that no longer exist in Git. Self-Heal automatically reverts manual changes made directly to the cluster, re-applying the desired state from Git. Both are opt-in per Application.
Application Management
Declarative Application Definition
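The recommended approach is to define Applications as Kubernetes manifests stored in Git, so that the Application objects themselves are reviewed and versioned like any other resource. This is also the basis of the "app of apps" pattern described below: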
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: guestbook
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/argoproj/argocd-example-apps.git
    targetRevision: HEAD
    path: guestbook
  destination:
    server: https://kubernetes.default.svc
    namespace: guestbook
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - PruneLast=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
The App of Apps Pattern
Create a "root" Application that points to a Git directory containing other Application manifests. This allows you to bootstrap an entire cluster's workloads from a single Application — each child Application manages one service or component independently.
Structure: apps/ directory contains Application YAMLs → root app watches apps/ → creates child apps → each child syncs its own service. Changes to any Application definition trigger reconciliation automatically.
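A root Application for this layout might look like the sketch below. The repository URL, the apps path, and the name root are placeholders chosen for illustration, not values from a real setup.

```yaml
# Root "app of apps" Application (illustrative; repoURL, path, and name are placeholders)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/gitops-config.git  # hypothetical repository
    targetRevision: HEAD
    path: apps          # directory holding the child Application manifests
  destination:
    # Child Applications are themselves resources created in the argocd namespace
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true       # removing a child manifest from Git deletes that Application
      selfHeal: true
```

Whether deleting a child Application also removes its workloads depends on whether that child carries the resources-finalizer.argocd.argoproj.io finalizer; without it, the deletion does not cascade to the deployed resources.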
Helm Application Source
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prometheus-stack
  namespace: argocd
spec:
  project: monitoring
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: kube-prometheus-stack
    targetRevision: 58.2.0
    helm:
      releaseName: prometheus
      valuesObject:
        grafana:
          enabled: true
          ingress:
            enabled: true
            hosts:
              - grafana.example.com
        alertmanager:
          enabled: true
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
Sync Policies & Strategies
Auto-Sync Configuration
Auto-sync eliminates manual intervention by automatically applying changes when drift is detected:
- automated.prune: true — Delete resources removed from Git
- automated.selfHeal: true — Revert manual cluster changes
- automated.allowEmpty: false — Prevent syncing if Git source returns empty manifests (safety guard)
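These three fields sit under spec.syncPolicy.automated in an Application. The fragment below only shows where they go, using the values from the list above; it is not a complete manifest.

```yaml
# Fragment of an Application spec (source and destination omitted)
spec:
  syncPolicy:
    automated:
      prune: true        # delete resources that were removed from Git
      selfHeal: true     # revert manual changes made in the cluster
      allowEmpty: false  # refuse a sync that would leave the app with no resources
```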
Sync Waves & Hooks
Sync waves control the order of resource creation. Resources with lower wave numbers are applied first:
# Namespace created first (wave -1)
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  annotations:
    argocd.argoproj.io/sync-wave: "-1"
---
# ConfigMap created second (wave 0, default)
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  annotations:
    argocd.argoproj.io/sync-wave: "0"
---
# Deployment created last (wave 1)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    argocd.argoproj.io/sync-wave: "1"
Resource Hooks
Hooks execute Jobs or other resources at specific points in the sync lifecycle:
| Hook | When It Runs | Use Case |
|---|---|---|
| PreSync | Before sync starts | Database migrations, backups |
| Sync | During sync (with waves) | Main deployment resources |
| PostSync | After all syncs succeed | Smoke tests, notifications |
| SyncFail | When sync fails | Cleanup, alerts, rollback triggers |
Use PreSync hooks for database schema migrations. Running the migration before the sync ensures the database is ready before new application code deploys, which prevents crashes from schema mismatches during rollouts.
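A hook is an ordinary resource marked with the argocd.argoproj.io/hook annotation. The sketch below shows a migration Job run as a PreSync hook; the image and command are hypothetical stand-ins for whatever migration tooling the application uses.

```yaml
# PreSync hook sketch: runs to completion before the main sync is applied
apiVersion: batch/v1
kind: Job
metadata:
  generateName: db-migrate-   # a fresh Job name is generated on every sync
  annotations:
    argocd.argoproj.io/hook: PreSync
    # Remove the Job once it has succeeded so completed hooks do not accumulate
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.example.com/my-app-migrations:1.0.0  # hypothetical image
          command: ["/app/migrate", "up"]                      # hypothetical command
```

If the Job fails, the sync stops before any application resources are updated, which is the point of running the migration as PreSync.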
ApplicationSets
ApplicationSets automate the creation of multiple Applications from templates — essential for multi-cluster, multi-tenant, and monorepo deployments. They replace manual duplication with dynamic generation.
Git Generator
Creates an Application for each directory matching a pattern in a Git repository:
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cluster-addons
  namespace: argocd
spec:
  goTemplate: true
  goTemplateOptions: ["missingkey=error"]
  generators:
    - git:
        repoURL: https://github.com/org/infra-config.git
        revision: HEAD
        directories:
          - path: "addons/*"
  template:
    metadata:
      name: "addon-{{.path.basename}}"
    spec:
      project: platform
      source:
        repoURL: https://github.com/org/infra-config.git
        targetRevision: HEAD
        path: "{{.path.path}}"
      destination:
        server: https://kubernetes.default.svc
        namespace: "{{.path.basename}}"
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
Cluster Generator
Deploys to every registered cluster (or a filtered subset):
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: monitoring-stack
  namespace: argocd
spec:
  goTemplate: true
  generators:
    - clusters:
        selector:
          matchLabels:
            env: production
  template:
    metadata:
      name: "monitoring-{{.name}}"
    spec:
      project: platform
      source:
        repoURL: https://github.com/org/monitoring.git
        targetRevision: HEAD
        path: overlays/production
      destination:
        server: "{{.server}}"
        namespace: monitoring
Matrix Generator
Combines multiple generators via Cartesian product — for example, deploy every service to every cluster:
{
  "generators": [
    {
      "matrix": {
        "generators": [
          {
            "git": {
              "repoURL": "https://github.com/org/apps.git",
              "directories": [{"path": "services/*"}]
            }
          },
          {
            "clusters": {
              "selector": {
                "matchLabels": {"tier": "production"}
              }
            }
          }
        ]
      }
    }
  ]
}
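The fragment above shows only the generators section. A complete ApplicationSet built around it could look like the following sketch, written in YAML to match the other examples. The resource name, the platform project, and the revision are assumptions added for illustration; the template combines parameters from both child generators.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: services-by-cluster   # hypothetical name
  namespace: argocd
spec:
  goTemplate: true
  goTemplateOptions: ["missingkey=error"]
  generators:
    - matrix:
        generators:
          # First generator: one parameter set per directory under services/
          - git:
              repoURL: https://github.com/org/apps.git
              revision: HEAD
              directories:
                - path: "services/*"
          # Second generator: one parameter set per matching cluster
          - clusters:
              selector:
                matchLabels:
                  tier: production
  template:
    metadata:
      # One Application per (service, cluster) pair
      name: "{{.path.basename}}-{{.name}}"
    spec:
      project: platform       # assumed project name
      source:
        repoURL: https://github.com/org/apps.git
        targetRevision: HEAD
        path: "{{.path.path}}"
      destination:
        server: "{{.server}}"
        namespace: "{{.path.basename}}"
```

With three service directories and two production clusters, this would generate six Applications.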
Preview Environments with PR Generator
The Pull Request generator creates ephemeral Applications for each open PR — enabling preview environments that are automatically provisioned when a PR is opened and destroyed when it's closed or merged.
This provides developers with isolated environments for testing changes before they reach the main branch, dramatically reducing the feedback loop. Combined with GitHub Actions or similar CI, the preview URL can be posted as a PR comment.
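As a rough sketch of such a setup for GitHub, the ApplicationSet below creates one Application per open pull request carrying a preview label. The organization, repository, Secret name, label, and manifest path are all placeholders, and the polling interval is an arbitrary choice.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: preview-envs          # hypothetical name
  namespace: argocd
spec:
  goTemplate: true
  generators:
    - pullRequest:
        github:
          owner: org          # placeholder organization
          repo: my-app        # placeholder repository
          tokenRef:
            secretName: github-token   # assumed Secret holding an API token
            key: token
          labels:
            - preview         # only PRs with this label get an environment
        requeueAfterSeconds: 300       # poll for PR changes every five minutes
  template:
    metadata:
      name: "preview-{{.number}}"
    spec:
      project: default
      source:
        repoURL: https://github.com/org/my-app.git
        targetRevision: "{{.head_sha}}"   # deploy the PR's latest commit
        path: deploy/preview              # placeholder manifest path
      destination:
        server: https://kubernetes.default.svc
        namespace: "preview-{{.number}}"
      syncPolicy:
        automated:
          prune: true
        syncOptions:
          - CreateNamespace=true
```

When the pull request is closed or merged, the generator stops emitting its parameters and the corresponding Application is removed.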
Multi-Cluster Management
Argo CD excels at managing applications across multiple Kubernetes clusters from a single control plane. Clusters are registered as Kubernetes Secrets in the argocd namespace.
Adding Clusters via CLI
# Add a cluster using the current kubeconfig context
argocd cluster add production-us-east-1 \
--name production-us-east \
--label env=production \
--label region=us-east-1
# List registered clusters
argocd cluster list
# Rotate cluster credentials (pass the registered cluster name or server URL)
argocd cluster rotate-auth production-us-east
Cluster Secret Format
apiVersion: v1
kind: Secret
metadata:
  name: production-cluster
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
    env: production
    region: us-east-1
type: Opaque
stringData:
  name: production-us-east
  server: https://k8s-api.prod.example.com:6443
  config: |
    {
      "bearerToken": "",
      "tlsClientConfig": {
        "insecure": false,
        "caData": ""
      }
    }
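Once a cluster is registered, an Application can target it either by API server URL or by the registered name. A minimal sketch using the name from the Secret above; the Application name, repository, path, and namespace are placeholders.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-api          # hypothetical application
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/payments-api.git  # placeholder repository
    targetRevision: HEAD
    path: deploy              # placeholder manifest path
  destination:
    # Matches the `name` field of the cluster Secret; use either `name` or
    # `server` here, not both
    name: production-us-east
    namespace: payments
```

Referring to clusters by name keeps Application manifests stable if the API server URL changes later.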
RBAC & Security
AppProject Scoping
AppProjects are the primary security boundary — restricting what Applications within the project can do:
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-payments
  namespace: argocd
spec:
  description: "Payment team applications"
  sourceRepos:
    - "https://github.com/org/payments-*"
  destinations:
    - namespace: "payments-*"
      server: "https://kubernetes.default.svc"
    - namespace: "payments-*"
      server: "https://k8s-api.prod.example.com:6443"
  clusterResourceWhitelist:
    - group: ""
      kind: Namespace
  namespaceResourceBlacklist:
    - group: ""
      kind: ResourceQuota
    - group: ""
      kind: LimitRange
  roles:
    - name: developer
      description: "View and sync access for developers"
      policies:
        - p, proj:team-payments:developer, applications, get, team-payments/*, allow
        - p, proj:team-payments:developer, applications, sync, team-payments/*, allow
      groups:
        - payments-developers
    - name: admin
      description: "Full access for team leads"
      policies:
        - p, proj:team-payments:admin, applications, *, team-payments/*, allow
      groups:
        - payments-leads
RBAC Policy (ConfigMap)
Global RBAC policies are defined in the argocd-rbac-cm ConfigMap using Casbin policy format:
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.default: role:readonly
  policy.csv: |
    # Roles
    p, role:org-admin, applications, *, */*, allow
    p, role:org-admin, clusters, *, *, allow
    p, role:org-admin, repositories, *, *, allow
    p, role:org-admin, projects, *, *, allow
    p, role:developer, applications, get, */*, allow
    p, role:developer, applications, sync, */*, allow
    p, role:developer, logs, get, */*, allow
    # Group bindings (SSO groups → roles)
    g, platform-team, role:org-admin
    g, developers, role:developer
  scopes: "[groups, email]"
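Audit logging: --audit-log-path is a flag of the Kubernetes API server (kube-apiserver), so it is set on each target cluster's control plane rather than on argocd-server. With it enabled, the API calls Argo CD makes during syncs are recorded alongside all other cluster requests, which is essential for compliance in regulated industries (SOC2, HIPAA, PCI-DSS).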
Production Hardening
High-Availability Deployment
For production environments managing hundreds of applications:
| Component | Replicas | CPU Request | Memory Request | Notes |
|---|---|---|---|---|
| API Server | 3+ | 250m | 256Mi | Behind load balancer |
| Repo Server | 3+ | 500m | 512Mi | CPU-intensive (rendering) |
| Controller | 2 | 1000m | 1Gi | Sharded by cluster |
| Redis | 3 (Sentinel) | 200m | 256Mi | HA with Sentinel |
Controller Sharding
For large-scale deployments (500+ applications or 10+ clusters), shard the Application Controller across multiple replicas:
# Set on the application-controller StatefulSet; the value must match its replica count
env:
  - name: ARGOCD_CONTROLLER_REPLICAS
    value: "3"
# Each replica (shard) handles a subset of clusters
# Clusters are assigned to shards automatically by the configured sharding algorithm
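The environment variable only has the intended effect when the controller actually runs that many replicas. A minimal patch sketch, assuming the resource and container names from the upstream install manifests (argocd-application-controller for both) and intended for use as a Kustomize or strategic-merge patch:

```yaml
# Patch sketch: keep the replica count and ARGOCD_CONTROLLER_REPLICAS in step
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
  namespace: argocd
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: argocd-application-controller
          env:
            - name: ARGOCD_CONTROLLER_REPLICAS
              value: "3"   # must equal spec.replicas above
```

When installing through the Helm chart, the replica count is normally set through chart values instead of patching the StatefulSet directly.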
Backup & Disaster Recovery
# Export all Argo CD resources for backup
argocd admin export -n argocd > argocd-backup.yaml
# Import (restore) from backup; the "-" argument tells import to read from stdin
argocd admin import -n argocd - < argocd-backup.yaml
# Backup only Application definitions
kubectl get applications -n argocd -o yaml > apps-backup.yaml
# Backup AppProjects
kubectl get appprojects -n argocd -o yaml > projects-backup.yaml
CLI Reference
Essential argocd CLI commands for daily operations:
| Command | Description |
|---|---|
| argocd login <server> | Authenticate with the Argo CD API server |
| argocd app create <name> | Create a new Application (imperative) |
| argocd app get <name> | Show Application details, sync/health status |
| argocd app sync <name> | Trigger a sync (apply Git state to cluster) |
| argocd app diff <name> | Show diff between Git and live state |
| argocd app delete <name> | Delete the Application (and optionally its resources) |
| argocd app history <name> | Show sync history (revisions deployed) |
| argocd app rollback <name> <id> | Rollback to a previous sync revision |
| argocd app wait <name> | Wait until app reaches synced + healthy state |
| argocd cluster add <context> | Register a new target cluster |
| argocd repo add <url> | Register a Git repository with credentials |
| argocd proj create <name> | Create a new AppProject |
| argocd admin export | Export all Argo CD data for backup |
Troubleshooting
Common operational issues and their resolutions:
Sync Failures
| Symptom | Cause | Resolution |
|---|---|---|
| ComparisonError | Repo Server cannot render manifests | Check repo-server logs, validate Helm/Kustomize locally |
| InvalidSpecError | Malformed Application spec | Validate YAML syntax, check source path exists |
| PermissionDenied | RBAC or AppProject restriction | Verify project allows the destination namespace/cluster |
| ResourceQuotaExceeded | Namespace quota hit during sync | Increase quota or reduce resource requests |
| ImagePullBackOff | Container image not accessible | Verify image exists, check registry credentials |
Controller OOM Issues
# Check controller memory usage
kubectl top pod -n argocd -l app.kubernetes.io/component=application-controller
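# Increase memory limits (the application controller runs as a StatefulSet)
kubectl set resources statefulset argocd-application-controller \
  -n argocd --limits=memory=4Gi
# Enable controller sharding for large deployments:
# scale the StatefulSet and set the matching replica count
kubectl scale statefulset argocd-application-controller -n argocd --replicas=3
kubectl set env statefulset/argocd-application-controller \
  -n argocd ARGOCD_CONTROLLER_REPLICAS=3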
Debugging Sync Operations
# View detailed sync status
argocd app get my-app --show-operation
# Check resource-level sync status
argocd app resources my-app
# View controller logs for a specific app
kubectl logs -n argocd -l app.kubernetes.io/component=application-controller \
--tail=100 | grep "my-app"
# Force a hard refresh (invalidate cache)
argocd app get my-app --hard-refresh
If a sync hangs, use argocd app terminate-op <name> to cancel the stuck operation.