Nomad: Simple Workload Orchestration Guide
In this tutorial, you'll learn about Nomad. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.
HashiCorp Nomad is a simple, flexible orchestrator that deploys and manages containerized and non-containerized applications across a fleet of machines, enabling bin packing, rolling updates, and multi-region scheduling.
What You'll Learn
Why It Matters
Kubernetes is powerful but complex. For many workloads (batch jobs, Java applications, legacy services, or simple web apps), Nomad provides a simpler alternative with a single binary, a straightforward job specification, and deep integration with Consul and Vault. DodaTech runs 40% of its stateless workloads on Nomad alongside Kubernetes, simplifying operations for services that do not need Kubernetes' container orchestration features.
Real-World Use
DodaZIP's image processing pipeline runs on Nomad. Each processing task is a batch job that downloads source images, processes them with FFmpeg and ImageMagick, and uploads results to S3. Nomad schedules these jobs across the cluster based on resource availability, retries failures, and provides a simple CLI for monitoring.
flowchart TD
A[Nomad Job Spec] --> B[Nomad Server Cluster]
B --> C[Nomad Client 1]
B --> D[Nomad Client 2]
B --> E[Nomad Client N]
C --> F[Allocation: Web App]
C --> G[Allocation: Batch Job]
D --> H[Allocation: API Service]
E --> I[Allocation: Worker]
B --> J[Consul Integration]
B --> K[Vault Integration]
J --> L[Service Discovery]
K --> M[Secret Injection]
style B fill:#00CA6B,color:#fff
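To make the "schedules these jobs across the cluster based on resource availability" idea concrete, here is a minimal bin-packing sketch in Python. It is a simplification for illustration, not Nomad's actual scheduler: the `Node` class, the `place` function, and the node capacities are hypothetical, and the scoring (prefer the feasible node left with the least free capacity) only approximates packing work tightly.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_free: int  # MHz still unallocated on this node
    mem_free: int  # MB still unallocated on this node


def place(nodes, cpu, mem):
    """Pick the feasible node that ends up fullest (best-fit bin packing).

    Returns the chosen node's name, or None if no node has room.
    """
    feasible = [n for n in nodes if n.cpu_free >= cpu and n.mem_free >= mem]
    if not feasible:
        return None
    # Best fit: minimise leftover capacity after placement.
    # Adding MHz and MB together is crude, but enough for a sketch.
    best = min(feasible, key=lambda n: (n.cpu_free - cpu) + (n.mem_free - mem))
    best.cpu_free -= cpu
    best.mem_free -= mem
    return best.name


if __name__ == "__main__":
    cluster = [Node("node1", 4000, 8192), Node("node2", 1000, 1024)]
    # Six placement attempts, each asking for 1000 MHz and 512 MB.
    print([place(cluster, 1000, 512) for _ in range(6)])
```

The small node fills first, then the large one, and the last request returns None once no node has 1000 MHz free.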
Prerequisites: Basic Linux administration. Understanding of Docker or containerization.
Installation
# Install Nomad
wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor | sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install nomad
# Start Nomad in dev mode
nomad agent -dev -bind 0.0.0.0 -log-level INFO
# Expected output:
# ==> Starting Nomad agent...
# ==> Nomad agent configuration:
# Version: 1.8.0
# Client: true
# Server: true
# ==> Nomad agent started! Log data will stream below.
# ==> Nomad HTTP server started: http://0.0.0.0:4646
# ==> Nomad started in dev mode. No persistent data.
# Verify
nomad server members
# Expected output:
# Name Address Port Status Leader Protocol Build Datacenter
# nomad-server.global 127.0.0.1 4648 alive true 2 1.8.0 dc1
nomad node status
# Expected output:
# ID DC Name Class Drain Eligibility Status
# abc123 dc1 nomad-server <none> false eligible ready
Server Configuration
# /etc/nomad.d/server.hcl
datacenter = "dc1"
data_dir = "/opt/nomad"
server {
enabled = true
bootstrap_expect = 3
server_join {
retry_join = ["nomad-01.dodatech.com:4648", "nomad-02.dodatech.com:4648", "nomad-03.dodatech.com:4648"]
}
encrypt = "aPu1gZxV7VJqVfL5kH4wYQ=="
}
client {
enabled = true
options {
docker.privileged.enabled = "false"
driver.raw_exec.enable = "1"
}
host_volume "data" {
path = "/opt/nomad/data"
read_only = false
}
}
consul {
address = "consul.dodatech.com:8500"
token = "consul-token-here"
auto_advertise = true
server_service_name = "nomad-server"
client_service_name = "nomad-client"
}
vault {
enabled = true
address = "https://vault.dodatech.com:8200"
token = "vault-token-here"
create_from_role = "nomad-cluster"
}
telemetry {
prometheus_metrics = true
publish_allocation_metrics = true
publish_node_metrics = true
}
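The bootstrap_expect = 3 setting above asks for a three-server cluster. Nomad servers replicate state with Raft, which needs a majority of servers to agree, so the arithmetic behind choosing three (or five) servers can be sketched as follows. The function names are illustrative and not part of any Nomad API.

```python
def quorum(servers: int) -> int:
    """Smallest majority of a Raft cluster with the given number of servers."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can be lost while a majority is still reachable."""
    return servers - quorum(servers)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"servers={n} quorum={quorum(n)} tolerates={failure_tolerance(n)}")
```

Three servers tolerate one failure, and a fourth server adds no extra tolerance, which is why odd cluster sizes are the usual choice.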
Job Specification
# dodazip-web.nomad
job "dodazip-web" {
datacenters = ["dc1"]
type = "service"
group "web" {
count = 3
network {
# The connect sidecar_service block below requires bridge networking
mode = "bridge"
port "http" {
to = 8080
}
}
service {
name = "dodazip-web"
port = "http"
tags = ["api", "production"]
check {
type = "http"
path = "/health"
interval = "10s"
timeout = "2s"
}
connect {
sidecar_service {}
}
}
task "dodazip" {
driver = "docker"
config {
image = "registry.dodatech.com/dodazip:2.5.0"
ports = ["http"]
# Health checking is handled by the service check block above; the docker driver config does not take a health_checks block with type/path fields.
}
env {
NODE_ENV = "production"
LOG_LEVEL = "info"
DB_HOST = "postgres.service.consul"
}
resources {
cpu = 500
memory = 256
}
template {
data = <<EOH
DB_USERNAME="{{ with secret "database/creds/dodazip-role" }}{{ .Data.username }}{{ end }}"
DB_PASSWORD="{{ with secret "database/creds/dodazip-role" }}{{ .Data.password }}{{ end }}"
EOH
destination = "secrets/db.env"
env = true
}
vault {
policies = ["dodazip-readonly"]
}
}
}
update {
max_parallel = 1
canary = 1
auto_promote = true
health_check = "checks"
progress_deadline = "10m"
}
}
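As a rough mental model of what max_parallel in the update block controls, the sketch below computes how many allocations are replaced in each step of a rolling update. It deliberately ignores canaries, health checks, and deadlines, and `rolling_batches` is a hypothetical helper, not something Nomad exposes.

```python
def rolling_batches(count, max_parallel):
    """Sizes of successive replacement batches for `count` allocations."""
    if count < 0 or max_parallel < 1:
        raise ValueError("count must be >= 0 and max_parallel >= 1")
    full, rest = divmod(count, max_parallel)
    return [max_parallel] * full + ([rest] if rest else [])


if __name__ == "__main__":
    # The job above: count = 3, max_parallel = 1
    print(rolling_batches(3, 1))
    # A hypothetical larger group updated two at a time
    print(rolling_batches(5, 2))
```

With count = 3 and max_parallel = 1, at most one allocation is being replaced at any moment, so two instances keep serving traffic throughout the update.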
Batch Job
# image-processor.nomad
job "image-processor" {
datacenters = ["dc1"]
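type = "batch"
# A parameterized block is required for nomad job dispatch and dispatch_payload
parameterized {
payload = "optional"
meta_optional = ["key"]
}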
group "process" {
count = 5
task "convert" {
driver = "docker"
config {
image = "registry.dodatech.com/image-processor:1.0"
args = ["--input", "${NOMAD_ALLOC_DIR}/input", "--output", "${NOMAD_ALLOC_DIR}/output"]
}
resources {
cpu = 1000
memory = 512
}
dispatch_payload {
file = "payload.json"
}
}
restart {
attempts = 3
interval = "30m"
delay = "15s"
mode = "delay"
}
reschedule {
attempts = 5
interval = "1h"
delay = "30s"
delay_function = "exponential"
}
}
}
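The reschedule block above uses delay = "30s" with delay_function = "exponential". A minimal sketch of the resulting wait times follows, assuming the delay doubles on each attempt and is capped by a maximum delay. The one-hour cap is an assumption about the default max_delay, and `reschedule_delays` is a hypothetical helper.

```python
def reschedule_delays(base_s, attempts, max_delay_s=3600):
    """Delay in seconds before each reschedule attempt.

    Starts at base_s, doubles every attempt, and never exceeds max_delay_s.
    """
    return [min(base_s * 2 ** i, max_delay_s) for i in range(attempts)]


if __name__ == "__main__":
    # Values from the job above: delay = "30s", attempts = 5
    print(reschedule_delays(30, 5))
```

For the values in the job this gives 30, 60, 120, 240, and 480 seconds, so the five attempts fit comfortably inside the one-hour interval.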
Running and Managing Jobs
# Run a service job
nomad run dodazip-web.nomad
# Expected output:
# ==> Monitoring evaluation "abc123"
# Evaluation triggered by job "dodazip-web"
# Allocation "alloc1" created: node "node1", group "web"
# Allocation "alloc2" created: node "node2", group "web"
# Allocation "alloc3" created: node "node3", group "web"
# ==> Evaluation status changed: "complete" (3/3 allocations ready)
# Check job status
nomad status dodazip-web
# Expected output:
# Name = dodazip-web
# Type = service
# Status = running
# Periodic = false
# Datacenters = dc1
# Allocations:
# ID Node Task Desired Status Created
# alloc1 node1 dodazip run running 1m ago
# alloc2 node2 dodazip run running 1m ago
# alloc3 node3 dodazip run running 1m ago
# View allocation logs
nomad logs -f alloc1
# Run a batch job with dispatch
nomad job dispatch -meta key=value image-processor
# Stop a job
nomad stop dodazip-web
# Expected output:
# ==> Monitoring evaluation "def456"
# Evaluation triggered by job "dodazip-web"
# Evaluation status changed: "complete"
# Plan a change (dry run)
nomad plan dodazip-web.nomad
# Expected output:
# + Job: "dodazip-web"
# + Task Group: "web" (3 create)
# Scheduler dry-run:
# - All tasks successfully allocated.
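The CLI commands above talk to the agent's HTTP API on port 4646, so the same operations can be scripted. The sketch below uses only the Python standard library and assumes the local dev agent address from the Installation step; the helper names are hypothetical. The paths follow Nomad's /v1/job/<id>/allocations and /v1/job/<id>/dispatch endpoints as I understand them, so check them against the API documentation for your version.

```python
import base64
import json
import urllib.request
from urllib.parse import quote

NOMAD_ADDR = "http://127.0.0.1:4646"  # assumption: local dev agent


def allocations_url(job_id, addr=NOMAD_ADDR):
    """URL that lists the allocations of a job."""
    return f"{addr}/v1/job/{quote(job_id, safe='')}/allocations"


def dispatch_request(job_id, meta, payload=b"", addr=NOMAD_ADDR):
    """Build (but do not send) a dispatch request for a parameterized job."""
    body = {
        "Meta": meta,
        # The dispatch payload is sent base64-encoded.
        "Payload": base64.b64encode(payload).decode("ascii"),
    }
    return urllib.request.Request(
        f"{addr}/v1/job/{quote(job_id, safe='')}/dispatch",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_allocations(job_id, addr=NOMAD_ADDR):
    """Send the request; needs a reachable Nomad agent."""
    with urllib.request.urlopen(allocations_url(job_id, addr)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(allocations_url("dodazip-web"))
    req = dispatch_request("image-processor", {"key": "value"}, b"{}")
    print(req.get_method(), req.full_url)
```

Passing the request returned by dispatch_request to urllib.request.urlopen would submit it. If ACLs are enabled, the request also needs a token header.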
Multi-Region Federation
# /etc/nomad.d/federation.hcl
server {
enabled = true
server_join {
retry_join = ["nomad-us-east-01:4648", "nomad-eu-west-01:4648"]
}
}
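# Job spanning several datacenters (to target another region, set the job's region; coordinated multi-region deployments use the Enterprise multiregion block)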
job "global-dodazip" {
datacenters = ["dc1", "dc2", "dc3"]
group "web" {
count = 2
task "dodazip" {
driver = "docker"
config {
image = "registry.dodatech.com/dodazip:2.5.0"
}
}
}
}
Common Configuration Mistakes
Not setting resource limits on tasks: A task without explicit CPU and memory values can starve other allocations on the same node. Always specify a resources block, for example cpu = 500 and memory = 256.
Using the raw_exec driver without security considerations: raw_exec runs tasks directly without container isolation. Use it only for trusted workloads and enable driver.raw_exec.enable explicitly.
Missing network block for service jobs: Jobs that serve traffic need a network block with ports. Without it, the service cannot receive traffic from other nodes.
Not configuring an update block for service jobs: Without rolling update settings, changes cause all instances to restart simultaneously, causing downtime. Use max_parallel = 1 with health checks.
Forgetting Vault token renewal: Nomad can renew Vault tokens, but the token must have sufficient TTL. Use Vault's nomad-cluster role with an appropriate period for automatic renewal.
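Several of the mistakes above can be caught before a job is submitted. The following is a minimal sketch of such a check over a job written as a Python dict. The key names (TaskGroups, Tasks, Resources, CPU, MemoryMB, Networks, Update, Driver) mirror Nomad's JSON job format as I understand it and should be treated as assumptions, and `lint_job` is a hypothetical helper, not a Nomad feature.

```python
def lint_job(job):
    """Return a list of warnings for common job configuration mistakes."""
    warnings = []
    is_service = job.get("Type", "service") == "service"
    for group in job.get("TaskGroups", []):
        gname = group.get("Name", "?")
        if is_service and not group.get("Networks"):
            warnings.append(f"group {gname}: no network block")
        if is_service and not (group.get("Update") or job.get("Update")):
            warnings.append(f"group {gname}: no update block")
        for task in group.get("Tasks", []):
            tname = task.get("Name", "?")
            res = task.get("Resources") or {}
            if not res.get("CPU") or not res.get("MemoryMB"):
                warnings.append(f"task {tname}: resources not set explicitly")
            if task.get("Driver") == "raw_exec":
                warnings.append(f"task {tname}: raw_exec has no container isolation")
    return warnings


if __name__ == "__main__":
    risky = {
        "Type": "service",
        "TaskGroups": [
            {"Name": "web", "Tasks": [{"Name": "app", "Driver": "raw_exec"}]}
        ],
    }
    for line in lint_job(risky):
        print(line)
```

The check only inspects what is written in the dict. Nomad fills in defaults when it parses a job, so a clean result means the values were set explicitly, not that they are well chosen.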
Practice Questions
What is the difference between service and batch job types in Nomad? Answer: Service jobs run continuously and Nomad keeps the desired count running. Batch jobs run to completion; Nomad restarts them on failure up to the restart attempt limit.
How does Nomad integrate with Consul? Answer: Nomad automatically registers service jobs with Consul for DNS-based discovery. Health checks defined in Nomad become Consul health checks.
What is a Nomad allocation? Answer: An allocation is a mapping between a task group in a job and a client node where it runs. It includes the resources assigned, task states, and network ports.
How does Nomad handle rolling updates? Answer: The update block controls rolling updates with the max_parallel, canary, auto_promote, and health_check parameters, gradually replacing allocations with zero downtime.
Challenge
Deploy a production Nomad cluster: set up 3 servers with Raft consensus; configure Consul integration for service discovery and Vault integration for secrets; write a service job for a web application with the Docker driver, health checks, resource limits, and a rolling update strategy; write a batch job for a data processing pipeline with a dispatch payload and retry logic; test canary deployments; and federate two Nomad datacenters across regions.
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.