Terraform Best Practices and Patterns: Production-Grade Infrastructure Code
Terraform best practices and production patterns combine state architecture, module design, security controls, team workflows, cost management, and code conventions that enable reliable infrastructure management at any scale.
What You'll Learn
In this tutorial, you will learn industry-standard Terraform best practices including state architecture patterns, module design principles, naming conventions, security controls, cost optimization, team collaboration workflows, code review processes, and operational runbooks.
Why It Matters
What works for a single developer fails at team scale. Monolithic state files cause slow plans and wide blast radius. Unversioned modules create unpredictable changes. Missing security controls expose infrastructure. These patterns prevent the most common production Terraform failures.
Real-World Use
DodaTech's platform team manages 500 Terraform resources across 30 environments using these patterns. Durga Antivirus Pro achieves sub-minute plan times, zero state corruption, full audit compliance, and automated cost tracking through these production practices.
State Architecture Patterns
graph TD
subgraph "State Isolation Strategy"
ENV[Environment Layer]
ENV --> DEV[dev/terraform.tfstate]
ENV --> STG[staging/terraform.tfstate]
ENV --> PRD[production/terraform.tfstate]
SVC[Service Layer]
SVC --> NET[network/terraform.tfstate]
SVC --> DB[database/terraform.tfstate]
SVC --> CMP[compute/terraform.tfstate]
REG[Registry Layer]
REG --> S3[S3 Backend]
REG --> DDB[DynamoDB Locks]
REG --> KMS[KMS Encryption]
end
style DEV fill:#50c878,color:#fff
style PRD fill:#e74c3c,color:#fff
style NET fill:#4a90d9,color:#fff
style DB fill:#ff9900,color:#fff
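The registry layer in the diagram (S3 backend, DynamoDB locks, KMS encryption) maps onto a single backend block. The following is a minimal sketch; the bucket name, region, lock table, and KMS alias are assumptions chosen for illustration, not values from this tutorial.

```hcl
# backend.tf (sketch): one backend block per environment/service directory
terraform {
  backend "s3" {
    bucket         = "dodatech-terraform-state" # hypothetical bucket name
    key            = "aws-account-123456789012/production/network/terraform.tfstate"
    region         = "us-east-1"                # assumed region
    encrypt        = true                       # server-side encryption of the state object
    kms_key_id     = "alias/terraform-state"    # hypothetical KMS key alias
    dynamodb_table = "terraform-locks"          # hypothetical lock table
  }
}
```

Only the key changes between directories, so each environment and service gets its own state object and its own lock.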
State File Structure
terraform-state/
  aws-account-123456789012/
    dev/
      network/terraform.tfstate
      database/terraform.tfstate
      compute/terraform.tfstate
    staging/
      network/terraform.tfstate
      database/terraform.tfstate
      compute/terraform.tfstate
    production/
      network/terraform.tfstate
      database/terraform.tfstate
      compute/terraform.tfstate
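Once state is split per service, a configuration such as compute needs a way to read values owned by another state, such as the network's VPC ID. One common option is the terraform_remote_state data source. This sketch assumes a hypothetical bucket name and region, and assumes the network configuration exposes vpc_id and private_subnet_ids as root-level outputs.

```hcl
# compute/main.tf (sketch): read outputs published by the network state
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "dodatech-terraform-state" # hypothetical bucket name
    key    = "aws-account-123456789012/production/network/terraform.tfstate"
    region = "us-east-1"                # assumed region
  }
}

locals {
  # Only root-level outputs of the network configuration are visible here
  vpc_id             = data.terraform_remote_state.network.outputs.vpc_id
  private_subnet_ids = data.terraform_remote_state.network.outputs.private_subnet_ids
}
```

The reader needs read access to the whole network state object, so teams that want tighter isolation may prefer to publish the values some other way, for example through SSM parameters.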
Module Design Patterns
Standard Module Interface
# modules/vpc/variables.tf
variable "name" {
  description = "Resource name prefix"
  type        = string

  validation {
    condition     = length(var.name) > 2
    error_message = "Name must be at least 3 characters."
  }
}

variable "environment" {
  description = "Deployment environment (dev, staging, production)"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be dev, staging, or production."
  }
}

# Used by main.tf and passed in by callers, so it must be declared here
variable "cidr_block" {
  description = "IPv4 CIDR block for the VPC"
  type        = string
}

variable "tags" {
  description = "Additional resource tags"
  type        = map(string)
  default     = {}
}
# modules/vpc/outputs.tf
output "vpc_id" {
  description = "The VPC identifier"
  value       = aws_vpc.this.id
}

output "public_subnet_ids" {
  description = "Public subnet identifiers"
  value       = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  description = "Private subnet identifiers"
  value       = aws_subnet.private[*].id
}

output "vpc_cidr" {
  description = "The VPC CIDR block"
  value       = aws_vpc.this.cidr_block
}
# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block           = var.cidr_block
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = merge(var.tags, {
    Name        = "${var.name}-vpc"
    Environment = var.environment
    ManagedBy   = "Terraform"
  })
}
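The outputs above reference aws_subnet.public and aws_subnet.private, which the excerpt does not show. A minimal sketch of how they might be defined follows. The availability_zones variable and the /24 sizing are assumptions for illustration, and the offset of 100 only works while there are fewer than 100 zones and the VPC block is /16 or larger.

```hcl
# modules/vpc/subnets.tf (sketch)
variable "availability_zones" {
  description = "Availability zones to spread subnets across (hypothetical input)"
  type        = list(string)
}

resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.this.id
  availability_zone       = var.availability_zones[count.index]
  # Adds 8 bits to the VPC prefix: with 10.0.0.0/16, index 0 yields 10.0.0.0/24
  cidr_block              = cidrsubnet(var.cidr_block, 8, count.index)
  map_public_ip_on_launch = true

  tags = merge(var.tags, {
    Name        = "${var.name}-public-${count.index}"
    Environment = var.environment
  })
}

resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.this.id
  availability_zone = var.availability_zones[count.index]
  # Offset by 100 so private ranges never overlap the public ones
  cidr_block        = cidrsubnet(var.cidr_block, 8, count.index + 100)

  tags = merge(var.tags, {
    Name        = "${var.name}-private-${count.index}"
    Environment = var.environment
  })
}
```

Because both resources use count, the splat expressions in outputs.tf return one subnet ID per availability zone.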
Module Composition with Data Flow
# production/main.tf
module "network" {
  source      = "github.com/dodatech/terraform-aws-vpc?ref=v2.1.0"
  name        = "production"
  environment = "production"
  cidr_block  = "10.0.0.0/16"
  tags        = { CostCenter = "platform" }
}

module "database" {
  source      = "github.com/dodatech/terraform-aws-rds?ref=v1.3.0"
  name        = "main"
  environment = "production"
  vpc_id      = module.network.vpc_id
  subnet_ids  = module.network.private_subnet_ids
  tags        = { CostCenter = "platform" }
}

module "application" {
  source      = "github.com/dodatech/terraform-aws-ecs?ref=v3.0.1"
  name        = "api"
  environment = "production"
  vpc_id      = module.network.vpc_id
  subnet_ids  = module.network.public_subnet_ids
  db_endpoint = module.database.endpoint
  tags        = { CostCenter = "platform" }
}
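Expected output: Terraform infers the dependency order from the references between modules, so it provisions the network first, the database second, and the application third. Each module receives validated, documented inputs, and every module carries the same CostCenter tag.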
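The modules above are pinned by Git tag. The same discipline applies to Terraform itself and to providers. This is a sketch with assumed version numbers; the private registry address is hypothetical and shows the alternative of pinning with a version constraint instead of a Git ref.

```hcl
# production/versions.tf (sketch)
terraform {
  required_version = "~> 1.6" # assumed: any 1.x release from 1.6 onward

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed: any 5.x release
    }
  }
}

# Alternative to a Git ref: a registry source with a version constraint.
# The registry address below is hypothetical.
module "network_from_registry" {
  source  = "app.terraform.io/dodatech/vpc/aws"
  version = "~> 2.1" # accepts 2.1 and later 2.x releases, never 3.0

  name        = "production"
  environment = "production"
  cidr_block  = "10.0.0.0/16"
}
```

The version argument only works with registry sources; Git sources are pinned through the ref query parameter as shown in production/main.tf.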
Naming Conventions
Resource Naming Standards
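# Consistent naming: {environment}-{service}-{resource}
# S3 bucket names are globally unique, so the bucket adds an organization prefix
resource "aws_s3_bucket" "logs" {
  bucket = "dodatech-${var.environment}-logs"
}

resource "aws_db_instance" "main" {
  identifier = "${var.environment}-${var.service}-db"
}

# AWS rejects security group names that begin with "sg-", so use a suffix instead
resource "aws_security_group" "web" {
  name = "${var.environment}-${var.service}-web-sg"
}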
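Repeating the prefix in every resource invites drift between names. One way to centralize it is a local value plus provider-level default tags. This is a sketch only: the region and the SQS queue are illustrative assumptions, not resources from this tutorial.

```hcl
# naming.tf (sketch)
variable "environment" {
  type = string
}

variable "service" {
  type = string
}

locals {
  # Single source of truth for the {environment}-{service} prefix
  name_prefix = "${var.environment}-${var.service}"
}

provider "aws" {
  region = "us-east-1" # assumed region

  # Applied to every taggable resource created through this provider
  default_tags {
    tags = {
      Environment = var.environment
      Service     = var.service
      ManagedBy   = "Terraform"
    }
  }
}

# Illustrative resource: the name resolves to {environment}-{service}-jobs
resource "aws_sqs_queue" "jobs" {
  name = "${local.name_prefix}-jobs"
}
```

With default_tags in place, individual resources only need to declare the tags that are specific to them.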
Directory Structure Standard
terraform/
  environments/
    dev/
    staging/
    production/
  modules/
    networking/
    database/
    compute/
    monitoring/
  global/
    iam/
    route53/
  scripts/
  tests/
Security Patterns
Least-Privilege IAM
# iam-patterns.tf
data "aws_iam_policy_document" "terraform_execution" {
  statement {
    effect = "Allow"
    actions = [
      "ec2:Describe*",
      "ec2:Create*",
      "ec2:Delete*",
      "s3:Get*",
      "s3:Put*",
      "s3:List*",
      "dynamodb:GetItem",
      "dynamodb:PutItem",
      "dynamodb:DeleteItem",
    ]
    # Narrow this to specific ARNs wherever the action supports resource-level permissions
    resources = ["*"]
  }
}

# Prevent destruction of critical resources
resource "aws_db_instance" "critical" {
  # Other arguments omitted for brevity

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket" "state" {
  # Other arguments omitted for brevity

  lifecycle {
    prevent_destroy = true
  }
}
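A policy with resources = ["*"] is broader than least privilege usually implies. As a sketch of tighter scoping, the statements below limit state backend access to one bucket and one lock table. Both ARNs are hypothetical and need to match the real backend.

```hcl
# iam-state-backend.tf (sketch)
locals {
  # Hypothetical ARNs: replace with the real state bucket and lock table
  state_bucket_arn = "arn:aws:s3:::dodatech-terraform-state"
  lock_table_arn   = "arn:aws:dynamodb:us-east-1:123456789012:table/terraform-locks"
}

data "aws_iam_policy_document" "state_backend" {
  statement {
    sid       = "ListStateBucket"
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = [local.state_bucket_arn]
  }

  statement {
    sid       = "ReadWriteStateObjects"
    effect    = "Allow"
    actions   = ["s3:GetObject", "s3:PutObject"]
    resources = ["${local.state_bucket_arn}/*"]
  }

  statement {
    sid       = "StateLocking"
    effect    = "Allow"
    actions   = ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:DeleteItem"]
    resources = [local.lock_table_arn]
  }
}
```

If the bucket is encrypted with a customer-managed KMS key, the role also needs permission to use that key.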
Secrets Management
# secrets.tf
# Generates the password referenced below (requires the hashicorp/random provider)
resource "random_password" "db" {
  length = 32
}

resource "aws_secretsmanager_secret" "db_password" {
  name = "${var.environment}-db-password"
}

resource "aws_secretsmanager_secret_version" "db_password" {
  secret_id     = aws_secretsmanager_secret.db_password.id
  secret_string = random_password.db.result
}
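One caveat with the pattern above is that the generated password is also stored in the Terraform state. A different technique, shown here as an alternative rather than as this tutorial's method, is to let RDS create and manage the master password in Secrets Manager. The engine, instance size, and username are placeholder assumptions.

```hcl
# rds-managed-secret.tf (sketch)
variable "environment" {
  type = string
}

resource "aws_db_instance" "main" {
  identifier        = "${var.environment}-main-db"
  engine            = "postgres"    # placeholder engine
  instance_class    = "db.t3.micro" # placeholder size
  allocated_storage = 20
  username          = "app_admin"   # placeholder username

  # RDS generates the password and stores it in Secrets Manager;
  # this argument cannot be combined with "password"
  manage_master_user_password = true

  skip_final_snapshot = true # acceptable for a sketch, not for production
}

# ARN of the secret that RDS created, for applications to read at runtime
output "db_master_secret_arn" {
  value = aws_db_instance.main.master_user_secret[0].secret_arn
}
```

With this approach the password value never passes through Terraform, so it does not appear in plans or state.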
Operational Patterns
Drift Detection
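#!/bin/bash
# drift-detection.sh
# Runs a read-only plan in each environment and reports drift.
ENVIRONMENTS=("dev" "staging" "production")
for ENV in "${ENVIRONMENTS[@]}"; do
  # Run in a subshell so the working directory resets for each environment
  (
    cd "environments/$ENV" || exit 1
    terraform init -input=false -backend-config=backend.hcl || exit 1
    # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present
    terraform plan -input=false -no-color -detailed-exitcode
  )
  case $? in
    0) echo "$ENV: No drift detected" ;;
    2) echo "ALERT: Drift detected in $ENV" | slack-notify ;;
    *) echo "$ENV: Error running plan" ;;
  esac
done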
Cost Estimation
# .github/workflows/infracost.yml (steps excerpt)
- name: Setup Infracost
  uses: infracost/actions/setup@v3
  with:
    api-key: ${{ secrets.INFRACOST_API_KEY }}
- name: Generate Cost Estimate
  run: |
    infracost breakdown --path . \
      --format=table \
      --show-skipped
Common Mistakes
1. Monolithic State Files
A single state file for everything can push plan times to several minutes and forces every team to contend for the same lock. Split state by environment and service.
2. Unversioned Module Sources
Using source = "./modules/vpc" means any module change affects all consumers immediately. Pin shared modules to a tagged release, for example with ?ref=v2.1.0 on a Git source, so consumers upgrade deliberately.
3. Ignoring Terraform Lock File
The .terraform.lock.hcl file records the exact provider versions and checksums selected by terraform init. Commit it to Git and never delete it; update it deliberately with terraform init -upgrade.
4. No Pre-Apply Plan Review
Applying without reviewing the plan risks letting destructive changes through unnoticed. Always require plan approval for production.
5. Missing Lifecycle Rules
Resources like databases and state buckets need prevent_destroy to guard against accidental deletion.
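Beyond prevent_destroy, two other lifecycle arguments often matter in production: create_before_destroy and ignore_changes. This sketch shows both on an Auto Scaling group; the input variables and sizes are hypothetical.

```hcl
# lifecycle-rules.tf (sketch)
variable "private_subnet_ids" {
  description = "Subnets for the group (hypothetical input)"
  type        = list(string)
}

variable "launch_template_id" {
  description = "Existing launch template ID (hypothetical input)"
  type        = string
}

resource "aws_autoscaling_group" "api" {
  # name_prefix avoids a name collision while old and new groups coexist
  name_prefix         = "production-api-"
  min_size            = 2
  max_size            = 6
  desired_capacity    = 2
  vpc_zone_identifier = var.private_subnet_ids

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }

  lifecycle {
    # Build the replacement group before removing the old one
    create_before_destroy = true
    # Scaling policies adjust this value at runtime; do not revert it on apply
    ignore_changes = [desired_capacity]
  }
}
```

Without ignore_changes, every apply would reset the group to the configured desired capacity and undo whatever autoscaling had done.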
Practice Questions
1. What is the recommended Terraform state architecture for production? Separate state files per environment and per service, stored in an encrypted S3 backend with DynamoDB locking.
2. Why should modules use semantic versioning? Versioning ensures consumers opt into changes explicitly. Breaking changes are communicated through major version bumps.
3. What three lifecycle rules should every production Terraform configuration include? prevent_destroy on critical resources, create_before_destroy on replacements, and ignore_changes on auto-managed attributes.
4. Challenge: Design a complete Terraform repository structure for a multi-service, multi-environment deployment with module versioning, remote state, CI/CD pipelines, drift detection, and cost estimation -- then implement the state backend infrastructure.
Mini Project: Production Terraform Repository
Create a complete Terraform repository with: separate directories per environment (dev, staging, production), separate state files per environment and service, a module registry with semantic versioning, a CI/CD pipeline with plan-only PR checks and approval gates, drift detection as a cron job, Infracost cost estimation, and Sentinel policy enforcement.
What's Next
Apply Terraform best practices to your infrastructure code, then study Production Patterns for advanced operational strategies. Explore DevOps workflows for team-scale infrastructure management.
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.