Kubernetes Logging: EFK Stack (Elasticsearch, Fluentd, Kibana)

DodaTech 3 min read

This tutorial covers Kubernetes logging with the EFK stack (Elasticsearch, Fluentd, Kibana), including key concepts, practical examples, and best practices.

The EFK stack (Elasticsearch, Fluentd, and Kibana) provides centralized logging for Kubernetes clusters: Fluentd collects container logs from every node, Elasticsearch stores and indexes them, and Kibana makes them searchable.

What You'll Learn

This tutorial covers deploying Fluentd as a DaemonSet for log collection, configuring Elasticsearch for storage, setting up Kibana for dashboards, structured logging with JSON, and log retention policies.

Why It Matters

Without centralized logging, debugging production issues means SSH-ing into nodes and searching through individual log files. With every log searchable in one place, mean time to resolution can drop from hours to minutes.

Real-World Use

DigitalOcean uses the EFK stack to aggregate logs from hundreds of thousands of containers across their Kubernetes platform. Wix uses Fluentd to collect over 10 terabytes of logs daily.

Fluentd DaemonSet Deployment

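Fluentd runs as a DaemonSet, so one collector pod runs on every node and reads that node's container logs. The manifest below assumes a fluentd ServiceAccount already exists in the logging namespace with permission to read pod and namespace metadata, and it mounts /var/lib/docker/containers, which applies to nodes that use the Docker runtime.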

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      serviceAccountName: fluentd
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.17-debian-elasticsearch
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.logging.svc.cluster.local"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: dockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: dockercontainers
        hostPath:
          path: /var/lib/docker/containers

# Create logging namespace
kubectl create namespace logging

# Deploy Fluentd
kubectl apply -f fluentd-daemonset.yaml

# Verify Fluentd pods on each node
kubectl -n logging get pods -o wide
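
The check in the last command can also be automated. The following is a minimal sketch, not part of the original tutorial, that compares the node list with the Fluentd pod list and reports nodes without a running collector. The function name and the sample data are hypothetical; the dictionaries only mimic the shape of kubectl get ... -o json output.

```python
def nodes_missing_pod(nodes, pods):
    """Return the names of nodes that have no Running pod in the pod list."""
    node_names = {item["metadata"]["name"] for item in nodes["items"]}
    covered = {
        item["spec"].get("nodeName")
        for item in pods["items"]
        if item.get("status", {}).get("phase") == "Running"
    }
    return sorted(node_names - covered)


# Hypothetical sample data shaped like `kubectl get ... -o json` output.
sample_nodes = {"items": [{"metadata": {"name": "node-a"}},
                          {"metadata": {"name": "node-b"}}]}
sample_pods = {"items": [
    {"spec": {"nodeName": "node-a"}, "status": {"phase": "Running"}},
    {"spec": {"nodeName": "node-b"}, "status": {"phase": "Pending"}},
]}

print(nodes_missing_pod(sample_nodes, sample_pods))  # ['node-b']
```

In practice the two dictionaries would come from parsing the JSON printed by kubectl get nodes -o json and kubectl -n logging get pods -o json. Nodes with taints that the DaemonSet does not tolerate would also show up as missing.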

Fluentd Configuration

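Fluentd configuration defines input sources, filters, and output destinations. The json parser below matches Docker's json-file log format; nodes running containerd or CRI-O write a different line format and need a CRI-compatible parser instead.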

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_key time
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

<match kubernetes.**>
  @type elasticsearch
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  logstash_format true
  logstash_prefix kubernetes-logs
</match>
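
Structured logging with JSON starts in the application. The following is a minimal sketch, assuming a Python service that logs to stdout; the JsonFormatter class, the build_logger helper, and the checkout service name are hypothetical. Each record becomes one JSON object per line, which is the shape a JSON-aware pipeline can split into fields. Whether Fluentd expands this nested JSON into separate fields depends on the parser and filter configuration in use.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record):
        payload = {
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)


def build_logger(name, stream=sys.stdout):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger


log = build_logger("checkout")  # "checkout" is a hypothetical service name
log.info("order %s accepted", 1234)
```

Writing to stdout rather than to a file matters here: the container runtime captures stdout and stderr into the node-level log files that the tail source reads.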

Elasticsearch Deployment

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: logging
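spec:
  # serviceName refers to a headless Service named elasticsearch, which must exist
  serviceName: elasticsearch
  # discovery.type=single-node below supports exactly one Elasticsearch node.
  # Running several replicas requires multi-node discovery settings instead.
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch  # must match spec.selector
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
        env:
        - name: discovery.type
          value: "single-node"
        # Assumption: security is disabled so that the plain-HTTP endpoints used
        # elsewhere in this tutorial work. Do not do this in production.
        - name: xpack.security.enabled
          value: "false"
        - name: ES_JAVA_OPTS
          value: "-Xms2g -Xmx2g"
        ports:
        - containerPort: 9200
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
  # volumeClaimTemplates belongs to the StatefulSet spec, not to a container
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 100Gi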

Configuring Index Lifecycle Management

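Index Lifecycle Management (ILM) keeps Elasticsearch from running out of disk space. The policy below rolls an index over once it reaches 50 GB or is one day old, and deletes indices 30 days after rollover. A policy only affects indices that reference it, typically through an index template.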

# Create ILM policy
PUT _ilm/policy/kubernetes-logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50GB",
            "max_age": "1d"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
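
To make the policy's thresholds concrete, here is a simplified model of the two decisions it encodes. This is an illustrative sketch only; the function names are hypothetical, and it is not how Elasticsearch implements ILM. In particular, ILM evaluates conditions periodically, so real actions lag the thresholds somewhat.

```python
from datetime import timedelta

GB = 1024 ** 3  # Elasticsearch byte-size units are powers of 1024

# Values copied from the policy above.
MAX_SIZE_BYTES = 50 * GB
MAX_AGE = timedelta(days=1)
DELETE_MIN_AGE = timedelta(days=30)


def should_rollover(primary_size_bytes, index_age):
    """Rollover fires when any one of the configured conditions is met."""
    return primary_size_bytes >= MAX_SIZE_BYTES or index_age >= MAX_AGE


def should_delete(time_since_rollover):
    """The delete phase starts once min_age has passed since rollover."""
    return time_since_rollover >= DELETE_MIN_AGE


print(should_rollover(10 * GB, timedelta(hours=25)))  # True: age condition met
print(should_rollover(10 * GB, timedelta(hours=2)))   # False: neither condition met
print(should_delete(timedelta(days=31)))              # True
```

The two rollover conditions are alternatives, so a busy cluster rolls over on size well before a day has passed, while a quiet one rolls over daily.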

Kibana Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
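spec:
  replicas: 1
  # apps/v1 Deployments require a selector that matches the pod template labels
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:8.11.0
        env:
        - name: ELASTICSEARCH_HOSTS
          value: "http://elasticsearch:9200"
        ports:
        - containerPort: 5601

# Access Kibana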
kubectl -n logging port-forward deployment/kibana 5601:5601

Open http://localhost:5601 in your browser and configure the kubernetes-logs-* index pattern (Kibana 8 calls index patterns data views).
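
The wildcard is needed because, with logstash_format true and logstash_prefix kubernetes-logs, the Fluentd Elasticsearch output writes to one index per day, named by default as the prefix followed by the date in YYYY.MM.DD form. The sketch below illustrates that naming and the wildcard match; the helper name and the example date are hypothetical, and the date format would differ if it were overridden in the Fluentd output configuration.

```python
from datetime import date
from fnmatch import fnmatchcase


def daily_index_name(prefix, day):
    """Build a daily index name in the default prefix-YYYY.MM.DD form."""
    return f"{prefix}-{day.strftime('%Y.%m.%d')}"


name = daily_index_name("kubernetes-logs", date(2024, 3, 5))
print(name)                                     # kubernetes-logs-2024.03.05
print(fnmatchcase(name, "kubernetes-logs-*"))   # True
print(fnmatchcase(name, "application-logs-*"))  # False
```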

Practice Questions

  1. Why is Fluentd deployed as a DaemonSet? It collects logs from every node in the cluster, so one pod per node is required.

  2. What is the purpose of the Fluentd pos_file? It tracks the current read position in log files to prevent duplicate reads after restarts.

  3. How does Elasticsearch prevent disk exhaustion? Index Lifecycle Management policies automatically delete indices older than the retention period.

  4. What does the kubernetes_metadata filter do in Fluentd? It enriches log entries with Kubernetes metadata like pod name, namespace, and labels.

  5. How do you search logs for a specific pod namespace? Use the kubernetes.namespace_name field in Kibana to filter logs by namespace.
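
The idea behind question 2 can be sketched in a few lines. The example below illustrates the position-file concept only: it stores a byte offset in a side file and resumes from it on the next read. The function name and file names are hypothetical, and Fluentd's real pos_file format and its handling of log rotation and partial lines differ.

```python
import os
import tempfile


def read_new_lines(log_path, pos_path):
    """Return lines appended since the last call, persisting the byte offset."""
    offset = 0
    if os.path.exists(pos_path):
        with open(pos_path) as pos:
            offset = int(pos.read().strip() or 0)
    with open(log_path, "rb") as log:
        log.seek(offset)
        data = log.read()
        new_offset = log.tell()
    with open(pos_path, "w") as pos:
        pos.write(str(new_offset))
    return data.decode("utf-8").splitlines()


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")
    pos_path = os.path.join(tmp, "app.log.pos")
    with open(log_path, "w") as f:
        f.write("first\n")
    print(read_new_lines(log_path, pos_path))  # ['first']
    with open(log_path, "a") as f:
        f.write("second\n")
    # A "restarted" reader resumes from the saved offset instead of re-reading.
    print(read_new_lines(log_path, pos_path))  # ['second']
```

Because the offset lives on disk rather than in memory, a collector that restarts picks up where it left off. This is why the DaemonSet keeps the pos_file under the host-mounted /var/log directory.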
