Monitoring Kubernetes: kube-state-metrics and cAdvisor

DodaTech Updated 2026-06-23 6 min read

In this tutorial, you'll learn how to monitor Kubernetes with kube-state-metrics and cAdvisor. We cover key concepts, practical examples, and best practices to help you apply them effectively.

What You Will Learn

This tutorial teaches you how to set up comprehensive monitoring for a Kubernetes cluster using kube-state-metrics for object-level metrics, cAdvisor for container metrics, and Prometheus for collection and alerting.

Why It Matters

Kubernetes is dynamic -- pods come and go, nodes scale, and workloads shift. Traditional monitoring tools cannot keep up. You need a monitoring stack designed for ephemeral infrastructure that automatically discovers new targets as they appear.

Real-World Use

The DodaTech infrastructure team manages 15 Kubernetes clusters across three regions. When a node failed in us-east-1, Prometheus detected the node condition change, kube-state-metrics showed the pod redistribution, and cAdvisor reported the resource pressure on remaining nodes -- all within 30 seconds of the failure.

Monitoring Kubernetes requires three layers: node-level metrics (cAdvisor for containers, Node Exporter for hosts), object-level metrics (kube-state-metrics for deployments, services, pods), and control plane metrics (API server, scheduler, controller manager). The Kubernetes monitoring ecosystem is built around Prometheus and its Kubernetes service discovery.


Prerequisites

  • A running Kubernetes cluster (local minikube or cloud-based)
  • Docker installed
  • Basic knowledge of kubectl commands
  • Familiarity with Prometheus basics (see Prometheus Introduction)

Step-by-Step Tutorial

Step 1: Deploy Prometheus Stack with Helm

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace

Expected output: A complete monitoring stack deployed including Prometheus, Alertmanager, Grafana, kube-state-metrics, and node-exporter.

Step 2: Verify the Deployment

kubectl get pods -n monitoring
kubectl get svc -n monitoring

Look for pods whose names contain kube-state-metrics and node-exporter, plus the Prometheus server pod itself. Exact pod names depend on the release name and chart version.

Step 3: Explore kube-state-metrics

kube-state-metrics generates metrics about Kubernetes objects. Port-forward and view:

kubectl port-forward -n monitoring svc/prometheus-kube-state-metrics 8080:8080
curl http://localhost:8080/metrics | head -30

Expected output: Metrics like kube_deployment_status_replicas, kube_pod_status_phase, kube_node_status_condition.
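To make the format of that output concrete, here is a minimal Python sketch that counts pods per phase from kube-state-metrics text. The sample lines and pod names are hypothetical, and the parser assumes the simple one-sample-per-line layout without trailing timestamps. kube-state-metrics emits one series per pod and phase, with value 1 for the current phase and 0 for the others, which is why the sketch only counts samples equal to 1.

```python
from collections import Counter

# Hypothetical sample of kube-state-metrics output (pod names are made up).
SAMPLE = """\
# HELP kube_pod_status_phase The pods current phase.
# TYPE kube_pod_status_phase gauge
kube_pod_status_phase{namespace="default",pod="web-1",phase="Running"} 1
kube_pod_status_phase{namespace="default",pod="web-1",phase="Pending"} 0
kube_pod_status_phase{namespace="default",pod="web-2",phase="Running"} 0
kube_pod_status_phase{namespace="default",pod="web-2",phase="Pending"} 1
kube_deployment_status_replicas{namespace="default",deployment="web"} 2
"""


def pods_by_phase(text):
    """Count pods per phase, using only samples whose value is 1."""
    counts = Counter()
    for line in text.splitlines():
        # Skip comments and every metric other than kube_pod_status_phase.
        if not line.startswith("kube_pod_status_phase{"):
            continue
        labels_part, value = line.rsplit(" ", 1)
        if float(value) != 1:
            continue  # value 0 means the pod is not in this phase
        phase = labels_part.split('phase="', 1)[1].split('"', 1)[0]
        counts[phase] += 1
    return dict(counts)


print(pods_by_phase(SAMPLE))
```

In practice Prometheus does this parsing for you; the sketch is only meant to show what the raw /metrics page encodes.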

Step 4: Key kube-state-metrics Queries

# Number of running pods
sum(kube_pod_status_phase{phase="Running"})

# Deployments with unavailable replicas
kube_deployment_status_replicas_unavailable > 0

# Node memory capacity
kube_node_status_capacity{resource="memory"}

# Pods by node
count by (node) (kube_pod_info)
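Queries like these can also be run outside the Prometheus UI through the HTTP API endpoint /api/v1/query. The following Python sketch assumes Prometheus has been port-forwarded to http://localhost:9090, which is an assumption and not something configured in the steps above. The helper names are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus is reachable here, e.g. through kubectl port-forward.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def extract_values(response):
    """Return (labels, value) pairs from an instant-query JSON response."""
    if response.get("status") != "success":
        raise RuntimeError("query failed: %s" % response.get("error"))
    return [
        (item["metric"], float(item["value"][1]))
        for item in response["data"]["result"]
    ]


def run_query(promql, base_url=PROMETHEUS_URL):
    """Send the query to Prometheus and return the parsed samples."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return extract_values(json.load(resp))


# Example call (needs a reachable Prometheus):
# for labels, value in run_query("count by (node) (kube_pod_info)"):
#     print(labels.get("node"), value)
```

Sample values arrive as strings in the JSON response, so the sketch converts them to floats before returning them.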

Step 5: Explore cAdvisor Metrics

cAdvisor is embedded in the kubelet. It exposes container-level metrics:

# Container CPU usage
rate(container_cpu_usage_seconds_total[5m])

# Container memory usage
container_memory_usage_bytes

# Container network receive rate
rate(container_network_receive_bytes_total[5m])

# Container filesystem usage
container_fs_usage_bytes
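The rate() calls above turn an ever-growing counter into a per-second value. The Python sketch below shows the core arithmetic on hypothetical samples of container_cpu_usage_seconds_total. It is a simplification: Prometheus additionally extrapolates to the edges of the time window, which this version leaves out.

```python
def simple_rate(samples):
    """Per-second increase of a counter.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Returns None when there are too few samples to compute a rate.
    """
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # Counter reset (e.g. container restart): count from zero again.
            increase += value
        else:
            increase += value - previous
        previous = value
    return increase / elapsed


# Hypothetical CPU counter scraped every 60 seconds over 5 minutes.
cpu_seconds = [(0, 100.0), (60, 130.0), (120, 160.0),
               (180, 190.0), (240, 220.0), (300, 250.0)]
print(simple_rate(cpu_seconds))  # CPU cores used on average over the window
```

For a CPU counter the result reads as cores: 150 CPU-seconds consumed over 300 seconds of wall time is half a core.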

Step 6: Monitor Kubernetes Control Plane

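If your cluster exposes control plane metrics, scrape the API server. The kube-prometheus-stack chart normally ships an equivalent job already, so this snippet mainly applies to a hand-configured Prometheus: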

# Add to Prometheus scrape config
- job_name: "kubernetes-apiservers"
  kubernetes_sd_configs:
    - role: endpoints
  scheme: https
  tls_config:
    # Verify the API server certificate against the in-cluster CA
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  authorization:
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https
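The relabel rule with action: keep decides which discovered endpoints survive. A minimal Python sketch of that logic, under the assumption that it mirrors Prometheus behavior closely enough for a literal pattern like this one: the source label values are joined with ";" (the default separator) and the regex must match the whole joined string. The function name and the sample targets are illustrative.

```python
import re


def keep_target(labels, source_labels, regex, separator=";"):
    """Return True if a discovered target passes a `keep` relabel rule."""
    # Missing labels contribute an empty string to the joined value.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # Prometheus anchors relabel regexes, so a full match is required.
    return re.fullmatch(regex, joined) is not None


SOURCE_LABELS = [
    "__meta_kubernetes_namespace",
    "__meta_kubernetes_service_name",
    "__meta_kubernetes_endpoint_port_name",
]

api_server = {
    "__meta_kubernetes_namespace": "default",
    "__meta_kubernetes_service_name": "kubernetes",
    "__meta_kubernetes_endpoint_port_name": "https",
}
# Hypothetical application endpoint that the rule should drop.
other = {
    "__meta_kubernetes_namespace": "default",
    "__meta_kubernetes_service_name": "my-app",
    "__meta_kubernetes_endpoint_port_name": "http",
}

print(keep_target(api_server, SOURCE_LABELS, "default;kubernetes;https"))
print(keep_target(other, SOURCE_LABELS, "default;kubernetes;https"))
```

Because the endpoints role discovers every endpoint in the cluster, this single rule is what narrows the job down to the API server.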

Step 7: Set Up ServiceMonitors for Custom Applications

ServiceMonitor is a custom resource defined by the Prometheus Operator. It tells Prometheus which Services to scrape and how:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-monitor
  namespace: monitoring
  labels:
    release: prometheus  # with default chart values, must match the Helm release name
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: http
      interval: 15s
  namespaceSelector:
    matchNames:
      - default

Apply it:

kubectl apply -f servicemonitor.yaml
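If the application does not show up as a target afterwards, the usual cause is a selector mismatch. The Python sketch below models how the ServiceMonitor's selector.matchLabels and namespaceSelector.matchNames pick Services; the function and the sample Services are hypothetical, and real selectors also support matchExpressions, which this sketch ignores.

```python
def selects(service_labels, service_namespace, match_labels, match_namespaces):
    """Return True if a Service is selected by the ServiceMonitor."""
    in_namespace = service_namespace in match_namespaces
    # Every key in matchLabels must be present on the Service with the same value.
    labels_match = all(
        service_labels.get(key) == value for key, value in match_labels.items()
    )
    return in_namespace and labels_match


MATCH_LABELS = {"app": "my-app"}   # spec.selector.matchLabels
MATCH_NAMESPACES = ["default"]     # spec.namespaceSelector.matchNames

# Selected: label and namespace both match.
print(selects({"app": "my-app", "tier": "web"}, "default",
              MATCH_LABELS, MATCH_NAMESPACES))
# Not selected: right label, wrong namespace.
print(selects({"app": "my-app"}, "staging", MATCH_LABELS, MATCH_NAMESPACES))
```

Note that the selector matches labels on the Service object, not on the pods, and that the endpoint's port field refers to the port name declared in that Service.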

Step 8: Create Kubernetes-Specific Alerts

groups:
  - name: kubernetes
    rules:
      - alert: PodNotRunning
        expr: kube_pod_status_phase{phase=~"Pending|Unknown|Failed"} > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} is not running"

      - alert: NodeNotReady
        expr: kube_node_status_condition{condition="Ready",status="true"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.node }} is not ready"

      - alert: HighPodRestartRate
        # rate() is per second; increase() counts restarts over the 10m window
        expr: increase(kube_pod_container_status_restarts_total[10m]) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} is restarting frequently"
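The for: field in each rule delays firing until the expression has been true continuously for that long. Here is a small Python simulation of that behavior, as a sketch: the function name and the evaluation timeline are hypothetical, and details such as Alertmanager grouping are outside its scope.

```python
def alert_states(evaluations, for_seconds):
    """Return the alert state after each rule evaluation.

    evaluations: list of (timestamp_seconds, condition_is_true), in time order.
    """
    states = []
    active_since = None
    for timestamp, condition in evaluations:
        if not condition:
            # Any false evaluation resets the timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held = timestamp - active_since
        states.append("firing" if held >= for_seconds else "pending")
    return states


# Hypothetical timeline, one evaluation per minute, with `for: 2m`.
timeline = [(0, True), (60, True), (120, False),
            (180, True), (240, True), (300, True)]
print(alert_states(timeline, for_seconds=120))
```

The brief recovery at 120 seconds resets the timer, so the alert only fires once the condition has held from 180 to 300 seconds. This is why a for: value filters out short blips such as a pod that is Pending for a few seconds while it is scheduled.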

Learning Path

flowchart LR
    A[Kubernetes Cluster] --> B[kube-state-metrics]
    A --> C[cAdvisor/kubelet]
    A --> D[Node Exporter]
    B --> E[Prometheus]
    C --> E
    D --> E
    E --> F[Grafana]
    E --> G[Alertmanager]
    B -.-> H[Deployment/Pod/Node metrics]
    C -.-> I[Container CPU/Memory/Network]
    style A fill:#4a90d9,color:#fff
    style E fill:#e67e22,color:#fff

Common Errors

  1. kube-state-metrics shows no data -- The service account does not have sufficient RBAC permissions. Check the ClusterRole bindings for kube-state-metrics.

  2. cAdvisor metrics are missing -- The kubelet is not configured for cAdvisor or the port is blocked. Verify kubelet is listening on port 10250.

  3. Pod replacement causes metric gaps -- Prometheus scrapes targets by pod IP. When a pod is deleted and recreated, its IP changes, and Prometheus picks up the new IP on the next service discovery refresh. Short gaps around the replacement are expected.

  4. Helm chart installation fails -- The Helm repository URL is wrong or the chart name is incorrect. Verify with helm search repo prometheus-community/kube-prometheus-stack.

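  5. ServiceMonitor does not appear in Prometheus -- Either the CRD was not installed or the ServiceMonitor's labels do not match the selector that Prometheus uses to pick up ServiceMonitors. Check that the kube-prometheus-stack CRDs are present with kubectl get crd, and that the ServiceMonitor carries the release label the chart expects.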

  6. Control plane scrape returns 403 -- Token authentication is not configured correctly. Verify the service account has permissions to access the API server metrics endpoint.

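  7. High memory usage from Prometheus in cluster -- The number of active time series is too high for the allocated memory. Reduce cardinality by dropping unused metrics or labels, or raise the Prometheus memory request and limit. Changing --storage.tsdb.retention.time mainly affects disk usage, not memory.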
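For the memory problem in item 7, it helps to know which metrics contribute the most series. Prometheus reports this through its /api/v1/status/tsdb endpoint. The Python sketch below ranks metric names from an already parsed response; the sample numbers are invented, and the field names reflect the response layout as I understand it, so treat them as an assumption to check against your Prometheus version.

```python
def top_metrics(tsdb_status, n=3):
    """Return the n metric names with the most series as (name, count) pairs."""
    entries = tsdb_status["data"]["seriesCountByMetricName"]
    ranked = sorted(entries, key=lambda entry: int(entry["value"]), reverse=True)
    return [(entry["name"], int(entry["value"])) for entry in ranked[:n]]


# Hypothetical, trimmed response body from /api/v1/status/tsdb.
sample_status = {
    "status": "success",
    "data": {
        "seriesCountByMetricName": [
            {"name": "kube_pod_status_phase", "value": 1500},
            {"name": "container_cpu_usage_seconds_total", "value": 4200},
            {"name": "kube_node_status_condition", "value": 90},
        ]
    },
}

for name, count in top_metrics(sample_status, n=2):
    print(name, count)
```

Metrics near the top of this list are the first candidates for dropping labels or for exclusion through metric relabeling.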


Practice Questions

  1. What does kube-state-metrics expose? Answer: Metrics about the state of Kubernetes objects: pods, deployments, nodes, services, namespaces, and other resources.

  2. Where does cAdvisor run in a Kubernetes cluster? Answer: cAdvisor is embedded in the kubelet binary on each node. It exposes container-level metrics through the kubelet API.

  3. What is a ServiceMonitor in the Prometheus Operator ecosystem? Answer: A custom resource that defines how Prometheus should scrape metrics from a set of Kubernetes services, including label selectors and port configuration.

  4. How does Prometheus discover targets in Kubernetes? Answer: Through kubernetes_sd_configs with roles like pod, service, endpoints, node, and ingress.

  5. What is the purpose of the kube-prometheus-stack Helm chart? Answer: It deploys a complete Prometheus monitoring stack for Kubernetes, including Prometheus, Alertmanager, Grafana, kube-state-metrics, and node-exporter with preconfigured dashboards and alerts.


Challenge

Deploy the kube-prometheus-stack Helm chart to a Kubernetes cluster with three worker nodes. Verify that kube-state-metrics exposes deployment, pod, and node metrics. Confirm cAdvisor metrics are available from each node. Create a custom ServiceMonitor for a sample application deployed with 3 replicas. Write Prometheus alerting rules for: node not ready (critical), pod in CrashLoopBackOff (critical), and node disk pressure (warning). Import the Kubernetes cluster Grafana dashboard (ID 315) and verify all panels show data. Generate load on one node and observe the dashboard change.


FAQ

Do I need kube-state-metrics for Kubernetes monitoring?

Yes. Without it, you only have container and node metrics from cAdvisor and Node Exporter. kube-state-metrics is essential for understanding the state of Kubernetes objects.

Is cAdvisor mandatory in Kubernetes?

cAdvisor is compiled into the kubelet. You do not need to install it separately. It exposes container metrics through the kubelet API by default.

Can I use Prometheus without the Prometheus Operator in Kubernetes?

Yes, you can deploy Prometheus as a standalone pod or deployment with manual configuration files. The Operator simplifies the process with CRDs.

What is the default retention for Prometheus in the Helm chart?

The default retention is 10 days. You can adjust it with the prometheus.prometheusSpec.retention Helm value.

Does this stack work with managed Kubernetes (EKS, GKE, AKS)?

Yes, it works with all major Kubernetes providers. You may need to adjust network policies or firewall rules for control plane access.
