
Create Effective Grafana Dashboards for Kubernetes Monitoring

SFEIR Institute

Key Takeaways

  • 75% of Kubernetes organizations use Prometheus and Grafana for monitoring
  • 6 steps to create operational dashboards in under 2 hours

TL;DR: This guide shows you how to create effective Grafana dashboards for Kubernetes monitoring in 6 steps: installing Grafana, connecting to Prometheus, creating cluster and pod visualizations, configuring alerts, and optimizing performance. You'll have operational dashboards in under 2 hours.

To deepen these skills, discover the LFD459 Kubernetes for Application Developers training.


Why Create Grafana Dashboards for Kubernetes?

Grafana Kubernetes metrics visualization has become the industry standard. According to Grafana Labs, 75% of organizations using Kubernetes adopt Prometheus and Grafana for their monitoring. This combination provides complete visibility into your cluster health.

"Kubernetes is no longer experimental but foundational. Soon, it will be essential to AI as well." - Chris Aniszczyk, CNCF State of Cloud Native 2026

A well-designed dashboard allows you to:

  • Detect anomalies before they impact production
  • Reduce MTTR (Mean Time To Recovery) by 40 to 60%
  • Correlate metrics between infrastructure and applications
Key takeaway: Without adapted dashboards, you're monitoring without understanding. Grafana transforms raw metrics into actionable insights.

Prerequisites

Before starting, verify these elements:

Component    Minimum Version   Verification
Kubernetes   1.28+             kubectl version
Helm         3.12+             helm version
Prometheus   Installed         kubectl get pods -n monitoring -l app=prometheus
kubectl      Configured        kubectl cluster-info

Note: the --short flag was removed from kubectl version in 1.28; plain kubectl version now prints the concise output.

Verify your cluster:

kubectl get nodes
# Expected result:
# NAME      STATUS   ROLES           AGE   VERSION
# master    Ready    control-plane   30d   v1.29.2
# worker1   Ready    <none>          30d   v1.29.2
# worker2   Ready    <none>          30d   v1.29.2
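The 1.28+ floor from the prerequisites table can also be checked in a script. A minimal sketch, where the `version_ge` helper and the hard-coded version string are illustrative; in practice you would feed it the version reported by your own cluster:

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B (relies on sort -V)
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Illustrative check against the 1.28+ floor; substitute the version
# your cluster actually reports (e.g. parsed from kubectl version -o json).
NODE_VERSION="1.29.2"
if version_ge "$NODE_VERSION" "1.28.0"; then
  echo "Kubernetes $NODE_VERSION meets the 1.28+ requirement"
else
  echo "Kubernetes $NODE_VERSION is too old" >&2
  exit 1
fi
```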

If Prometheus is not installed, consult our guide Deploy the complete kube-prometheus stack in production environment.


Step 1: Install Grafana on Kubernetes with Helm

1.1 Add the Helm Grafana repository

Run these commands to configure Helm:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Expected result:
# "grafana" has been added to your repositories
# Hang tight while we grab the latest from your chart repositories...
# ...Successfully got an update from the "grafana" chart repository

1.2 Create the monitoring namespace

kubectl create namespace monitoring --dry-run=client -o yaml | kubectl apply -f -
# Expected result:
# namespace/monitoring created (or unchanged if existing)

1.3 Deploy Grafana with optimized configuration

Create the grafana-values.yaml file:

# grafana-values.yaml
persistence:
  enabled: true
  size: 10Gi

adminPassword: "YourSecurePassword123!"

datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-server.monitoring.svc.cluster.local
        access: proxy
        isDefault: true

dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
      - name: 'default'
        folder: ''
        type: file
        disableDeletion: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards/default

resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
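Note that dashboardProviders only declares where dashboards live; to actually ship a dashboard with the chart, pair it with a dashboards entry. A sketch using the chart's gnetId mechanism (the dashboard ID 315 is a community dashboard from Grafana.com, and the datasource name must match the one declared above):

```yaml
# Optional addition to grafana-values.yaml: pre-load a community dashboard
dashboards:
  default:
    kubernetes-cluster:
      gnetId: 315          # Grafana.com dashboard ID
      datasource: Prometheus
```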

Install Grafana:

helm install grafana grafana/grafana \
--namespace monitoring \
--values grafana-values.yaml \
--version 7.3.0
# Expected result:
# NAME: grafana
# NAMESPACE: monitoring
# STATUS: deployed
# REVISION: 1

1.4 Verify deployment

kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana
# Expected result:
# NAME                       READY   STATUS    RESTARTS   AGE
# grafana-7d5b6b8f4c-x2kj9   1/1     Running   0          2m
Key takeaway: Persistence is essential. Without PVC, your dashboards disappear when the pod restarts.

Step 2: Configure Prometheus Data Source

2.1 Access the Grafana interface

Expose Grafana temporarily:

kubectl port-forward svc/grafana -n monitoring 3000:80 &
# Expected result:
# Forwarding from 127.0.0.1:3000 -> 3000

Access http://localhost:3000 with credentials:

  • User: admin
  • Password: YourSecurePassword123!

2.2 Verify Prometheus connection

If you used the grafana-values.yaml above, Prometheus is already configured. Verify the connection:

  1. Go to Configuration → Data Sources
  2. Click on Prometheus
  3. Click on Test
# Expected result:
# ✓ Data source is working

If the connection fails, verify the Prometheus service URL:

kubectl get svc -n monitoring | grep prometheus
# Expected result:
# prometheus-server   ClusterIP   10.96.45.123   <none>   80/TCP   30d

For teams preparing for CKAD certification, mastering these interconnections is covered in the LFD459 Kubernetes for Application Developers training.


Step 3: Create a Cluster Overview Dashboard

3.1 Create a new dashboard

  1. Click on + → New Dashboard
  2. Click on Add visualization
  3. Select Prometheus as the source

3.2 Add cluster CPU panel

Configure the PromQL query:

sum(rate(container_cpu_usage_seconds_total{namespace!="kube-system"}[5m])) by (namespace)

Panel parameters:

Parameter       Value
Title           CPU by namespace
Visualization   Time series
Legend          {{namespace}}
Unit            short (the query returns CPU cores, not a 0-100 percentage)
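The query above charts absolute core usage. A useful companion panel relates usage to what was requested; this sketch assumes kube-state-metrics is installed and exposes kube_pod_container_resource_requests:

```promql
# CPU usage as a percentage of requested CPU, per namespace
100 *
  sum(rate(container_cpu_usage_seconds_total{namespace!="kube-system"}[5m])) by (namespace)
/
  sum(kube_pod_container_resource_requests{resource="cpu", namespace!="kube-system"}) by (namespace)
```

With this variant, a percent (0-100) unit is appropriate.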

3.3 Add cluster memory panel

sum(container_memory_working_set_bytes{namespace!="kube-system"}) by (namespace) / 1024 / 1024 / 1024

Configuration:

  • Title: Memory by namespace (GiB)
  • Unit: gibibytes

3.4 Add pods by state

sum(kube_pod_status_phase) by (phase)

Create a stat panel with these values:

# Panel configuration
Visualization: Stat
Calculation: Last
Color mode: Value
Graph mode: None
Text mode: Value and name
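For a stat panel that surfaces only problem pods, a narrower variant of the same metric, filtered to non-healthy phases, pairs well with red thresholds:

```promql
# Count only pods in non-healthy phases
sum(kube_pod_status_phase{phase=~"Pending|Failed|Unknown"}) by (phase)
```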

According to the CNCF 2025 report, 82% of container users run Kubernetes in production, making this type of monitoring essential.

Key takeaway: Always start with a global view before drilling down to pod level. This top-down approach accelerates diagnosis.

Step 4: Create a Pod Monitoring Dashboard

4.1 Dashboard with dynamic variables

Add variables to filter dynamically:

  1. Go to Dashboard Settings → Variables
  2. Create the namespace variable:
Name: namespace
Type: Query
Data source: Prometheus
Query: label_values(kube_pod_info, namespace)
Refresh: On dashboard load
  3. Create the pod variable:
Name: pod
Type: Query
Query: label_values(kube_pod_info{namespace="$namespace"}, pod)
Refresh: On time range change
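If you enable Multi-value or Include All on these variables, switch the panel queries from equality to regex matching so multiple selections resolve correctly, for example:

```promql
# Regex match (=~) supports multi-select and the special "All" value
sum(rate(container_cpu_usage_seconds_total{namespace=~"$namespace", pod=~"$pod"}[5m])) by (container)
```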

4.2 CPU per pod panel

sum(rate(container_cpu_usage_seconds_total{namespace="$namespace", pod="$pod"}[5m])) by (container)

4.3 Memory per pod panel

sum(container_memory_working_set_bytes{namespace="$namespace", pod="$pod"}) by (container) / 1024 / 1024

4.4 Network I/O panel

# Received bytes
sum(rate(container_network_receive_bytes_total{namespace="$namespace", pod="$pod"}[5m]))

# Transmitted bytes
sum(rate(container_network_transmit_bytes_total{namespace="$namespace", pod="$pod"}[5m]))

Use a graph with two series:

  • Receive bytes: green color
  • Transmit bytes: blue color
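Alongside throughput, cAdvisor also exposes error counters, which often reveal network problems before throughput graphs do. A sketch of an additional panel pair:

```promql
# Receive/transmit errors per second for the selected pod
sum(rate(container_network_receive_errors_total{namespace="$namespace", pod="$pod"}[5m]))
sum(rate(container_network_transmit_errors_total{namespace="$namespace", pod="$pod"}[5m]))
```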

For diagnosing pod problems, see Debug a pod in CrashLoopBackOff on Kubernetes.


Step 5: Configure Grafana Alerts

5.1 Create a CPU alert rule

  1. Edit a CPU panel
  2. Go to the Alert tab
  3. Click on Create alert rule

Alert configuration:

Alert name: High CPU Usage
Evaluate every: 1m
For: 5m
Condition: WHEN avg() OF query(A, 5m, now) IS ABOVE 80
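If you prefer to keep alerting rules next to the metrics, the same threshold can be expressed as a Prometheus alerting rule instead of a Grafana one. A hedged equivalent, where the group name, labels, and the 0.8-core threshold are illustrative and should be scaled to your node size:

```yaml
# prometheus-rules.yaml -- illustrative Prometheus-side alternative
groups:
  - name: kubernetes-cpu
    rules:
      - alert: HighCPUUsage
        expr: sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace) > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage in namespace {{ $labels.namespace }}"
```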

5.2 Configure a contact point

# Example Slack webhook configuration
kubectl create secret generic grafana-slack-webhook \
--from-literal=url='https://hooks.slack.com/services/XXX/YYY/ZZZ' \
-n monitoring

In Grafana:

  1. Alerting → Contact points → New contact point
  2. Type: Slack
  3. Webhook URL: paste the Slack webhook URL (the Kubernetes secret above is not injected into the UI automatically; to keep the URL out of Grafana's database, provision the contact point from a file that reads the mounted secret)

5.3 Verify configured alerts

List the rules through the Grafana HTTP API (with the port-forward from step 2.1 still running):

curl -s -u admin:YourSecurePassword123! \
http://localhost:3000/api/v1/provisioning/alert-rules
# Expected result:
# A JSON array containing the rules you created, including "High CPU Usage"
"Don't let your knowledge remain theoretical - set up a real Kubernetes environment to solidify your skills." - TealHQ Kubernetes DevOps Guide

Step 6: Optimize Dashboard Performance

6.1 Reduce query cardinality

Bad practice:

# ❌ Explosive cardinality
container_cpu_usage_seconds_total

Good practice:

# ✓ Immediate aggregation
sum by (namespace, pod) (rate(container_cpu_usage_seconds_total[5m]))
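For dashboards that run this aggregation on every refresh, a Prometheus recording rule precomputes it once and lets panels query the cheap result. A sketch, where the group and rule names are illustrative:

```yaml
# recording-rules.yaml -- precompute the aggregation used by the dashboard
groups:
  - name: cpu-recording
    interval: 30s
    rules:
      - record: namespace_pod:container_cpu_usage:rate5m
        expr: sum by (namespace, pod) (rate(container_cpu_usage_seconds_total[5m]))
```

Panels can then query namespace_pod:container_cpu_usage:rate5m directly.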

6.2 Configure caching

Add these parameters to grafana-values.yaml:

grafana.ini:
  dataproxy:
    timeout: 30
    keep_alive_seconds: 30

Note: Grafana's query caching backend (the caching section of grafana.ini) is an Enterprise feature; on the OSS edition, rely on aggregated queries and Prometheus-side recording rules instead.

6.3 Define appropriate refresh intervals

Dashboard type       Recommended interval
Real-time view       10s
Normal operations    30s
Historical reports   5m

Configure in Dashboard Settings:

Auto-refresh: 30s
Time range: Last 6 hours
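In exported dashboard JSON, these two settings map to top-level fields, which is convenient when you keep dashboards under version control:

```json
{
  "refresh": "30s",
  "time": { "from": "now-6h", "to": "now" }
}
```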

For a complete approach to Kubernetes Monitoring and Troubleshooting, explore our other practical guides.


Troubleshooting Common Issues

Dashboard doesn't load data

Verify Prometheus connectivity:

kubectl exec -it $(kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana -o jsonpath='{.items[0].metadata.name}') \
-n monitoring -- wget -qO- http://prometheus-server.monitoring.svc.cluster.local/api/v1/status/runtimeinfo
# Expected result:
# {"status":"success","data":{...}}

PromQL queries too slow

Analyze with:

# Check cardinality
count by (__name__)({__name__=~".+"})

If a metric exceeds 100,000 series, add filters:

# Filter by namespace
sum(rate(container_cpu_usage_seconds_total{namespace=~"prod|staging"}[5m]))
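To decide which metrics are worth filtering first, rank them by active series count:

```promql
# Top 10 metrics by series count
topk(10, count by (__name__)({__name__=~".+"}))
```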

Grafana pod in CrashLoopBackOff

kubectl logs -n monitoring -l app.kubernetes.io/name=grafana --tail=50
# Look for permission or PVC errors

Common solution when the PVC itself is corrupted (warning: deleting the PVC erases saved dashboards, so export anything you need first):

kubectl delete pvc grafana -n monitoring
helm upgrade grafana grafana/grafana -n monitoring --values grafana-values.yaml

Consult the complete guide Debug a pod in CrashLoopBackOff for more complex cases.


Recommended Community Dashboards

Import these dashboards from Grafana.com:

ID      Name                            Usage
315     Kubernetes cluster monitoring   Global view
13332   kube-state-metrics v2           K8s object states
6417    Kubernetes Pods                 Pod detail
14205   Node Exporter Full              System metrics

Import them via Dashboards → Import by entering the dashboard ID (for example 315) and selecting your Prometheus data source. To keep a copy under version control, download the JSON first:

curl -s https://grafana.com/api/dashboards/315/revisions/latest/download \
-o cluster-dashboard.json

then paste its contents into the Import screen.

To master the complete observability stack, see our guide Complete guide: install and configure Prometheus on Kubernetes.


Take Action: Get Kubernetes Monitoring Training

Creating effective Grafana dashboards for Kubernetes monitoring is an essential skill for any infrastructure engineer or Cloud-Native developer. With 82% of organizations using Kubernetes in production (CNCF 2025), this expertise positions you in a market where average salary reaches $152,640/year (Ruby On Remote).

"Demand and salaries for highly-skilled and qualified tech talent are fiercer than ever, and certifications present a clear pathway for IT professionals to further their careers." - Hired CTO via Splunk

Recommended trainings:

  • LFD459 Kubernetes for Application Developers

Explore our other resources:

  • Deploy the complete kube-prometheus stack in production environment
  • Debug a pod in CrashLoopBackOff on Kubernetes
  • Complete guide: install and configure Prometheus on Kubernetes

Contact our advisors to plan your Kubernetes skills development.