Key Takeaways
- ✓ 75% of Kubernetes organizations use Prometheus and Grafana for monitoring
- ✓ 6 steps to create operational dashboards in under 2 hours
TL;DR: This guide shows you how to create effective Grafana dashboards for Kubernetes monitoring in 6 steps: installing Grafana, connecting to Prometheus, creating cluster and pod visualizations, configuring alerts, and optimizing performance. You'll have operational dashboards in under 2 hours.
To deepen these skills, discover the LFD459 Kubernetes for Application Developers training.
Why Create Grafana Dashboards for Kubernetes?
Grafana Kubernetes metrics visualization has become the industry standard. According to Grafana Labs, 75% of organizations using Kubernetes adopt Prometheus and Grafana for their monitoring. This combination provides complete visibility into your cluster health.
"Kubernetes is no longer experimental but foundational. Soon, it will be essential to AI as well." - Chris Aniszczyk, CNCF State of Cloud Native 2026
A well-designed dashboard allows you to:
- Detect anomalies before they impact production
- Reduce MTTR (Mean Time To Recovery) by 40 to 60%
- Correlate metrics between infrastructure and applications
Key takeaway: Without adapted dashboards, you're monitoring without understanding. Grafana transforms raw metrics into actionable insights.
Prerequisites
Before starting, verify these elements:
| Component | Minimum Version | Verification |
|---|---|---|
| Kubernetes | 1.28+ | kubectl version |
| Helm | 3.12+ | helm version |
| Prometheus | Installed | kubectl get pods -n monitoring -l app=prometheus |
| kubectl | Configured | kubectl cluster-info |
Verify your cluster:
kubectl get nodes
# Expected result:
# NAME STATUS ROLES AGE VERSION
# master Ready control-plane 30d v1.29.2
# worker1 Ready <none> 30d v1.29.2
# worker2 Ready <none> 30d v1.29.2
If Prometheus is not installed, consult our guide Deploy the complete kube-prometheus stack in production environment.
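The readiness check above can be scripted; here is a minimal Python sketch (a hypothetical helper, not part of any official tooling) that parses the plain-text output of `kubectl get nodes`:

```python
def all_nodes_ready(kubectl_output: str) -> bool:
    """Return True if every node listed by `kubectl get nodes` reports Ready."""
    lines = kubectl_output.strip().splitlines()[1:]  # skip the header row
    statuses = [line.split()[1] for line in lines if line.strip()]
    return bool(statuses) and all(status == "Ready" for status in statuses)

sample = """NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   30d   v1.29.2
worker1  Ready    <none>          30d   v1.29.2
worker2  Ready    <none>          30d   v1.29.2"""
print(all_nodes_ready(sample))  # True
```

In practice, `kubectl get nodes -o json` plus a JSON parser is more robust than splitting columns, but the column form matches the output shown above.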
Step 1: Install Grafana on Kubernetes with Helm
1.1 Add the Helm Grafana repository
Run these commands to configure Helm:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Expected result:
# "grafana" has been added to your repositories
# Hang tight while we grab the latest from your chart repositories...
# ...Successfully got an update from the "grafana" chart repository
1.2 Create the monitoring namespace
kubectl create namespace monitoring --dry-run=client -o yaml | kubectl apply -f -
# Expected result:
# namespace/monitoring created (or unchanged if existing)
1.3 Deploy Grafana with optimized configuration
Create the values.yaml file:
# grafana-values.yaml
persistence:
  enabled: true
  size: 10Gi
adminPassword: "YourSecurePassword123!"
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-server.monitoring.svc.cluster.local
        access: proxy
        isDefault: true
dashboardProviders:
  dashboardproviders.yaml:
    apiVersion: 1
    providers:
      - name: 'default'
        folder: ''
        type: file
        disableDeletion: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards/default
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
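Before installing, it is worth sanity-checking the resources block: a request above its limit is rejected by the API server. A small Python sketch (an assumed helper with simplified Kubernetes quantity parsing, covering only the suffixes used in this guide):

```python
# Simplified quantity parsing: "m" for millicores, Ki/Mi/Gi for memory.
UNITS = {"m": 1e-3, "Ki": 2**10, "Mi": 2**20, "Gi": 2**30}

def parse_quantity(q: str) -> float:
    for suffix, factor in UNITS.items():
        if q.endswith(suffix):
            return float(q[: -len(suffix)]) * factor
    return float(q)

resources = {
    "requests": {"cpu": "100m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}

for key in ("cpu", "memory"):
    request, limit = resources["requests"][key], resources["limits"][key]
    assert parse_quantity(request) <= parse_quantity(limit), f"{key} request exceeds limit"
print("requests fit within limits")
```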
Install Grafana:
helm install grafana grafana/grafana \
--namespace monitoring \
--values grafana-values.yaml \
--version 7.3.0
# Expected result:
# NAME: grafana
# NAMESPACE: monitoring
# STATUS: deployed
# REVISION: 1
1.4 Verify deployment
kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana
# Expected result:
# NAME READY STATUS RESTARTS AGE
# grafana-7d5b6b8f4c-x2kj9 1/1 Running 0 2m
Key takeaway: Persistence is essential. Without PVC, your dashboards disappear when the pod restarts.
Step 2: Configure Prometheus Data Source
2.1 Access the Grafana interface
Expose Grafana temporarily:
kubectl port-forward svc/grafana -n monitoring 3000:80 &
# Expected result:
# Forwarding from 127.0.0.1:3000 -> 3000
Access http://localhost:3000 with credentials:
- User: admin
- Password: YourSecurePassword123!
2.2 Verify Prometheus connection
If you used the values.yaml above, Prometheus is already configured. Verify the connection:
- Go to Configuration → Data Sources (Connections → Data sources in Grafana 10+)
- Click on Prometheus
- Click on Test
# Expected result:
# ✓ Data source is working
If the connection fails, verify the Prometheus service URL:
kubectl get svc -n monitoring | grep prometheus
# Expected result:
# prometheus-server ClusterIP 10.96.45.123 <none> 80/TCP 30d
For teams preparing for CKAD certification, mastering these interconnections is covered in the LFD459 Kubernetes for Application Developers training.
Step 3: Create a Cluster Overview Dashboard
3.1 Create a new dashboard
- Click on + → New Dashboard
- Click on Add visualization
- Select Prometheus as the source
3.2 Add cluster CPU panel
Configure the PromQL query:
sum(rate(container_cpu_usage_seconds_total{namespace!="kube-system"}[5m])) by (namespace)
Panel parameters:
| Parameter | Value |
|---|---|
| Title | CPU by namespace |
| Visualization | Time series |
| Legend | {{namespace}} |
| Unit | none (the query returns CPU cores, not a percentage) |
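Queries like this are often assembled in provisioning scripts; here is a small sketch (a hypothetical helper function) that builds the panel's PromQL string:

```python
def cpu_by_namespace(exclude: str = "kube-system", window: str = "5m") -> str:
    """Build the cluster-CPU PromQL query, excluding one namespace."""
    return (
        f'sum(rate(container_cpu_usage_seconds_total{{namespace!="{exclude}"}}[{window}]))'
        " by (namespace)"
    )

print(cpu_by_namespace())
```

Note the doubled braces in the f-string: `{{` and `}}` emit the literal `{`/`}` that PromQL label selectors require.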
3.3 Add cluster memory panel
sum(container_memory_working_set_bytes{namespace!="kube-system"}) by (namespace) / 1024 / 1024 / 1024
Configuration:
- Title: Memory by namespace (GiB)
- Unit: gibibytes
3.4 Add pods by state
sum(kube_pod_status_phase) by (phase)
Create a stat panel with these values:
# Panel configuration
Visualization: Stat
Calculation: Last
Color mode: Value
Graph mode: None
Text mode: Value and name
According to the CNCF 2025 report, 82% of container users run Kubernetes in production, making this type of monitoring essential.
Key takeaway: Always start with a global view before drilling down to pod level. This top-down approach accelerates diagnosis.
Step 4: Create a Pod Monitoring Dashboard
4.1 Dashboard with dynamic variables
Add variables to filter dynamically:
- Go to Dashboard Settings → Variables
- Create the namespace variable:
Name: namespace
Type: Query
Data source: Prometheus
Query: label_values(kube_pod_info, namespace)
Refresh: On dashboard load
- Create the pod variable:
Name: pod
Type: Query
Query: label_values(kube_pod_info{namespace="$namespace"}, pod)
Refresh: On time range change
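Under the hood, `label_values()` resolves to Prometheus' label-values endpoint. This sketch (an illustration of the API shape, not Grafana's actual code) builds the equivalent URLs:

```python
from urllib.parse import urlencode

def label_values_url(base: str, label: str, selector: str = "") -> str:
    """URL for Prometheus' /api/v1/label/<name>/values endpoint,
    optionally filtered with a match[] series selector."""
    url = f"{base}/api/v1/label/{label}/values"
    if selector:
        url += "?" + urlencode({"match[]": selector})
    return url

base = "http://prometheus-server.monitoring.svc.cluster.local"
print(label_values_url(base, "namespace"))
print(label_values_url(base, "pod", 'kube_pod_info{namespace="prod"}'))
```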
4.2 CPU per pod panel
sum(rate(container_cpu_usage_seconds_total{namespace="$namespace", pod="$pod"}[5m])) by (container)
4.3 Memory per pod panel
sum(container_memory_working_set_bytes{namespace="$namespace", pod="$pod"}) by (container) / 1024 / 1024
4.4 Network I/O panel
# Received bytes
sum(rate(container_network_receive_bytes_total{namespace="$namespace", pod="$pod"}[5m]))
# Transmitted bytes
sum(rate(container_network_transmit_bytes_total{namespace="$namespace", pod="$pod"}[5m]))
Use a graph with two series:
- Receive bytes: green color
- Transmit bytes: blue color
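Grafana renders these rates with IEC units; the sketch below (a rough re-implementation for illustration, not Grafana's formatting code) shows the conversion its "bytes/sec" unit applies:

```python
def humanize_rate(bytes_per_sec: float) -> str:
    """Render a raw bytes/sec value with IEC units, Grafana-style."""
    for unit in ("B/s", "KiB/s", "MiB/s", "GiB/s"):
        if bytes_per_sec < 1024:
            return f"{bytes_per_sec:.1f} {unit}"
        bytes_per_sec /= 1024
    return f"{bytes_per_sec:.1f} TiB/s"

print(humanize_rate(3_250_000))  # "3.1 MiB/s"
print(humanize_rate(512))        # "512.0 B/s"
```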
For diagnosing pod problems, see Debug a pod in CrashLoopBackOff on Kubernetes.
Step 5: Configure Grafana Alerts
5.1 Create a CPU alert rule
- Edit a CPU panel
- Go to the Alert tab
- Click on Create alert rule
Alert configuration:
Alert name: High CPU Usage
Evaluate every: 1m
For: 5m
Condition: WHEN avg() OF query(A, 5m, now) IS ABOVE 80
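The `For: 5m` clause is what separates Pending from Alerting. This toy state machine (a simplified model of Grafana's evaluation loop, not its real implementation) replays one sample per evaluation interval:

```python
def alert_state(samples, threshold=80.0, for_minutes=5, step_minutes=1):
    """Fire only after the condition has held for `for_minutes` straight."""
    breached = 0
    state = "Normal"
    for value in samples:
        breached = breached + step_minutes if value > threshold else 0
        if breached >= for_minutes:
            state = "Alerting"
        elif breached > 0:
            state = "Pending"
        else:
            state = "Normal"
    return state

print(alert_state([85, 90, 88, 92, 95]))  # "Alerting": breach held 5 minutes
print(alert_state([85, 90, 60, 92, 95]))  # "Pending": the dip to 60 reset the timer
```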
5.2 Configure a contact point
# Example Slack webhook configuration
kubectl create secret generic grafana-slack-webhook \
--from-literal=url='https://hooks.slack.com/services/XXX/YYY/ZZZ' \
-n monitoring
In Grafana:
- Alerting → Contact points → New contact point
- Type: Slack
- Webhook URL: paste the URL stored in the secret directly. The UI field does not expand variables such as $(SLACK_WEBHOOK_URL); secrets are only injected when contact points are provisioned from files.
5.3 Verify configured alerts
With the port-forward from step 2.1 still active, query Grafana's admin stats API (grafana-cli has no stats command):
curl -s -u admin:'YourSecurePassword123!' http://localhost:3000/api/admin/stats
# Expected result (truncated):
# {"dashboards":5,"alerts":3,...}
"Don't let your knowledge remain theoretical - set up a real Kubernetes environment to solidify your skills." - TealHQ Kubernetes DevOps Guide
Step 6: Optimize Dashboard Performance
6.1 Reduce query cardinality
Bad practice:
# ❌ Explosive cardinality
container_cpu_usage_seconds_total
Good practice:
# ✓ Immediate aggregation
sum by (namespace, pod) (rate(container_cpu_usage_seconds_total[5m]))
6.2 Configure caching
Add these parameters to values.yaml:
grafana.ini:
  dataproxy:
    timeout: 30
    keep_alive_seconds: 30
  caching:
    backend: database
6.3 Define appropriate refresh intervals
| Dashboard type | Recommended interval |
|---|---|
| Real-time view | 10s |
| Normal operations | 30s |
| Historical reports | 5m |
Configure in Dashboard Settings → Time options:
Auto-refresh: 30s
Time range: Last 6 hours
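The refresh interval directly multiplies query load on Prometheus; a quick back-of-the-envelope calculation (illustrative numbers only):

```python
def queries_per_hour(panels: int, refresh_seconds: int) -> int:
    """Each auto-refresh re-runs every panel query once."""
    return panels * (3600 // refresh_seconds)

for interval in (10, 30, 300):
    print(f"{interval:>3}s refresh, 12 panels -> {queries_per_hour(12, interval)} queries/hour")
```

Moving a 12-panel dashboard from a 10s to a 30s refresh cuts the load from 4,320 to 1,440 queries per hour.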
For a complete approach to Kubernetes Monitoring and Troubleshooting, explore our other practical guides.
Troubleshooting Common Issues
Dashboard doesn't load data
Verify Prometheus connectivity:
kubectl exec -it $(kubectl get pods -n monitoring -l app.kubernetes.io/name=grafana -o jsonpath='{.items[0].metadata.name}') \
-n monitoring -- wget -qO- http://prometheus-server.monitoring.svc.cluster.local/api/v1/status/runtimeinfo
# Expected result:
# {"status":"success","data":{...}}
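A probe script can assert on that payload instead of eyeballing it; a minimal sketch (hypothetical helper):

```python
import json

def prometheus_healthy(raw: str) -> bool:
    """Validate the body returned by /api/v1/status/runtimeinfo."""
    try:
        body = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return body.get("status") == "success"

print(prometheus_healthy('{"status":"success","data":{"startTime":"2025-01-01T00:00:00Z"}}'))  # True
print(prometheus_healthy("connection refused"))  # False
```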
PromQL queries too slow
Analyze with:
# Check cardinality
count by (__name__)({__name__=~".+"})
If a metric exceeds 100,000 series, add filters:
# Filter by namespace
sum(rate(container_cpu_usage_seconds_total{namespace=~"prod|staging"}[5m]))
Grafana pod in CrashLoopBackOff
kubectl logs -n monitoring -l app.kubernetes.io/name=grafana --tail=50
# Look for permission or PVC errors
Common solution (warning: deleting the PVC also deletes any dashboards stored in it):
kubectl delete pvc grafana -n monitoring
helm upgrade grafana grafana/grafana -n monitoring --values grafana-values.yaml
Consult the complete guide Debug a pod in CrashLoopBackOff for more complex cases.
Recommended Community Dashboards
Import these dashboards from Grafana.com:
| ID | Name | Usage |
|---|---|---|
| 315 | Kubernetes cluster monitoring | Global view |
| 13332 | kube-state-metrics v2 | K8s object states |
| 6417 | Kubernetes Pods | Pod detail |
| 14205 | Node Exporter Full | System metrics |
Import via the HTTP API (grafana-cli cannot import dashboards; keep the port-forward from step 2.1 active):
curl -s https://grafana.com/api/dashboards/315/revisions/latest/download -o dashboard.json
curl -s -u admin:'YourSecurePassword123!' -X POST http://localhost:3000/api/dashboards/import \
  -H "Content-Type: application/json" \
  -d "{\"dashboard\": $(cat dashboard.json), \"overwrite\": true, \"inputs\": [{\"name\": \"DS_PROMETHEUS\", \"type\": \"datasource\", \"pluginId\": \"prometheus\", \"value\": \"Prometheus\"}]}"
You can also import by ID directly in the UI: Dashboards → New → Import.
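For scripted imports, Grafana's POST /api/dashboards/import endpoint expects a body like the one this sketch builds (DS_PROMETHEUS is the conventional datasource placeholder name; check the dashboard's JSON to confirm what it actually declares):

```python
import json

def build_import_payload(dashboard: dict, datasource: str = "Prometheus") -> str:
    """JSON body for Grafana's dashboard import API."""
    return json.dumps({
        "dashboard": dashboard,
        "overwrite": True,
        "inputs": [{
            "name": "DS_PROMETHEUS",  # must match the input declared by the dashboard
            "type": "datasource",
            "pluginId": "prometheus",
            "value": datasource,      # name or UID of your Prometheus data source
        }],
    })

payload = build_import_payload({"id": None, "title": "Kubernetes cluster monitoring"})
print(json.loads(payload)["overwrite"])  # True
```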
To master the complete observability stack, see our guide Complete guide: install and configure Prometheus on Kubernetes.
Take Action: Get Kubernetes Monitoring Training
Creating effective Grafana dashboards for Kubernetes monitoring is an essential skill for any infrastructure engineer or Cloud-Native developer. With 82% of organizations using Kubernetes in production (CNCF 2025), this expertise positions you in a market where average salary reaches $152,640/year (Ruby On Remote).
"Demand and salaries for highly-skilled and qualified tech talent are fiercer than ever, and certifications present a clear pathway for IT professionals to further their careers." - Hired CTO via Splunk
Recommended trainings:
- LFS458 Kubernetes Administration: Master complete Kubernetes cluster administration (4 days, CKA preparation)
- LFD459 Kubernetes for Application Developers: Develop and deploy containerized applications (3 days, CKAD preparation)
- Kubernetes Fundamentals: Discover essential concepts in one day
Contact our advisors to plan your Kubernetes skills development.