Key Takeaways
- ✓ 67% of organizations use Prometheus in production for their Kubernetes clusters (Grafana Labs 2025)
- ✓ Complete deployment via Helm, with metrics scraping and configurable alerts, in 45 minutes
Are you running a production Kubernetes cluster and in need of robust observability? This complete guide walks you step by step through installing and configuring Prometheus on Kubernetes in 2026. According to the Grafana Labs 2025 Observability Survey, 67% of organizations use Prometheus in production for their Kubernetes clusters.
TL;DR: This guide shows you how to deploy Prometheus via Helm, configure metrics scraping, create your first alerts, and verify your installation. Estimated time: 45 minutes.
To master these skills in depth, discover the LFS458 Kubernetes Administration training.
What is Prometheus and why use it on Kubernetes?
Prometheus is an open-source monitoring and alerting system designed for cloud-native environments. Initially developed by SoundCloud, it has been part of the Cloud Native Computing Foundation since 2016.
Prometheus collects metrics in pull mode via HTTP. You configure endpoints that Prometheus queries at regular intervals. This approach simplifies your network architecture.
Key takeaway: Prometheus stores metrics as time series identified by a name and key-value labels.
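In practice, a time series looks like this: a metric name, a set of labels, and the current sample value (a hypothetical HTTP counter for illustration):

http_requests_total{method="GET", status="200", pod="my-app-7d4b9"} 1027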
Why is Prometheus ideal for Kubernetes?
Kubernetes natively exposes metrics in Prometheus format. Your components (kubelet, kube-apiserver, etcd) already provide /metrics endpoints. You just need to scrape them.
| Component | Default port | Key metrics |
|---|---|---|
| kubelet | 10250 | Pod, container resources |
| kube-apiserver | 6443 | Request latency, errors |
| etcd | 2379 | Cluster health, latency |
| kube-scheduler | 10259 | Pending pods |
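You can see these endpoints for yourself by fetching kubelet metrics through the API server proxy (replace the node name with one from your cluster):

kubectl get nodes
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics" | head
# Raw Prometheus-format metrics, one per line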
As Björn Rabenstein, long-time Prometheus core developer, explains: "Prometheus was designed from the start for dynamic environments where instances appear and disappear constantly" (PromCon 2023).
Prerequisites before installation
Before starting your installation, verify that you have the following in place. Checking now will save you trouble during deployment.
Technical prerequisites
Your cluster must meet these minimum criteria:
- Kubernetes 1.28+ (recommended version in 2026)
- kubectl configured and functional
- Helm 3.14+ installed on your workstation
- Persistent storage: at least 50 Gi available
- RBAC enabled on your cluster
Check your Kubernetes version:
kubectl version
# Expected result: Client Version: v1.29.x / Server Version: v1.29.x
# Note: the --short flag was removed in kubectl 1.28; the compact output is now the default
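You can check the remaining prerequisites the same way:

helm version --short
# Expected: v3.14.x or newer
kubectl get storageclass
# At least one StorageClass (ideally marked default) must be available for persistent volumes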
Skill prerequisites
You must master fundamental Kubernetes concepts. Consult our guide Containerization and Docker best practices if you're starting out.
For Kubernetes infrastructure engineers preparing for CKA, this installation is part of evaluated skills. The LFS458 training covers this topic in detail.
Key takeaway: Plan 2 CPU and 4 Gi RAM minimum for Prometheus in a test environment.
Step 1: Prepare your namespace and RBAC permissions
You'll create a dedicated monitoring namespace. This isolation simplifies access control and maintenance.
Create monitoring namespace
Execute this command to create your namespace:
kubectl create namespace monitoring
# Result: namespace/monitoring created
Verify creation:
kubectl get namespaces | grep monitoring
# Result: monitoring Active 5s
Configure ServiceAccount and ClusterRoles
Prometheus needs permissions to discover and scrape pods. Create the file prometheus-rbac.yaml:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups: [""]
    resources: ["nodes", "pods", "services", "endpoints"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: monitoring
Apply this configuration:
kubectl apply -f prometheus-rbac.yaml
# Result: serviceaccount/prometheus created
# clusterrole.rbac.authorization.k8s.io/prometheus created
# clusterrolebinding.rbac.authorization.k8s.io/prometheus created
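As an optional sanity check, you can impersonate the ServiceAccount to confirm the binding took effect:

kubectl auth can-i list pods --as=system:serviceaccount:monitoring:prometheus
# Expected: yes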
Step 2: Install Prometheus with Helm
Helm significantly simplifies the installation: you get curated default configurations and straightforward upgrades.
Add Helm repository
Add the prometheus-community repository to your Helm configuration:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Result: "prometheus-community" has been added to your repositories
# Update Complete. Happy Helming!
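If you want to see which chart versions are available before pinning one in the next step:

helm search repo prometheus-community/kube-prometheus-stack --versions | head
# Lists recent chart versions along with the bundled app versions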
Create your custom values.yaml file
Create a prometheus-values.yaml file adapted to your environment:
# prometheus-values.yaml - 2026 Configuration
prometheus:
  prometheusSpec:
    retention: 15d
    retentionSize: "45GB"
    resources:
      requests:
        memory: "2Gi"
        cpu: "500m"
      limits:
        memory: "4Gi"
        cpu: "1000m"
    storageSpec:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 50Gi
    serviceMonitorSelectorNilUsesHelmValues: false
    podMonitorSelectorNilUsesHelmValues: false
alertmanager:
  enabled: true
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
grafana:
  enabled: true
  adminPassword: "your-secure-password"
This configuration lets you customize data retention, allocated resources, and persistent storage.
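Before installing, you can render the chart locally to catch mistakes in your values file; a quick sketch (the rendered manifests are discarded here):

helm template prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --values prometheus-values.yaml > /dev/null && echo "values OK"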
Launch installation
Deploy the complete stack with this command:
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--values prometheus-values.yaml \
--version 58.0.0
Installation takes 2 to 5 minutes. Monitor progress:
kubectl get pods -n monitoring -w
Expected result after deployment:
NAME READY STATUS RESTARTS AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0 2/2 Running 0 2m
prometheus-grafana-7c9bc466d5-x2p4q 3/3 Running 0 2m
prometheus-kube-prometheus-operator-5c4b9c8f55-kx8nj 1/1 Running 0 2m
prometheus-prometheus-kube-prometheus-prometheus-0 2/2 Running 0 2m
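You can also confirm that the operator registered its custom resource definitions, which the next steps rely on:

kubectl get crds | grep monitoring.coreos.com
# Expected: alertmanagers, prometheuses, prometheusrules, servicemonitors, ...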
Key takeaway: Version 58.0.0 of the kube-prometheus-stack chart includes Prometheus 2.51, Grafana 10.4, and Alertmanager 0.27.
To go further with production deployment, consult our guide Deploy the complete kube-prometheus stack in production.
Step 3: Configure metrics scraping
Your Prometheus is installed. Now, configure it to collect metrics from your applications.
Understanding ServiceMonitors
A ServiceMonitor is a custom resource (CRD) that tells Prometheus how to discover and scrape your services, so you never have to edit the Prometheus configuration by hand.
Create this file app-servicemonitor.yaml to monitor your application:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app-monitor
  namespace: monitoring
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app: my-app
  namespaceSelector:
    matchNames:
      - default
      - production
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
      scrapeTimeout: 10s
Apply this configuration:
kubectl apply -f app-servicemonitor.yaml
# Result: servicemonitor.monitoring.coreos.com/my-app-monitor created
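For this ServiceMonitor to match anything, the target Service must carry the app: my-app label and expose a port literally named metrics. A hypothetical example of a matching Service:

apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
  labels:
    app: my-app          # matched by the ServiceMonitor selector
spec:
  selector:
    app: my-app
  ports:
    - name: metrics      # must match the endpoint port name above
      port: 8080
      targetPort: 8080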
Verify discovered targets
Access the Prometheus interface to verify your targets:
kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090
Open http://localhost:9090/targets in your browser. You should see your endpoints with "UP" status.
Configure annotations for auto-discovery
You can also use Kubernetes annotations. Add these annotations to your pods:
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
This method suits simple deployments, but note that kube-prometheus-stack does not honor these annotations out of the box. For complex environments, prefer ServiceMonitors.
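If you do want annotation-based discovery with kube-prometheus-stack, a sketch of the extra scrape job you would add to your values file (the relabeling rules below are the commonly used pattern; adapt them to your setup):

prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # Keep only pods annotated prometheus.io/scrape: "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: "true"
          # Honor a custom metrics path from prometheus.io/path
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          # Honor a custom port from prometheus.io/port
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__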
Step 4: Create your first alerting rules
Monitoring without alerting is incomplete. Configure alerts to be notified of problems.
Create a PrometheusRule
Create the file alerting-rules.yaml:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kubernetes-alerts
  namespace: monitoring
  labels:
    release: prometheus
spec:
  groups:
    - name: kubernetes.rules
      rules:
        - alert: PodCrashLooping
          expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is frequently restarting"
            description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is restarting at a rate of {{ $value }} restarts/second over the last 15 minutes."
        - alert: HighMemoryUsage
          # The > 0 filter skips containers without a memory limit
          # (otherwise the division would return +Inf)
          expr: (container_memory_usage_bytes / (container_spec_memory_limit_bytes > 0)) > 0.9
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Critical memory usage"
            description: "Container {{ $labels.container }} is using more than 90% of its memory limit."
Apply these rules:
kubectl apply -f alerting-rules.yaml
# Result: prometheusrule.monitoring.coreos.com/kubernetes-alerts created
Verify that your rules are loaded:
kubectl get prometheusrules -n monitoring
# Result: NAME AGE
# kubernetes-alerts 10s
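With the port-forward from Step 3 still running, you can also confirm that Prometheus itself loaded the rule group:

curl -s http://localhost:9090/api/v1/rules | grep -c PodCrashLooping
# Expected: 1 (the alert is present in the loaded rules)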
According to the Dynatrace 2024 observability report, teams with well-configured alerts reduce their MTTR by 43%.
Step 5: Configure Alertmanager for notifications
Alertmanager handles routing your alerts. Configure it to receive notifications via email, Slack, or PagerDuty.
Create a Secret for Alertmanager configuration
Create the file alertmanager-config.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-prometheus-kube-prometheus-alertmanager
  namespace: monitoring
type: Opaque
stringData:
  alertmanager.yaml: |
    global:
      resolve_timeout: 5m
      slack_api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL'
    route:
      group_by: ['alertname', 'severity']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      receiver: 'slack-notifications'
      routes:
        # matchers replaces the deprecated match syntax
        - matchers:
            - severity = "critical"
          receiver: 'slack-critical'
    receivers:
      - name: 'slack-notifications'
        slack_configs:
          - channel: '#monitoring'
            send_resolved: true
      - name: 'slack-critical'
        slack_configs:
          - channel: '#alerts-critical'
            send_resolved: true
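Before applying, you can lint the embedded configuration with amtool, which ships with Alertmanager releases (save the alertmanager.yaml content to a local file first):

amtool check-config alertmanager.yaml
# Should report SUCCESS and list the receivers found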
Apply this configuration:
kubectl apply -f alertmanager-config.yaml
# Result: secret/alertmanager-prometheus-kube-prometheus-alertmanager configured
The config-reloader sidecar normally picks up the updated Secret within a minute or two; to force an immediate reload, restart Alertmanager:
kubectl rollout restart statefulset alertmanager-prometheus-kube-prometheus-alertmanager -n monitoring
To create visualization dashboards, consult our guide Create performant Grafana dashboards for Kubernetes monitoring.
Step 6: Verify your installation
Your stack is deployed. Verify that everything works correctly before going to production.
Test interface access
Create port-forwards to access interfaces:
# Prometheus
kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090 &
# Grafana
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80 &
# Alertmanager
kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-alertmanager 9093:9093 &
Execute a test PromQL query
Test your installation with this query in the Prometheus interface:
up{job="kubelet"}
You should get a value of 1 for each kubelet in your cluster.
Verify Kubernetes component metrics
Execute these queries to validate collection:
# Number of pods per namespace
count by (namespace) (kube_pod_info)
# Node CPU usage
node_cpu_seconds_total{mode="idle"}
# API server metrics
apiserver_request_total
Key takeaway: If a metric is missing, first check the targets in Prometheus, then the logs of the component in question.
Troubleshooting common problems
Running into difficulties? Here are solutions to the most frequent problems.
Prometheus pods don't start
Check pod events:
kubectl describe pod prometheus-prometheus-kube-prometheus-prometheus-0 -n monitoring
Frequent causes:
- PVC not created: check your StorageClass
- Insufficient resources: increase requests/limits
- Incorrect RBAC: check ClusterRoleBindings
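A few commands that cover these checks:

kubectl get pvc -n monitoring
# A PVC stuck in Pending points to a StorageClass problem
kubectl get events -n monitoring --sort-by=.lastTimestamp | tail
# Recent events often reveal scheduling or volume-mounting failures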
Targets appear as "DOWN"
Check network connectivity from inside the Prometheus pod. Even a 401 or 403 response proves the network path works; a timeout points to DNS or NetworkPolicy issues:
kubectl exec -n monitoring -c prometheus prometheus-prometheus-kube-prometheus-prometheus-0 -- \
  wget -qO- --no-check-certificate https://kubernetes.default.svc/metrics
Consult our hub Kubernetes monitoring and troubleshooting for in-depth troubleshooting guides.
Alerts are not sent
Check Alertmanager configuration:
kubectl logs -n monitoring alertmanager-prometheus-kube-prometheus-alertmanager-0
Test manually by posting a synthetic alert to the v2 API (the v1 endpoint was removed in Alertmanager 0.27); keep the Alertmanager port-forward from Step 6 running:
curl -X POST http://localhost:9093/api/v2/alerts \
  -H "Content-Type: application/json" \
  -d '[{"labels":{"alertname":"Test","severity":"warning"}}]'
Best practices for production
For a robust production deployment, apply these recommendations from field experience.
Sizing and retention
| Environment | Retention | Storage | RAM |
|---|---|---|---|
| Dev/Test | 7 days | 20 Gi | 2 Gi |
| Staging | 15 days | 50 Gi | 4 Gi |
| Production | 30 days | 200 Gi | 8 Gi |
High availability
For critical environments, deploy Prometheus in HA mode. Each replica scrapes targets independently; deduplication then happens at the query layer (for example with Thanos Querier):
prometheus:
  prometheusSpec:
    replicas: 2
    replicaExternalLabelName: "__replica__"
Cloud operations Kubernetes engineers preparing for CKA certification must master these configurations. The LFS458 training covers these advanced aspects.
Conclusion and next steps
You now have a functional Prometheus stack on your Kubernetes cluster. This guide has taken you all the way from installation to alert configuration.
Step summary
- Namespace and RBAC preparation
- Installation via Helm
- Scraping configuration
- Alerting rules creation
- Notification configuration
- Verification and tests
Explore our Kubernetes Training thematic map to discover all related topics. For an overview, consult the Complete Kubernetes Training Guide.
Key takeaway: Effective monitoring combines metrics, logs, and traces. Prometheus is the foundation on which you build your complete observability.
Take action: train with SFEIR Institute
Want to deepen your Kubernetes monitoring skills? SFEIR Institute offers certification training adapted to your level.
Recommended training:
- LFS458 Kubernetes Administration: 4 days to master cluster administration, including monitoring and observability. Prepares for CKA certification.
- LFD459 Kubernetes for developers: 3 days to integrate observability best practices into your cloud-native applications. Prepares for CKAD.
- Kubernetes fundamentals: 1 day to discover essential concepts, including an introduction to monitoring.
Contact our advisors to build your personalized training path and explore OPCO funding possibilities.