How to Monitor Your Kubernetes Cluster with Prometheus and Grafana

A Step-by-Step Guide to Setting Up Grafana and Prometheus

Kubernetes is a popular tool for deploying, scaling, and managing containerized applications across multiple servers. These servers are called nodes, and a group of nodes managed together is called a cluster.

Kubernetes clusters are crucial to the success of containerized applications: if a cluster performs poorly, the applications running on it suffer too. For this reason, it is important to monitor the health and performance of your clusters so that applications run smoothly and problems can be resolved quickly.

This is where Grafana and Prometheus step in. Combined, these tools can help you monitor your Kubernetes cluster effectively.

In this guide, you'll learn how to use Helm to install Prometheus and Grafana on Kubernetes and connect them. You will also learn how to set up a basic Grafana dashboard.

Prerequisites

Before you can begin setting up Prometheus and Grafana for monitoring your Kubernetes cluster, it's important that you have the necessary prerequisites in place:

  1. Kubernetes Cluster: You must have a functioning Kubernetes cluster up and running. If you haven't set up a cluster yet, refer to the official Kubernetes documentation. The commands in this guide that open service URLs assume a local minikube cluster.

  2. kubectl Command-Line Tool: Familiarity with kubectl is essential, as it will be your primary interface for interacting with the Kubernetes cluster. Ensure that kubectl is installed and configured correctly on your local machine.

  3. Helm Package Manager: Helm simplifies the deployment of complex Kubernetes applications, making it an invaluable tool for our setup. Make sure you have Helm installed and configured in your Kubernetes environment. If you haven't installed Helm yet, consult the Helm documentation for instructions.

  4. Basic Knowledge of Kubernetes Concepts: While we will walk through the setup process step by step, having a fundamental understanding of Kubernetes concepts such as Pods, Services, ConfigMaps, and Deployments will greatly enhance your comprehension of the monitoring setup.
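
The tool prerequisites above can be checked with a short shell sketch. The helper name missing_tools is an assumption made for illustration; it relies only on the standard command -v lookup and does not contact the cluster.

```shell
# Print the name of every listed tool that is not found on PATH.
# missing_tools is a hypothetical helper, not part of kubectl or Helm.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools kubectl helm)
if [ -n "$missing" ]; then
  echo "Install these tools before continuing:"
  echo "$missing"
else
  echo "kubectl and helm are both installed."
fi
```

This only confirms that the binaries exist. Whether kubectl can actually reach your cluster is a separate check, for example with kubectl cluster-info.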

What is Prometheus?

Prometheus is an open-source monitoring tool that has become popular for its ability to collect and store metrics from the components of a Kubernetes cluster. Data visualization tools such as Grafana query it to fetch data for their charts and graphs.

Some of the metrics Prometheus collects from Kubernetes clusters:

  • Cluster-level metrics: These metrics provide an overview of the health and performance of the entire cluster, such as the number of nodes, pods, and services, the cluster capacity and usage, the cluster availability and latency, and the cluster error rate.

  • Node-level metrics: These metrics provide information about the individual nodes in the cluster, such as the node name, role, and labels, the node CPU, memory, disk, and network usage, the node uptime and status, and the node events and errors.

  • Pod-level metrics: These metrics provide information about the individual pods in the cluster, such as the pod name, namespace, and labels, the pod CPU, memory, disk, and network usage, the pod status and phase, and the pod events and errors.

  • Container-level metrics: These metrics provide information about the individual containers in the cluster, such as the container name, image, and ID, the container CPU, memory, disk, and network usage, the container restarts and state, and the container logs and errors.

  • Service-level metrics: These metrics provide information about the individual services in the cluster, such as the service name, namespace, and labels, the service endpoints and ports, the service requests and responses, the service latency and error rate, and the service events and errors.

What is Grafana?

Grafana is an open-source visualization tool known for its dashboarding capabilities. It complements Prometheus by providing an insightful visual representation of the collected data.

When connected to Prometheus, Grafana can offer the following benefits:

  • Visualize Prometheus metrics: You can use Grafana to visualize the metrics collected by Prometheus from your applications and infrastructure. You can choose from different types of charts, such as graphs, gauges, heatmaps, and histograms, to display your data in a meaningful way.

  • Query Prometheus data: You can use Grafana to query and manipulate the metrics stored by Prometheus using PromQL, a powerful query language for Prometheus. You can perform various operations, such as filtering, aggregation, computation, and transformation, on your data using PromQL expressions.

  • Create Prometheus alerts: You can use Grafana to create and manage alerts based on Prometheus metrics. You can define alert rules using PromQL expressions, and configure alert notifications to various channels, such as email, Slack, or webhooks. You can also view the alert status and history on Grafana.

  • Import Prometheus dashboards: You can use Grafana to import existing dashboards for Prometheus from the Grafana community or the Grafana Labs website. You can also export and share your own dashboards with others using Grafana.
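
To make the PromQL point above concrete, the sketch below prepares a query for the Prometheus HTTP API using curl. The /api/v1/query endpoint is part of Prometheus; the PROM_URL default, the prom_query helper name, and the choice of the container_cpu_usage_seconds_total metric (available only when the kubelet's cAdvisor endpoint is scraped) are assumptions for illustration.

```shell
# Base URL of a reachable Prometheus server (assumed; adjust to your setup).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Example PromQL: per-namespace CPU usage in cores, averaged over 5 minutes.
CPU_BY_NAMESPACE='sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))'

# prom_query is a hypothetical helper that runs an instant query.
# -G sends the url-encoded data as a GET query string.
prom_query() {
  curl -sS -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

echo "Example query: $CPU_BY_NAMESPACE"
```

Once Prometheus is reachable, running prom_query "$CPU_BY_NAMESPACE" returns a JSON document with one result per namespace. The same expression can be used as the query of a Grafana panel.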

Step 1 - Install Helm

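Install Helm using apt-get. This command assumes that the Helm apt repository has already been added to your system as described in the Helm installation documentation, because the default Debian and Ubuntu repositories may not provide the Helm CLI.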

sudo apt-get install helm

Step 2 - Add Prometheus Helm Chart

To check for the latest version of Prometheus, go to the Artifact Hub repository and search for "Prometheus".

Next, use Helm to add the Prometheus community chart repository.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

Then, update the repo.

helm repo update

Helm now has the latest Prometheus chart information available locally.

Step 3 - Install Prometheus Helm Chart On The Kubernetes Cluster

Now to install the Prometheus Helm Chart on the Kubernetes Cluster, run the following command.

helm install prometheus prometheus-community/prometheus

Once you have installed Prometheus on the Kubernetes cluster, check the deployed Kubernetes resources to verify that Prometheus is running.

kubectl get all
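
kubectl get all lists everything in the namespace. A narrower check is to wait for the server rollout to finish, as sketched below. The sketch assumes the chart created a Deployment named prometheus-server (matching the prometheus-server service used later in this guide); confirm the name in the kubectl get all output. wait_for_prometheus is a hypothetical helper name.

```shell
# Block until the Prometheus server Deployment reports its replicas ready,
# or give up after two minutes.
# Assumption: the Deployment is named prometheus-server.
wait_for_prometheus() {
  kubectl rollout status deployment/prometheus-server --timeout=120s
}
```

Call wait_for_prometheus after the helm install command. A non-zero exit status means the rollout did not complete within the timeout.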

Step 4 - Start and Expose Prometheus Service

After you have successfully installed the Prometheus Helm Chart on Kubernetes, the next step is to launch the Prometheus Kubernetes application.

List the Kubernetes services for Prometheus:

kubectl get service

You will use the prometheus-server service to access the Prometheus application. By default, prometheus-server is reachable only from inside the Kubernetes cluster. To access it from outside the cluster, you need to expose the Kubernetes service, which generates a URL you can open in a browser.

Expose the prometheus-server Kubernetes service.

kubectl expose service prometheus-server --type=NodePort --target-port=9090 --name=prometheus-server-ext

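This creates a new service of type NodePort named prometheus-server-ext that forwards traffic to port 9090 of the Prometheus server. The original prometheus-server service keeps its ClusterIP type.

Get the prometheus-server Kubernetes service URL. This command assumes a minikube cluster.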

minikube service prometheus-server-ext

Wait a few minutes for the URL to be made available. Then access the URL using a browser like Chrome.
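
The minikube service command only works on minikube. On other clusters, one common alternative is kubectl port-forward, sketched below. The Service port 80 is an assumption about the chart's defaults, so check it with kubectl get service prometheus-server first. The helper names are hypothetical; /-/ready is a Prometheus management endpoint.

```shell
# Forward local port 9090 to the prometheus-server Service.
# Assumption: the Service listens on port 80.
forward_prometheus() {
  kubectl port-forward service/prometheus-server 9090:80
}

# Succeeds (exit status 0) once Prometheus answers on the forwarded port.
prometheus_ready() {
  curl -sf http://localhost:9090/-/ready >/dev/null
}
```

Run forward_prometheus in one terminal and leave it running, then call prometheus_ready in another. While the forward is active, the Prometheus UI is available at http://localhost:9090.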

Step 5 - Add Grafana Charts

After you have successfully set up Prometheus, use the following steps to install Grafana on the Kubernetes cluster. To check for the latest version of Grafana, go to the Artifact Hub repository and search for "Grafana".

Next, use Helm to add the Grafana chart repository.

helm repo add grafana https://grafana.github.io/helm-charts

Then update the repo.

helm repo update

Step 6 - Install Grafana Helm Chart On The Kubernetes Cluster

Next, install Grafana on the Kubernetes cluster.

helm install grafana grafana/grafana

Step 7 - Expose the Grafana Kubernetes Service

After you have successfully installed Grafana on the Kubernetes cluster, check that Grafana is running by listing the Kubernetes services:

kubectl get service

Just like the Prometheus Kubernetes service, the Grafana Kubernetes service is of the ClusterIP type, so you need a NodePort service in front of it. This will make Grafana accessible from outside the Kubernetes cluster.

Expose the grafana Kubernetes service.

kubectl expose service grafana --type=NodePort --target-port=3000 --name=grafana-ext

This creates the grafana-ext service, which forwards traffic to port 3000 of Grafana.

Get the Grafana service URL.

minikube service grafana-ext

Wait a few minutes for the URL to be made available. Then access the URL using a browser like Chrome.

The Grafana login page asks for the admin password.

Run the following command to get the admin password.

kubectl get secret --namespace default grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

NOTE: Run this command in a new terminal so that the Grafana tunnel started by minikube service keeps running.
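
The pipe through base64 --decode is needed because Kubernetes stores Secret values base64-encoded. The sketch below shows the decoding step in isolation, using a made-up sample value rather than a real Grafana password.

```shell
# Sample base64 text for illustration only; not a real password.
sample_encoded='YWRtaW4='

# Decode it the same way the kubectl command above decodes the Secret field.
decoded=$(printf '%s' "$sample_encoded" | base64 --decode)
echo "$decoded"   # prints: admin
```

Note that base64 is an encoding, not encryption: anyone who can read the Secret can recover the password.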

Step 8 - Set Up Grafana

After you have the admin password for Grafana, enter admin as the username and then enter the password to access the Grafana homepage.

Next, add your data source, which in this case is Prometheus.

On the Grafana homepage, click on "Add your first data source", then select Prometheus.

In the data source settings for Prometheus, enter the URL where your Prometheus service is running, for example the URL you obtained in Step 4.

Then click on the "Save & test" button to save your changes.
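
The same data source can also be created without the UI through Grafana's HTTP API (POST /api/datasources). The following is a minimal sketch under stated assumptions: Grafana is reachable at GRAFANA_URL, Prometheus is reachable from inside the cluster at the PROMETHEUS_URL shown (release prometheus in the default namespace, Service port 80), and the helper names are hypothetical.

```shell
# Both defaults are assumptions; override them to match your setup.
GRAFANA_URL="${GRAFANA_URL:-http://localhost:3000}"
PROMETHEUS_URL="${PROMETHEUS_URL:-http://prometheus-server.default.svc.cluster.local}"

# Build the JSON body describing a Prometheus data source at URL $1.
datasource_payload() {
  printf '{"name":"Prometheus","type":"prometheus","access":"proxy","url":"%s","isDefault":true}\n' "$1"
}

# Send the payload to Grafana. $1 is the admin password from Step 7.
# --data @- reads the request body from standard input and implies POST.
add_prometheus_datasource() {
  datasource_payload "$PROMETHEUS_URL" |
    curl -sS -u "admin:$1" -H 'Content-Type: application/json' \
      --data @- "$GRAFANA_URL/api/datasources"
}

# Show the body that would be sent.
datasource_payload "$PROMETHEUS_URL"
```

Calling add_prometheus_datasource with the admin password performs the request. Because Grafana runs inside the cluster, the in-cluster Prometheus address is usually a more stable choice than a NodePort URL.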

You have now successfully integrated Prometheus and Grafana. It is now time to set up a Grafana dashboard. You can create one from scratch or import one; this guide imports an existing dashboard.

To import a Grafana dashboard, follow these steps:

  1. Go to the Grafana dashboard library.

  2. Search for Kubernetes, find the "Kubernetes cluster monitoring with Prometheus" dashboard, and copy the dashboard ID.

  3. Go back to the Grafana homepage and select "Dashboards".

  4. Click on the "New" button and select "Import" in the drop-down menu.

  5. Paste the dashboard ID you copied and click on the "Load" button.

  6. Choose a Prometheus data source and click on "Import".

The dashboard then opens with your cluster's metrics.

Deploying Prometheus and Grafana For Cloud-based Clusters

To deploy Prometheus and Grafana to production cloud-based clusters such as Azure Kubernetes Service (AKS) or Google Kubernetes Engine (GKE), you need to do the following:

Step 1 - Create a Persistent Volume and a Persistent Volume Claim for Prometheus.

Persistent Volume resources manage durable storage in a cluster.

Data in a Persistent Volume remains intact even when the Kubernetes application, pods, application containers, or the Kubernetes cluster itself change or terminate.

Create a new file prometheus-persistent-volume.yaml and enter the following code, replacing <NODE_PROMETHEUS_RUNS> with the name of the node that runs Prometheus:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /prometheus-data
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - <NODE_PROMETHEUS_RUNS>

Add the Persistent Volume to the cluster.

kubectl apply -f prometheus-persistent-volume.yaml

This Persistent Volume keeps the Prometheus data between service restarts.

Persistent Volume Claim for Prometheus

A pod requests storage through a Persistent Volume Claim, which binds to a Persistent Volume that provides the actual storage. The claim consumes the capacity of the Persistent Volume it is bound to.

Create a new file prometheus-persistent-volume-claim.yaml and enter the following code:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-pvc
spec:
  storageClassName: local-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Add your Persistent Volume Claim to the Cluster.

kubectl apply -f prometheus-persistent-volume-claim.yaml
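
To confirm that a claim found its volume, you can read the claim's status.phase field, which reads Bound once the claim is matched to a Persistent Volume. The sketch below wraps that lookup in a hypothetical helper named pvc_phase. A claim can stay Pending for a while if the storage class delays binding until a pod uses it.

```shell
# Print the phase (for example Pending or Bound) of the claim named $1
# in the current namespace.
pvc_phase() {
  kubectl get pvc "$1" -o jsonpath='{.status.phase}'
}
```

For the claim created above, call pvc_phase prometheus-pvc. The same helper works for other claims in the namespace.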

Step 2 - Create a Persistent Volume and a Persistent Volume Claim for Grafana.

To create a persistent volume for Grafana, create a file grafana-persistent-volume.yaml and add the following code:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /grafana-data
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - <NODE_GRAFANA_RUNS>

Next, add a Persistent Volume Claim for Grafana. Create a new file grafana-persistent-volume-claim.yaml and add the following code:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
spec:
  storageClassName: local-storage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Add the Persistent Volume and the Persistent Volume Claim to your cluster.

kubectl apply -f grafana-persistent-volume-claim.yaml -f grafana-persistent-volume.yaml

Next, update Prometheus to use the persistent storage.

helm upgrade prometheus prometheus-community/prometheus \
  --set server.persistentVolume.enabled=true \
  --set server.persistentVolume.storageClass=local-storage \
  --set server.persistentVolume.existingClaim=prometheus-pvc

Update Grafana to use the persistent storage as well. The release name must match the one used at install time, which was grafana in Step 6.

helm upgrade grafana grafana/grafana \
  --set persistence.enabled=true \
  --set persistence.storageClassName=local-storage \
  --set persistence.existingClaim=grafana-pvc

Set up RBAC permissions with a ClusterRole before you launch Prometheus and Grafana in production. You then link this ClusterRole to a ServiceAccount using a ClusterRoleBinding object.

Set up the Prometheus RBAC permissions by saving the following manifest to a file and applying it with kubectl apply -f:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
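
After applying the manifest, the binding can be checked with kubectl auth can-i by impersonating the ServiceAccount. The sketch below assumes the prometheus ServiceAccount lives in the default namespace, as in the manifest above; the helper names are hypothetical.

```shell
# Build the user name Kubernetes assigns to a ServiceAccount:
# system:serviceaccount:<namespace>:<name>
sa_user() {
  printf 'system:serviceaccount:%s:%s' "$1" "$2"
}

# Ask the API server whether the prometheus ServiceAccount may list pods
# in every namespace. kubectl prints "yes" or "no".
can_prometheus_list_pods() {
  kubectl auth can-i list pods --all-namespaces \
    --as="$(sa_user default prometheus)"
}
```

A "yes" answer indicates that the ClusterRole and ClusterRoleBinding are in effect. Your own user needs permission to impersonate ServiceAccounts for this check to work.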

Create a Prometheus Kubernetes service to expose the Prometheus application to external access.

Create a file prometheus-service.yaml, and add the following.

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  ports:
  - name: web
    port: 9090
    targetPort: 9090
  selector:
    app.kubernetes.io/name: prometheus
  sessionAffinity: ClientIP

Add the Prometheus service to your Cluster.

kubectl apply -f prometheus-service.yaml

Also create a Grafana service to expose the Grafana application to external access.

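Create a file grafana-service.yaml, and add the following. The manifest assumes that Grafana runs in a namespace named monitoring with pods labeled app: grafana; adjust the namespace and selector to match your installation.

apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitoring
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '3000'
spec:
  selector:
    app: grafana
  type: NodePort
  ports:
    - port: 3000
      # Grafana listens on port 3000 inside the pod.
      targetPort: 3000
      nodePort: 32000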

Add the Grafana service to your cluster.

kubectl apply -f grafana-service.yaml

If you follow these steps successfully, you should be able to run Prometheus and Grafana in production.

Conclusion

This article has provided a step-by-step guide on how to set up Prometheus and Grafana on a Kubernetes cluster and import dashboards for monitoring cluster performance and health.

Prometheus is a powerful time series database that can collect metrics from a variety of sources, including Kubernetes itself. Grafana is a data visualization tool that can be used to create dashboards and alerts based on the metrics collected by Prometheus.

By using Prometheus and Grafana, you can get a comprehensive view of the health and performance of your Kubernetes cluster. This can help you to identify and resolve problems quickly, and to ensure that your applications are running smoothly.

Additional Resources

The official Prometheus documentation

The official Grafana documentation

The Grafana Labs blog

The Grafana Labs community forum