Monitor this! (Kubernetes)

Why should you monitor? (Start with Why)

What should be monitored? (The What)

  1. Cluster and Pod health. Nodes can end up in a NotReady state; the usual culprits are exhausted disk, memory, or vCPUs. Similarly, Pods can get stuck as unschedulable or fail to start (for example, because of missing dependencies in a startup script).
  2. Cluster capacity planning. A set of charts that compare current resource usage (memory, CPU) with the maximum available, broken down by namespace.
  3. Microservice health checks. Includes an uptime SLI and response time history.
  4. Distributed cache. Includes uptime, connected clients, and hits/misses per second (too many misses can point to a cache that is out of step with the database). Since we used distributed locking with Redis, the ratio of locks released to locks acquired can point to a bottleneck in your system.
  5. Resource usage by namespace. CPU and memory usage by the services in a namespace. In our EFK-based logging namespace, this pointed to Elasticsearch as the service using too much memory.
  6. API Gateway. Comparing successful (2xx) with server-error (5xx) responses per service can reveal a faulty service.
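Several of the items above boil down to simple ratios over counters. Here is a minimal sketch of that arithmetic in Python, assuming you already have the raw counts. Every name in it (`ServiceCounters`, `faulty_services`, the 5% threshold) is hypothetical and chosen for illustration; none of it comes from a specific monitoring tool.

```python
from dataclasses import dataclass
from typing import Dict, List


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def cache_hit_ratio(hits: int, misses: int) -> float:
    """Share of cache lookups that were served from the cache."""
    return ratio(hits, hits + misses)


def lock_release_ratio(released: int, acquired: int) -> float:
    """Locks released per lock acquired; a value well below 1.0 means locks pile up."""
    return ratio(released, acquired)


def namespace_utilization(used: float, capacity: float) -> float:
    """Fraction of the available CPU or memory a namespace currently uses."""
    return ratio(used, capacity)


@dataclass
class ServiceCounters:
    # Hypothetical per-service response counts taken from the API gateway.
    successes: int       # successful responses
    server_errors: int   # server-error responses

    def error_ratio(self) -> float:
        return ratio(self.server_errors, self.successes + self.server_errors)


def faulty_services(counters: Dict[str, ServiceCounters],
                    threshold: float = 0.05) -> List[str]:
    """Names of services whose server-error ratio exceeds the (assumed) threshold."""
    return sorted(name for name, c in counters.items()
                  if c.error_ratio() > threshold)
```

In practice you would usually express these ratios in the query language of your monitoring system rather than in application code; the sketch only shows what the dashboards are computing.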

Alerting

Monitoring in Kubernetes (The How)

  1. Exporting.
  2. Collecting.
  3. Querying / Visualizing.
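To make the exporting step concrete, here is a minimal sketch of a service exposing its own metrics over HTTP so that a collector can scrape them. It assumes a Prometheus-style pull model and text exposition format, which this outline does not name, and it uses only the Python standard library. The metric names, the `METRICS` dictionary, and the `serve` helper are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory metrics: (metric name, label pairs) -> current value.
METRICS = {
    ("app_requests_total", (("status", "200"),)): 0,
    ("app_requests_total", (("status", "500"),)): 0,
}


def render(metrics) -> str:
    """Render metrics as text lines of the form: name{label="value"} number."""
    lines = []
    for (name, labels), value in sorted(metrics.items()):
        label_text = ",".join(f'{key}="{val}"' for key, val in labels)
        series = f"{name}{{{label_text}}}" if label_text else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the /metrics path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, answering scrapes on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint; a collector would then poll `/metrics` on an interval and store the samples, and the querying and visualizing step reads from that store. A real service would more likely use an existing client library than hand-roll the format.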

Summary

Bikes, Tea, Sunset, IndieMusic in that order. Software Engineer who fell in love with cloud-native infrastructure.

ADIL RAFIQ
