Monitor this! (Kubernetes)

Why should you monitor? (Start with Why)

What should be monitored? (The What)

  1. Cluster and Pod health. Clusters can end up with nodes in a NotReady state, often because a node has run out of disk, memory, or vCPUs. Similarly, Pods can end up unschedulable or fail to be created (for example, because of missing dependencies in a startup script).
  2. Cluster capacity planning. A set of charts that compare current resource usage (memory, CPU) to the maximum available, broken down by namespace.
  3. Microservices health checks. Includes an uptime SLI and response time history.
  4. Distributed cache. Includes uptime, clients, and hits/misses per second (too many misses can point to a database integrity problem). Since we used distributed locking with Redis, the ratio of locks released to locks acquired can point you to a bottleneck in your system.
  5. Resource usage by namespace. CPU and memory usage by the services in a namespace. In our EFK-based logging namespace, it pointed to Elasticsearch as the culprit using too much memory.
  6. API Gateway. A comparison of successful (20x) vs. server-error (50x) responses by service can point you to a faulty service.
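
To make the API Gateway comparison above concrete, here is a minimal sketch in Python. It assumes you already have gateway request records as (service, HTTP status) pairs. The function names, the input shape, and the 5% threshold are illustrative assumptions, not something a particular gateway or dashboard provides.

```python
from collections import defaultdict


def error_ratio_by_service(requests):
    """Return {service: share of 5xx among 2xx + 5xx responses}.

    `requests` is an iterable of (service, http_status) pairs
    (a hypothetical input shape chosen for this sketch).
    """
    ok = defaultdict(int)
    errors = defaultdict(int)
    for service, status in requests:
        if 200 <= status < 300:
            ok[service] += 1
        elif 500 <= status < 600:
            errors[service] += 1
        # Other status classes (3xx, 4xx) are ignored in this comparison.

    ratios = {}
    for service in set(ok) | set(errors):
        total = ok[service] + errors[service]
        ratios[service] = errors[service] / total
    return ratios


def faulty_services(requests, threshold=0.05):
    """List services whose server-error ratio exceeds the assumed threshold."""
    ratios = error_ratio_by_service(requests)
    return sorted(s for s, r in ratios.items() if r > threshold)


sample = [
    ("cart", 200), ("cart", 500),
    ("auth", 200), ("auth", 201), ("auth", 404),
]
print(faulty_services(sample))
```

The same pattern (two counters and a ratio per key) would also fit the lock released-to-acquired ratio from the distributed cache item, with lock events in place of HTTP statuses.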

Alerting

Monitoring in Kubernetes (The How)

  1. Exporting.
  2. Collecting.
  3. Querying / Visualizing.
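
The three stages above can be sketched as a toy pipeline in plain Python. This is only an illustration of how the stages relate, not the setup of any specific monitoring tool: the `Counter` and `Collector` classes, the one-line "name value" export format, and `rate_per_second` are all assumptions made up for this example.

```python
class Counter:
    """Exporting: a service keeps a monotonically increasing counter."""

    def __init__(self, name):
        self.name = name
        self.value = 0

    def inc(self, amount=1):
        self.value += amount

    def export(self):
        # Hypothetical text format: "<metric name> <current value>"
        return f"{self.name} {self.value}"


class Collector:
    """Collecting: read exported lines and store timestamped samples."""

    def __init__(self):
        self.samples = {}  # metric name -> list of (timestamp, value)

    def scrape(self, exported_line, timestamp):
        name, value = exported_line.split()
        self.samples.setdefault(name, []).append((timestamp, float(value)))


def rate_per_second(collector, name):
    """Querying: average per-second increase between first and last sample."""
    points = collector.samples.get(name, [])
    if len(points) < 2:
        return 0.0
    (t0, v0), (t1, v1) = points[0], points[-1]
    if t1 == t0:
        return 0.0
    return (v1 - v0) / (t1 - t0)


requests_total = Counter("http_requests_total")
collector = Collector()

requests_total.inc(10)
collector.scrape(requests_total.export(), timestamp=0)
requests_total.inc(20)
collector.scrape(requests_total.export(), timestamp=10)

print(rate_per_second(collector, "http_requests_total"))
```

Keeping the exporter as a plain counter and computing rates only at query time means the service itself does not need to know how often it is scraped.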

Summary


Bikes, Tea, Sunset, IndieMusic in that order. Software Engineer who fell in love with cloud-native infrastructure.

ADIL RAFIQ
