What is Kubernetes and why do we want it?
100Days Resources
- Video by Anais Urlichs
- Add your blog posts, videos etc. related to the topic here!
Learning Resources
Example Notes
Kubernetes builds on Borg and the lessons learned by Google for operating Borg for over 15 years.
Borg was used by Google internally to manage its systems.
For more information, here is the full paper: https://research.google/pubs/pub43438/
What is Kubernetes?
Kubernetes is an open-source container orchestration framework.
At its root, it manages containers, and with them the applications that are made up of containers, across physical machines, virtual machines, and hybrid environments.
According to the kubernetes.io website, Kubernetes is:
"an open-source system for automating deployment, scaling, and management of containerized applications".
What problems does it solve?
- Following the trend from monoliths to microservices: traditionally, an application would be a single monolithic application, which requires the hardware to scale with the application. In comparison, Kubernetes deploys a large number of small web servers.
- Containers are the perfect host for small self-contained applications
- Applications composed of hundreds of containers: managing those with scripts can be really difficult, or even impossible
- Kubernetes helps us with the following: connecting containers across multiple hosts, scaling them, deploying applications without downtime, and service discovery, among several other aspects
The benefit of splitting up your application from a monolith into microservices is that the pieces are easier to maintain. For instance:
Instead of a large Apache web server with many httpd daemons responding to page requests, there would be many nginx servers, each responding to its share of requests.
Additionally, it allows us to separate concerns within our application, i.e. decoupling the architecture based on responsibilities.
Kubernetes is also an essential part of
- Continuous Integration
- Continuous Delivery
Orchestration tools such as Kubernetes offer:
- High availability
- Scalability: applications can scale with demand so that they keep performing well, e.g. maintaining fast load times
- Disaster Recovery: the architecture has to have a way to back up the data and restore the state of the application at any point in time
What does the architecture actually look like?
- You have a master node, which runs several Kubernetes processes that are necessary to run the cluster properly:
    - An API server ⇒ the entry point to the Kubernetes cluster; the UI, API, and CLI all talk to it
    - A Controller Manager ⇒ keeps track of what is happening in the cluster, e.g. it detects when a pod dies ⇒ it detects state changes
    - A Scheduler ⇒ decides pod placement on the worker nodes (more on pods later)
    - An etcd database ⇒ a key-value store that holds the state of the Kubernetes cluster at any point in time
    - And the last thing that is needed: the virtual network that spans across all the nodes in the cluster
- And worker nodes: these contain the containers of the different applications; this is where the actual work happens
Note that worker nodes are usually much bigger because they run the containers, while the master node only runs a selection of processes. Also note that the master node is nowadays referred to as the control plane.
Each node will have multiple pods with containers running on them. Three processes have to be present on every node:
- Container Runtime e.g. Docker
- Kubelet: a process of Kubernetes itself that interacts with both the container runtime and the node; it is responsible for taking our configuration and starting a pod on the node
Usually a Kubernetes cluster has several nodes running in parallel.
- Nodes communicate with each other through services. Thus, the third process that has to be installed on every node is Kube Proxy, which forwards requests between nodes and pods. The communication works in a performant way with low overhead: requests are forwarded to the closest pod.
Kubernetes Concepts — Pods
Pod ⇒ the smallest unit that you, as a Kubernetes user, will configure
- A pod is a wrapper around a container; on each worker node, we will have multiple pods with their containers
- Usually, you would run one application, i.e. one container, per pod
- Each pod has its own IP address; thus, each pod is its own self-contained server. This allows pods to communicate with each other through their internal IP addresses
Note that we don't actually create containers inside the Kubernetes cluster directly; instead, we work with pods, which are an abstraction layer over the containers. Pods manage the containers without our intervention.
If a pod dies, it can be configured to be recreated (as per its restart policy), and the new pod will get a new IP address.
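To make this concrete, here is a minimal pod manifest. It is only a sketch: the name, labels, and image are placeholder assumptions.
```yaml
# pod.yaml ⇒ a minimal example pod (name, labels, and image are placeholders)
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  restartPolicy: Always   # the default: the kubelet restarts the container if it dies
  containers:
    - name: my-app
      image: nginx:1.25   # usually one main container per pod
      ports:
        - containerPort: 80
```
Applying this file (e.g. with kubectl apply -f pod.yaml) sends the configuration to the API server, and the scheduler then places the pod on a worker node.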
A service is used as an alternative to a pod's IP address. As a result, there is a service that sits in front of each pod and abstracts away the dynamic IP address, so the lifecycle of a pod and its IP address are not tied to each other.
If a pod behind the service dies, it gets recreated.
A service has two main functions:
- Providing a stable IP address for the pod(s)
- Acting as a load balancer (what the heck is that 😆 ⇒ more on this later)
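As a sketch, a simple service for the example pod above could look like this; the service name and port numbers are assumptions, and the selector has to match the pod's labels:
```yaml
# service.yaml ⇒ a stable address in front of the example pod
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app        # route traffic to pods carrying this label
  ports:
    - port: 80         # the port the service exposes
      targetPort: 80   # the containerPort on the pod
```
Because the service selects pods by label rather than by IP address, a recreated pod with a new IP is picked up automatically.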
How do we create those components?
- All configuration goes through the master node: the UI, API, and CLI all talk to the API server within the master node and send it their configuration requests
- The configuration is usually in YAML format ⇒ a blueprint for creating pods. The Kubernetes agents convert the YAML to JSON before persisting it to the database.
- The configuration of the requirements is in a declarative format: we declare the desired state, and Kubernetes compares the desired state to the actual state (more on this later)
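To illustrate the declarative idea, here is a sketch of a deployment manifest that wraps the pod from above; the deployment name and the replica count are assumptions. We declare that three replicas should exist, and Kubernetes continuously works to make the actual state match:
```yaml
# deployment.yaml ⇒ declares the desired state; Kubernetes reconciles toward it
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-deployment
spec:
  replicas: 3            # desired state: three pods at all times
  selector:
    matchLabels:
      app: my-app
  template:              # the pod blueprint, as in the pod manifest above
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: nginx:1.25
          ports:
            - containerPort: 80
```
If one of the three pods dies, the actual state no longer matches the declared state, and Kubernetes recreates the pod without our intervention.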
Developer Workflow:
- Create Docker images based on your application
- Use Docker for the containers and Kubernetes to orchestrate them
- A CI pipeline to build, test, and verify Docker images
- "You must be able to perform rolling updates and rollbacks, and eventually tear down the resource when no longer needed." — the course
This requires flexible and easy-to-use network storage.
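As a sketch of the rolling-update part, a deployment's update strategy can be declared like this (the surge and unavailable values are arbitrary assumptions); a rollback then means reverting the deployment to its previous revision:
```yaml
# Excerpt of a deployment spec: a rolling update replaces pods gradually,
# so the application stays available while a new image is rolled out
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod above the desired count
      maxUnavailable: 0    # never take a pod down before its replacement is ready
```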