Understanding Kubernetes Architecture: A Comprehensive Guide

Kubernetes is an open-source platform for managing containerized workloads and services. It facilitates both declarative configuration and automation, and it has a large, rapidly growing ecosystem: Kubernetes services, support options, and tools are widely available.

The name "Kubernetes" comes from the Greek word for helmsman or pilot. The abbreviation "K8s" is derived from the eight letters between the "K" and the "s". Google open-sourced Kubernetes in 2014, drawing on over 15 years of experience in running production workloads at scale and incorporating best practices from the community.

Traditional Deployment Era: Initially, organizations deployed applications on physical servers without defined resource boundaries, leading to allocation issues. For instance, when multiple applications ran on a single server, one could monopolize resources, causing others to underperform. The workaround was to run each application on a separate server, but this approach was inefficient and costly, as it led to underutilized resources and high maintenance expenses for numerous physical servers.

Virtualized Deployment Era: To address the limitations of traditional deployments, virtualization was introduced. Virtualization allows multiple Virtual Machines (VMs) to run on a single physical server's CPU, isolating applications within VMs and enhancing security by preventing one application from accessing another's data. This approach improves resource utilization and scalability, making it easier to add or update applications. It also reduces hardware costs by presenting physical resources as a cluster of disposable virtual machines. Each VM operates as a complete machine, running its own operating system on top of the virtualized hardware.

Container Deployment Era: Containers offer a more lightweight alternative to VMs by sharing the host Operating System (OS) while maintaining isolated environments. Each container has its own filesystem, CPU share, memory, and process space, making them portable across different cloud and OS environments.

Containers provide several benefits:

  • Agile Application Creation and Deployment: Easier and more efficient to create container images compared to VM images.

  • Continuous Development, Integration, and Deployment: Enables reliable and frequent container image builds with quick rollbacks due to image immutability.

  • Dev and Ops Separation of Concerns: Application container images are created at build/release time, decoupling applications from the infrastructure.

  • Observability: Provides detailed OS-level metrics and application health signals.

  • Environmental Consistency: Ensures consistency across development, testing, and production environments, whether on a laptop or in the cloud.

  • Cloud and OS Distribution Portability: Runs on various OS distributions and cloud environments, including Ubuntu, RHEL, CoreOS, on-premises, and public clouds.

  • Application-Centric Management: Shifts focus from running an OS on virtual hardware to running an application on an OS using logical resources.

  • Microservices Architecture: Supports breaking applications into smaller, independent services that can be managed and deployed dynamically.

  • Resource Isolation: Ensures predictable application performance.

  • Resource Utilization: Achieves high efficiency and density in resource use.

Why Kubernetes?

Why You Need Kubernetes and What It Can Do

Containers are an effective way to bundle and run applications, but managing them in a production environment can be challenging. Ensuring there is no downtime and handling container failures automatically are critical tasks that need a robust system. This is where Kubernetes excels. It provides a framework for running distributed systems resiliently, handling tasks such as scaling, failover, and deployment patterns.

What Kubernetes Provides:

  • Service Discovery and Load Balancing: Kubernetes can expose containers using DNS names or their own IP addresses. It balances network traffic to maintain stable deployments.

  • Storage Orchestration: Automatically mount storage systems of your choice, such as local storage or public cloud providers.

  • Automated Rollouts and Rollbacks: Describe the desired state of your deployed containers, and Kubernetes changes the actual state to the desired state at a controlled rate: for example, it can create new containers, remove existing containers, and adopt their resources into the new ones.

  • Automatic Bin Packing: Allocate resources efficiently by specifying CPU and memory needs for containers. Kubernetes fits containers onto nodes for optimal resource usage.

  • Self-Healing: Restarts containers that fail, replaces containers, kills containers that do not respond to health checks, and does not advertise them to clients until they are ready.

  • Secret and Configuration Management: Securely store and manage sensitive information like passwords and tokens. Deploy and update secrets and configurations without rebuilding container images or exposing secrets.

  • Batch Execution: Manage batch and CI workloads, with the ability to replace failing containers.

  • Horizontal Scaling: Easily scale your application up or down via command, UI, or automatically based on CPU usage.

  • IPv4/IPv6 Dual-Stack: Allocate both IPv4 and IPv6 addresses to Pods and Services.

  • Designed for Extensibility: Add features to your Kubernetes cluster without altering the upstream source code.
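Several of the capabilities above, such as desired-state rollouts, bin packing via resource requests, and self-healing via health checks, come together in a single Deployment manifest. The following is a minimal sketch; the names, image, and resource figures are illustrative, not taken from the text:

```yaml
# Hypothetical Deployment illustrating desired state, resource requests,
# rolling updates, and self-healing probes. All names and values are examples.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                      # desired state: keep 3 Pods running
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate            # automated rollout at a controlled rate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25          # example image
        resources:
          requests:                # input to automatic bin packing
            cpu: 250m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
        readinessProbe:            # only ready containers receive traffic
          httpGet:
            path: /
            port: 80
        livenessProbe:             # containers that fail this are restarted
          httpGet:
            path: /
            port: 80
```

If a rollout of a new image goes wrong, `kubectl rollout undo deployment/web` reverts to the previous revision, which is what makes the quick rollbacks mentioned above possible.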

Kubernetes simplifies the complex task of managing containerized applications, ensuring high availability, scalability, and efficient resource utilization.

Kubernetes Components:

When you deploy Kubernetes, you create a cluster consisting of worker machines called nodes that run containerized applications. Each cluster has at least one worker node. The worker nodes host Pods, which are components of the application workload. The control plane manages the worker nodes and Pods. In production, the control plane runs across multiple computers, and the cluster typically has multiple nodes, ensuring fault-tolerance and high availability.

This document outlines the essential components for a functional Kubernetes cluster.

Control Plane Components:

The control plane's components make global decisions about the cluster (for example, scheduling) and detect and respond to cluster events (for example, starting a new pod when a workload's replicas field is unsatisfied). These components can run on any machine in the cluster, but setup scripts typically start them all on the same machine and avoid running user containers there. For high availability, the control plane can instead run across multiple machines.

kube-apiserver

The API server is a core component of the Kubernetes control plane that exposes the Kubernetes API, acting as the front end for the control plane. The main implementation is kube-apiserver, which is designed to scale horizontally by deploying multiple instances and balancing traffic among them.

etcd

etcd is a consistent and highly available key-value store used as the backing store for all Kubernetes cluster data. It is crucial to have a backup plan for the data stored in etcd.

kube-scheduler

The kube-scheduler is a control plane component that watches for newly created Pods without an assigned node and selects a node for them to run on. It considers factors such as resource requirements, hardware/software constraints, affinity/anti-affinity rules, data locality, inter-workload interference, and deadlines when making scheduling decisions.
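Two of these scheduling inputs, resource requirements and affinity rules, are expressed directly in the Pod spec. A hypothetical sketch (the `disktype` label and the resource figures are made-up examples):

```yaml
# Hypothetical Pod showing two inputs the kube-scheduler weighs:
# resource requests and node affinity. Label keys/values are examples.
apiVersion: v1
kind: Pod
metadata:
  name: fast-storage-task
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard constraint
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]
  containers:
  - name: worker
    image: busybox:1.36
    command: ["sleep", "3600"]
    resources:
      requests:            # the chosen node must have this much free capacity
        cpu: "1"
        memory: 512Mi
```

The scheduler only assigns this Pod to a node that both carries the `disktype=ssd` label and has at least the requested CPU and memory unallocated.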

kube-controller-manager

The kube-controller-manager is a control plane component that runs various controller processes. Although each controller operates as a separate process logically, they are compiled into a single binary and run together to simplify management.

Examples of Controllers:

  • Node Controller: Detects and responds to node failures.

  • Job Controller: Manages Job objects, creating Pods to complete one-off tasks.

  • EndpointSlice Controller: Creates EndpointSlice objects to link Services and Pods.

  • ServiceAccount Controller: Generates default ServiceAccounts for new namespaces.
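As an illustration of one of these control loops, a Job object is a declarative request that the Job controller fulfils by creating Pods until the task completes. A minimal sketch; the name, image, and command are illustrative:

```yaml
# Hypothetical Job: the Job controller creates Pods to run this one-off
# task and retries on failure up to backoffLimit times.
apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task
spec:
  completions: 1           # the task must succeed once
  backoffLimit: 4          # retry failed Pods up to 4 times
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: task
        image: busybox:1.36
        command: ["sh", "-c", "echo done"]
```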

cloud-controller-manager

The cloud-controller-manager is a Kubernetes control plane component that handles cloud-specific control logic. It integrates your cluster with your cloud provider's API and separates cloud-related components from those that interact solely with the Kubernetes cluster.

It runs controllers specific to your cloud provider and is not used in on-premises or local learning environments. Similar to the kube-controller-manager, it combines multiple control loops into a single binary, which can be scaled horizontally to enhance performance and fault tolerance.

Cloud-Dependent Controllers:

  • Node Controller: Verifies if a node has been deleted from the cloud provider when it stops responding.

  • Route Controller: Manages route setup in the cloud infrastructure.

  • Service Controller: Handles the creation, updating, and deletion of cloud provider load balancers.
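The service controller is triggered by ordinary Service objects. On a cloud provider, a Service of type LoadBalancer like the hypothetical one below causes the cloud-controller-manager to provision an external load balancer; the name, selector, and ports are examples:

```yaml
# Hypothetical LoadBalancer Service. On a cloud provider, the service
# controller in cloud-controller-manager provisions the external
# load balancer and wires it to the matching Pods.
apiVersion: v1
kind: Service
metadata:
  name: web-lb
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
```

In an on-premises cluster without a cloud-controller-manager, the same object stays pending, since no controller exists to create the load balancer.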

Node Components:

Node components run on every node in a Kubernetes cluster, maintaining the running Pods and providing the Kubernetes runtime environment.

kubelet

The kubelet is an agent running on each node in the Kubernetes cluster. It ensures that containers specified in PodSpecs are running and healthy. The kubelet only manages containers created by Kubernetes and does not handle containers outside of its control.
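The "PodSpec" the kubelet acts on is the `spec` section of a Pod. A minimal hypothetical example (image and probe values are illustrative): once this Pod is assigned to a node, that node's kubelet starts the container, and restarts it if the liveness check fails.

```yaml
# Hypothetical Pod. The kubelet on the assigned node starts the
# container and restarts it whenever the liveness probe fails.
apiVersion: v1
kind: Pod
metadata:
  name: healthy-app
spec:
  restartPolicy: Always    # kubelet keeps the container running
  containers:
  - name: app
    image: nginx:1.25      # example image
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
```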

kube-proxy

kube-proxy is a network proxy running on each node in the Kubernetes cluster. It implements part of the Kubernetes Service concept by maintaining network rules that enable communication to Pods from both within and outside the cluster. kube-proxy utilizes the operating system's packet filtering layer if available; otherwise, it handles traffic forwarding itself.
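The Service abstraction that kube-proxy implements can be sketched as follows; the name, selector, and ports are made-up examples. kube-proxy on every node programs packet-filtering rules so that traffic sent to this Service's cluster IP is forwarded to one of the matching Pods:

```yaml
# Hypothetical ClusterIP Service. kube-proxy maintains the network rules
# that route traffic for this Service's virtual IP to matching Pods.
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  type: ClusterIP
  selector:
    app: backend           # traffic goes to Pods with this label
  ports:
  - port: 80               # the Service's port
    targetPort: 8080       # the Pods' container port
```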

Container Runtime

The container runtime is the software responsible for running containers. It pulls container images from a registry, unpacks them, and runs the applications inside. Kubernetes supports several container runtimes, such as containerd and CRI-O, along with any other implementation of the Kubernetes Container Runtime Interface (CRI). The runtime ensures that containers are isolated and have the necessary resources to operate efficiently.