Storage in Kubernetes

Containerized applications often need data to outlive individual containers. Kubernetes provides a storage subsystem that abstracts the underlying storage infrastructure behind a consistent API for provisioning and consuming persistent storage.

Persistent Volumes and Persistent Volume Claims

Kubernetes uses two complementary resources to manage storage: PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs). This abstraction separates the concerns of storage administration from storage consumption.

PersistentVolumes (PVs)

A PersistentVolume is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a cluster resource, just like a node.

Like regular Volumes, PVs are implemented by volume plugins, but their lifecycle is independent of any individual Pod that uses them. They can be backed by various storage technologies, including:

  • NFS
  • iSCSI
  • Cloud provider storage (AWS EBS, GCE Persistent Disk, Azure Disk)
  • Local storage
  • And many more

Example PersistentVolume Definition

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs
  labels:
    type: nfs
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: nfs-server.example.com
    path: "/exports"
    

PersistentVolumeClaims (PVCs)

A PersistentVolumeClaim is a request for storage by a user. It is similar to a Pod: Pods consume node resources, while PVCs consume PV resources.

Users request specific sizes and access modes without needing to know the details about the underlying storage infrastructure.

Example PersistentVolumeClaim Definition

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: fast
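
For comparison, a claim meant to bind to a pre-provisioned volume such as pv-nfs from the earlier example might look like the sketch below. The claim name shared-content is hypothetical; the storage class name, access mode, and label are chosen to match what that PV declares. A claim binds only to a PV with the same class and a compatible access mode, and whose capacity is at least the requested size.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-content        # hypothetical name, not from the original examples
spec:
  storageClassName: nfs       # must equal the PV's storageClassName
  accessModes:
    - ReadWriteMany           # must be one of the modes the PV offers
  resources:
    requests:
      storage: 10Gi           # the PV's capacity must be at least this much
  selector:
    matchLabels:
      type: nfs               # only consider PVs carrying this label
```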
    

Using PVCs in Pods

Once a PVC exists, a Pod in the same namespace can mount it as a volume by referencing the claim name:

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: my-pvc
    

Storage Classes and Dynamic Provisioning

Manually creating PVs for each storage need can be tedious. Storage Classes enable dynamic provisioning of storage volumes, allowing storage to be created on-demand.

StorageClass Resource

A StorageClass provides a way to describe the "classes" of storage available. Different classes might map to quality-of-service levels, backup policies, or arbitrary policies determined by the cluster administrators.

Example StorageClass Definition

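apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
# AWS EBS CSI driver; it replaces the deprecated in-tree
# kubernetes.io/aws-ebs provisioner.
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "10000"        # provisioned IOPS (gp3 limits this relative to volume size)
  throughput: "500"    # MiB/s
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod is scheduled
reclaimPolicy: Delete
allowVolumeExpansion: true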
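
Because this class sets allowVolumeExpansion: true, a bound claim that uses it can later be grown by raising its storage request. The sketch below assumes the my-pvc claim from the earlier example and a driver that supports expansion; the 10Gi figure is only illustrative. Requests can be increased but not decreased.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi           # raised from the original 5Gi request
  storageClassName: fast
```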
    

Dynamic Provisioning

When a user creates a PVC that references a StorageClass, the appropriate provisioner automatically creates a PV that satisfies the claim. This eliminates the need for administrators to pre-provision storage.

The dynamic provisioning process:

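  1. User creates a PVC that names a specific StorageClass
  2. Kubernetes detects the PVC and triggers the provisioner (immediately, or once a Pod using the claim is scheduled if the class uses WaitForFirstConsumer)
  3. The provisioner creates the actual storage resource
  4. A PV representing that resource is automatically created and bound to the PVC
  5. A Pod that references the PVC can now mount the volume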

Access Modes

PVs and PVCs support different access patterns:

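  • ReadWriteOnce (RWO): Read-write by a single node; multiple Pods on that node can still share the volume
  • ReadOnlyMany (ROX): Read-only by many nodes
  • ReadWriteMany (RWX): Read-write by many nodes
  • ReadWriteOncePod (RWOP): Read-write by a single Pod in the whole cluster (introduced in Kubernetes 1.22, stable since 1.29, CSI volumes only)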
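
The access mode is requested on the claim. As a minimal sketch, the claim below asks for single-Pod access; the name single-writer is hypothetical, and it assumes the fast class is backed by a CSI driver, since ReadWriteOncePod is only available for CSI volumes.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer         # hypothetical name
spec:
  accessModes:
    - ReadWriteOncePod        # only one Pod in the cluster may use this claim
  resources:
    requests:
      storage: 5Gi
  storageClassName: fast      # assumed to be served by a CSI driver
```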

Reclaim Policies

When a user is done with a volume, they can delete the PVC. The reclaim policy tells the cluster what to do with the PV after its claim is released:

  • Retain: The PV and its data are kept; an administrator must reclaim the volume manually
  • Recycle: Basic scrub and make available again (deprecated in favor of dynamic provisioning)
  • Delete: The PV and its associated storage asset are deleted
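
Dynamically provisioned PVs inherit the reclaim policy of their StorageClass, which defaults to Delete. A class for data that should survive claim deletion might look like the sketch below; the name fast-retain is hypothetical, and the provisioner assumes the AWS EBS CSI driver is installed in the cluster.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-retain           # hypothetical name
provisioner: ebs.csi.aws.com  # assumption: AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Retain         # keep the PV and its data after the PVC is deleted
volumeBindingMode: WaitForFirstConsumer
```

With Retain, a released PV is not offered to new claims until an administrator cleans it up or deletes it.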

Volume Modes

Kubernetes supports two volume modes:

  • Filesystem: Volume is formatted and mounted as a directory (default)
  • Block: Volume is used as a raw block device
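
A raw block volume is requested with volumeMode: Block on the claim and attached to a container through volumeDevices rather than volumeMounts. The following is a sketch under assumptions: the names raw-block-pvc and block-consumer, the busybox image, and the device path are all illustrative, and the storage backend must support block mode.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-pvc         # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block           # request a raw device instead of a filesystem
  resources:
    requests:
      storage: 5Gi
  storageClassName: fast
---
apiVersion: v1
kind: Pod
metadata:
  name: block-consumer        # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeDevices:          # block volumes use volumeDevices, not volumeMounts
        - name: data
          devicePath: /dev/xvda   # illustrative path where the device appears
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: raw-block-pvc
```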

Common Use Cases

Stateful Applications

Databases (MySQL, PostgreSQL, MongoDB) and other stateful applications that require persistent storage.
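
Stateful workloads like these are commonly run as a StatefulSet, whose volumeClaimTemplates create one PVC per replica. The sketch below is illustrative only: the names db and db-credentials, the postgres:16 image, and the sizes are assumptions, and it presumes a headless Service named db and a Secret holding the password already exist.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                    # hypothetical name
spec:
  serviceName: db             # assumes a headless Service named "db" exists
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-credentials   # hypothetical Secret
                  key: password
            - name: PGDATA               # use a subdirectory of the mount
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:       # one PVC is created per replica from this template
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: fast
        resources:
          requests:
            storage: 5Gi
```

Each replica gets its own claim, named data-db-0, data-db-1, and so on, and keeps that claim across rescheduling.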

Shared Storage

Applications that need to share data between multiple Pods (e.g., content management systems).
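
As a sketch of this pattern, the Deployment below runs two replicas that all mount the same claim. It assumes a claim named shared-content (a hypothetical name) already exists and was bound with the ReadWriteMany access mode, for example against an NFS-backed PV; the cms name and mount path are likewise illustrative.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cms                   # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: cms
  template:
    metadata:
      labels:
        app: cms
    spec:
      containers:
        - name: web
          image: nginx
          volumeMounts:
            - name: content
              mountPath: /usr/share/nginx/html
      volumes:
        - name: content
          persistentVolumeClaim:
            claimName: shared-content   # hypothetical RWX claim shared by all replicas
```

With a ReadWriteOnce claim instead, replicas scheduled onto different nodes would not all be able to attach the volume.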

Data Processing

Workflows that process large datasets and need to persist intermediate or final results.

Best Practices

  • Use StorageClasses for dynamic provisioning to simplify storage management
  • Choose the appropriate access mode for your use case
  • Consider using volume snapshots for backup and recovery
  • Monitor storage usage to avoid running out of capacity
  • Use the "WaitForFirstConsumer" binding mode for topology-aware provisioning
  • Implement proper security policies for sensitive data stored on volumes
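
To illustrate the snapshot recommendation, the sketch below takes a snapshot of the my-pvc claim and restores it into a new claim. It assumes the cluster has the snapshot CRDs and controller installed and a CSI driver that supports snapshots; the names my-pvc-snapshot, my-pvc-restored, and csi-snapclass are hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-pvc-snapshot       # hypothetical name
spec:
  volumeSnapshotClassName: csi-snapclass   # hypothetical VolumeSnapshotClass
  source:
    persistentVolumeClaimName: my-pvc      # claim to snapshot (same namespace)
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc-restored       # hypothetical name
spec:
  storageClassName: fast
  dataSource:                 # pre-populate the new volume from the snapshot
    name: my-pvc-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi            # must be at least the size of the snapshot source
```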

Kubernetes storage abstractions provide a flexible and powerful way to manage persistent data for containerized applications, whether running on-premises or in the cloud.
