Containerized applications often need to persist data beyond the lifecycle of individual containers. Kubernetes provides a storage subsystem that abstracts away the underlying storage infrastructure and exposes a consistent API for managing persistent storage.
Persistent Volumes and Persistent Volume Claims
Kubernetes uses two complementary resources to manage storage: PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs). This abstraction separates the concerns of storage administration from storage consumption.
PersistentVolumes (PVs)
A PersistentVolume is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a cluster resource, just like a node.
Like regular Volumes, PVs are implemented by volume plugins, but their lifecycle is independent of any individual Pod that uses them. They can be backed by various storage technologies, including:
- NFS
- iSCSI
- Cloud provider storage (AWS EBS, GCE Persistent Disk, Azure Disk)
- Local storage
- And many more
Example PersistentVolume Definition
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pv-nfs
      labels:
        type: nfs
    spec:
      capacity:
        storage: 10Gi
      volumeMode: Filesystem
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
      storageClassName: nfs
      nfs:
        server: nfs-server.example.com
        path: "/exports"
PersistentVolumeClaims (PVCs)
A PersistentVolumeClaim is a request for storage by a user. It is similar to a Pod: Pods consume node resources, while PVCs consume PV resources.
Users request specific sizes and access modes without needing to know the details about the underlying storage infrastructure.
Example PersistentVolumeClaim Definition
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
      storageClassName: fast
Using PVCs in Pods
Once a PVC is created, it can be mounted into a Pod as a volume:
    apiVersion: v1
    kind: Pod
    metadata:
      name: mypod
    spec:
      containers:
        - name: myfrontend
          image: nginx
          volumeMounts:
            - mountPath: "/var/www/html"
              name: mypd
      volumes:
        - name: mypd
          persistentVolumeClaim:
            claimName: my-pvc
Storage Classes and Dynamic Provisioning
Manually creating a PV for every storage request is tedious and does not scale. Storage Classes enable dynamic provisioning, so that volumes are created on demand when a claim asks for them.
StorageClass Resource
A StorageClass provides a way to describe the "classes" of storage available. Different classes might map to quality-of-service levels, backup policies, or arbitrary policies determined by the cluster administrators.
Example StorageClass Definition
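    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast
    # AWS EBS CSI driver; the in-tree kubernetes.io/aws-ebs plugin was
    # removed in Kubernetes 1.27, and iops/throughput are CSI driver parameters.
    provisioner: ebs.csi.aws.com
    parameters:
      type: gp3
      iops: "10000"
      throughput: "500"
      encrypted: "true"
    volumeBindingMode: WaitForFirstConsumer
    reclaimPolicy: Delete
    allowVolumeExpansion: true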
Dynamic Provisioning
When a user creates a PVC that references a StorageClass, the appropriate provisioner automatically creates a PV that satisfies the claim. This eliminates the need for administrators to pre-provision storage.
The dynamic provisioning process:
- User creates a PVC with a specific StorageClass
- Kubernetes detects the PVC and triggers the provisioner
- The provisioner creates the actual storage resource
- A PV is automatically created and bound to the PVC
- The Pod can now use the PVC
Access Modes
PVs and PVCs support different access patterns:
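- ReadWriteOnce (RWO): Read-write by a single node; multiple Pods running on that node can still share the volume
- ReadOnlyMany (ROX): Read-only by many nodes
- ReadWriteMany (RWX): Read-write by many nodes
- ReadWriteOncePod (RWOP): Read-write by a single Pod across the whole cluster (alpha in Kubernetes 1.22, stable since 1.29; CSI volumes only)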
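As an illustration of the strictest mode, the following sketch requests a volume that only one Pod at a time may mount. The claim name `single-writer-pvc` is hypothetical, and the example assumes the `fast` StorageClass from earlier in this article is backed by a CSI driver, since ReadWriteOncePod is not available for other volume types.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-pvc   # hypothetical name, for illustration only
spec:
  accessModes:
    - ReadWriteOncePod      # at most one Pod in the cluster can use this claim
  resources:
    requests:
      storage: 5Gi
  storageClassName: fast    # assumed to be served by a CSI driver
```

If a second Pod references the same claim, it should stay Pending until the first Pod releases the volume.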
Reclaim Policies
When a user is done with a volume, they can delete the PVC. The reclaim policy tells the cluster what to do with the PV after its claim is released:
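- Retain: The PV and its data are kept; an administrator must clean up and reclaim the volume manually
- Recycle: Basic scrub and make available again (deprecated in favor of dynamic provisioning; only supported by NFS and HostPath volumes)
- Delete: The PV and its associated storage asset are deleted (the default for dynamically provisioned volumes)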
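Dynamically provisioned PVs inherit their reclaim policy from the StorageClass that created them. A minimal sketch of a class that keeps the data after the claim is deleted might look like the following; the class name `fast-retain` is hypothetical, and the provisioner assumes the AWS EBS CSI driver is installed in the cluster.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-retain              # hypothetical name, for illustration only
provisioner: ebs.csi.aws.com     # assumption: AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Retain            # PVs created from this class survive PVC deletion
volumeBindingMode: WaitForFirstConsumer
```

With Retain, the released PV and the backing disk remain until an administrator removes them, which is usually the safer choice for data that is hard to recreate.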
Volume Modes
Kubernetes supports two volume modes:
- Filesystem: Volume is formatted and mounted as a directory (default)
- Block: Volume is used as a raw block device
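The difference shows up in how the Pod consumes the claim: a Block volume is attached through `volumeDevices` and a `devicePath` rather than `volumeMounts` and a `mountPath`. The sketch below assumes the `fast` StorageClass supports raw block volumes; the names `block-pvc` and `block-consumer` and the device path are illustrative choices, not requirements.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc              # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block            # request a raw device instead of a filesystem
  resources:
    requests:
      storage: 5Gi
  storageClassName: fast       # assumed to support block volumes
---
apiVersion: v1
kind: Pod
metadata:
  name: block-consumer         # hypothetical name
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeDevices:           # used instead of volumeMounts for Block mode
        - name: data
          devicePath: /dev/xvda   # path where the device appears in the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-pvc
```

The application is then responsible for reading and writing the device directly, which is mainly useful for software that manages its own on-disk format.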
Common Use Cases
Stateful Applications
Databases (MySQL, PostgreSQL, MongoDB) and other stateful applications need storage that survives Pod restarts and rescheduling.
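One common way to run such a workload is a StatefulSet with `volumeClaimTemplates`, which creates a dedicated PVC for each replica. The following is a minimal sketch rather than a production setup: it assumes a headless Service named `postgres` and a Secret named `postgres-secret` with a `password` key already exist (both names are hypothetical), and it reuses the `fast` StorageClass from earlier.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres            # assumption: a headless Service with this name exists
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret   # hypothetical Secret
                  key: password
            - name: PGDATA                # keep data in a subdirectory of the mount
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:            # one PVC per replica, named data-postgres-0, ...
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: fast
        resources:
          requests:
            storage: 5Gi
```

Because each replica gets its own claim, a rescheduled Pod reattaches to the same volume it used before.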
Shared Storage
Some applications, such as content management systems, need several Pods to read and write the same data at once, which requires a volume that supports the ReadWriteMany access mode.
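A sketch of this pattern, assuming the `pv-nfs` PersistentVolume from the earlier example is available: the claim matches it through the `nfs` storage class name and the ReadWriteMany access mode, and every replica of the Deployment mounts the same claim. The names `shared-content` and `cms` are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-content           # hypothetical name
spec:
  accessModes:
    - ReadWriteMany              # many nodes may mount the volume read-write
  storageClassName: nfs          # matches the storageClassName of pv-nfs
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cms                      # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cms
  template:
    metadata:
      labels:
        app: cms
    spec:
      containers:
        - name: web
          image: nginx
          volumeMounts:
            - name: content
              mountPath: /usr/share/nginx/html
      volumes:
        - name: content
          persistentVolumeClaim:
            claimName: shared-content   # all replicas share this one claim
```

Block storage such as AWS EBS generally cannot serve this case, so a file-based backend like NFS is the usual choice.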
Data Processing
Batch and pipeline workflows that process large datasets often need to persist intermediate or final results between steps.
Best Practices
- Use StorageClasses for dynamic provisioning to simplify storage management
- Choose the appropriate access mode for your use case
- Consider using volume snapshots for backup and recovery
- Monitor storage usage to avoid running out of capacity
- Use the "WaitForFirstConsumer" binding mode for topology-aware provisioning
- Implement proper security policies for sensitive data stored on volumes
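To make the snapshot recommendation concrete, here is a hedged sketch of taking a snapshot of the `my-pvc` claim from earlier and restoring it into a new claim. It assumes the cluster has the CSI snapshot CRDs and controller installed and a CSI driver that supports snapshots; the VolumeSnapshotClass name `csi-snapclass` and the resource names are hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-pvc-snapshot                    # hypothetical name
spec:
  volumeSnapshotClassName: csi-snapclass   # hypothetical; must exist in the cluster
  source:
    persistentVolumeClaimName: my-pvc      # the claim to snapshot
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc-restored                    # hypothetical name
spec:
  storageClassName: fast
  dataSource:                              # pre-populate the new volume from the snapshot
    name: my-pvc-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi                         # must be at least the snapshot's size
```

Snapshots typically live in the same storage system as the source volume, so they complement rather than replace off-cluster backups.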
Kubernetes storage abstractions provide a flexible and powerful way to manage persistent data for containerized applications, whether running on-premises or in the cloud.