Kubernetes Operators represent a powerful pattern for managing complex applications on Kubernetes. They extend the Kubernetes API to create, configure, and manage instances of stateful applications on behalf of Kubernetes users.
What are Operators?
Operators are software extensions to Kubernetes that use custom resources to manage applications and their components. They follow Kubernetes principles, notably the control loop concept, to automate operational tasks that would typically require human intervention.
The Operator Pattern
The Operator pattern captures how you can write code to automate a task beyond what Kubernetes itself provides. It combines:
- Custom Resource Definitions (CRDs): Extend the Kubernetes API with application-specific resources
- Custom Controllers: Implement the control loop that watches and reconciles the desired state
- Operational Knowledge: Encode human operational expertise into software
Why Use Operators?
Operators are particularly useful for:
- Stateful Applications: Databases, message queues, and other stateful systems
- Complex Deployment Procedures: Applications requiring multi-step installation/configuration
- Day-2 Operations: Backup, restore, scaling, upgrades, and failure recovery
- Domain-Specific Knowledge: Encoding operational expertise into automation
How Operators Work
Operators follow this basic workflow:
- Watch for changes to custom resources
- Compare the current state with the desired state
- Take action to reconcile any differences
- Update the status of the custom resource
- Repeat the process continuously
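The workflow above can be sketched in plain Go. This is a minimal illustration, not how controller-runtime implements it: `DesiredState`, `ObservedState`, `Status`, and `reconcile` are hypothetical names, and a fixed number of passes stands in for the watch events that drive a real controller.

```go
package main

import "fmt"

// DesiredState and ObservedState are hypothetical stand-ins for a custom
// resource's spec and for the state of the workload it manages.
type DesiredState struct{ Replicas int }
type ObservedState struct{ Replicas int }

// Status mirrors the idea of a custom resource's status block.
type Status struct {
	AvailableReplicas int
	InSync            bool
}

// reconcile compares desired and observed state, moves the observed state
// one replica closer to the desired state, and returns the status to report.
func reconcile(desired DesiredState, observed *ObservedState) Status {
	switch {
	case observed.Replicas < desired.Replicas:
		observed.Replicas++
	case observed.Replicas > desired.Replicas:
		observed.Replicas--
	}
	return Status{
		AvailableReplicas: observed.Replicas,
		InSync:            observed.Replicas == desired.Replicas,
	}
}

func main() {
	desired := DesiredState{Replicas: 3}
	observed := &ObservedState{Replicas: 1}
	// Repeat until the observed state matches the desired state.
	for pass := 1; ; pass++ {
		status := reconcile(desired, observed)
		fmt.Printf("pass %d: %+v\n", pass, status)
		if status.InSync {
			break
		}
	}
}
```

Each pass makes a small correction and reports status, so the loop can be interrupted and resumed at any point without losing track of what remains to be done.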
Popular Operators
Many popular applications have operators available:
- Prometheus Operator: Manages Prometheus monitoring instances
- Elasticsearch Operator: Manages Elasticsearch clusters
- PostgreSQL Operator: Manages PostgreSQL databases
- Redis Operator: Manages Redis clusters
- Cert-Manager: Manages TLS certificates
Building a Custom Operator
You can build operators using various frameworks and tools. The most popular approaches are:
- Kubebuilder: SDK for building Kubernetes APIs using CRDs
- Operator SDK: Framework for building operators, part of the Operator Framework project (its Go operators are scaffolded with Kubebuilder under the hood)
- KUDO: Kubernetes Universal Declarative Operator
- Metacontroller: Lightweight way to write controllers
- Native Go client: Using client-go and other Kubernetes client libraries
Step 1: Define a Custom Resource
First, define a Custom Resource Definition (CRD) that extends the Kubernetes API:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # Must be <plural>.<group>
  name: webapps.webapp.example.com
spec:
  # Matches the group used by the RBAC markers and the WebApp manifest in later steps
  group: webapp.example.com
  names:
    kind: WebApp
    listKind: WebAppList
    plural: webapps
    singular: webapp
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true
    storage: true
    # Enables the /status subresource that the controller updates
    subresources:
      status: {}
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
                minimum: 1
                maximum: 5
              image:
                type: string
              port:
                type: integer
          status:
            type: object
            properties:
              availableReplicas:
                type: integer
              conditions:
                type: array
                items:
                  type: object
                  properties:
                    type:
                      type: string
                    status:
                      type: string
                    message:
                      type: string
Step 2: Create API Types
Define Go types that represent your custom resource:
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// WebAppSpec defines the desired state of WebApp
type WebAppSpec struct {
	Replicas int32  `json:"replicas"`
	Image    string `json:"image"`
	Port     int32  `json:"port"`
}

// WebAppStatus defines the observed state of WebApp
type WebAppStatus struct {
	AvailableReplicas int32              `json:"availableReplicas"`
	Conditions        []metav1.Condition `json:"conditions,omitempty"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

// WebApp is the Schema for the webapps API
type WebApp struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   WebAppSpec   `json:"spec,omitempty"`
	Status WebAppStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// WebAppList contains a list of WebApp
type WebAppList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []WebApp `json:"items"`
}

// SchemeBuilder is defined in groupversion_info.go, which Kubebuilder
// scaffolds next to this file.
func init() {
	SchemeBuilder.Register(&WebApp{}, &WebAppList{})
}
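The `json` struct tags are what tie these Go types to the fields of a WebApp manifest. The sketch below shows that mapping using only the standard library; it uses trimmed local copies of the types (no `metav1` embedding), and `decodeWebApp` is a hypothetical helper for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WebAppSpec is a local copy of the spec type from the API package.
type WebAppSpec struct {
	Replicas int32  `json:"replicas"`
	Image    string `json:"image"`
	Port     int32  `json:"port"`
}

// WebApp is a trimmed stand-in: the real type embeds metav1.TypeMeta and
// metav1.ObjectMeta instead of declaring these fields directly.
type WebApp struct {
	APIVersion string     `json:"apiVersion"`
	Kind       string     `json:"kind"`
	Spec       WebAppSpec `json:"spec,omitempty"`
}

// decodeWebApp parses a JSON manifest into the typed struct.
func decodeWebApp(data []byte) (WebApp, error) {
	var w WebApp
	err := json.Unmarshal(data, &w)
	return w, err
}

func main() {
	manifest := []byte(`{
		"apiVersion": "webapp.example.com/v1alpha1",
		"kind": "WebApp",
		"spec": {"replicas": 3, "image": "nginx:latest", "port": 80}
	}`)
	w, err := decodeWebApp(manifest)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s wants %d replicas of %s on port %d\n",
		w.Kind, w.Spec.Replicas, w.Spec.Image, w.Spec.Port)
}
```

A field whose tag does not match the manifest key is silently left at its zero value, which is a common source of confusing behavior when types and CRD schema drift apart.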
Step 3: Implement the Controller
Create a controller that reconciles the desired state:
package controllers

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	"sigs.k8s.io/controller-runtime/pkg/log"

	webappv1alpha1 "github.com/example/webapp-operator/api/v1alpha1"
)

// WebAppReconciler reconciles a WebApp object
type WebAppReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}

//+kubebuilder:rbac:groups=webapp.example.com,resources=webapps,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=webapp.example.com,resources=webapps/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=webapp.example.com,resources=webapps/finalizers,verbs=update
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=services,verbs=get;list;watch;create;update;patch;delete

// Reconcile is the main control loop function
func (r *WebAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

	// Fetch the WebApp instance
	webapp := &webappv1alpha1.WebApp{}
	err := r.Get(ctx, req.NamespacedName, webapp)
	if err != nil {
		if errors.IsNotFound(err) {
			// Request object not found, could have been deleted after reconcile request
			return ctrl.Result{}, nil
		}
		// Error reading the object
		return ctrl.Result{}, err
	}

	// Check if the Deployment already exists, if not create a new one
	deployment := &appsv1.Deployment{}
	err = r.Get(ctx, types.NamespacedName{Name: webapp.Name, Namespace: webapp.Namespace}, deployment)
	if err != nil && errors.IsNotFound(err) {
		// Define a new Deployment
		dep := r.deploymentForWebApp(webapp)
		log.Info("Creating a new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
		err = r.Create(ctx, dep)
		if err != nil {
			log.Error(err, "Failed to create new Deployment", "Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
			return ctrl.Result{}, err
		}
		// Deployment created successfully
		return ctrl.Result{Requeue: true}, nil
	} else if err != nil {
		log.Error(err, "Failed to get Deployment")
		return ctrl.Result{}, err
	}

	// Ensure the deployment size is the same as the spec
	size := webapp.Spec.Replicas
	if *deployment.Spec.Replicas != size {
		deployment.Spec.Replicas = &size
		err = r.Update(ctx, deployment)
		if err != nil {
			log.Error(err, "Failed to update Deployment", "Deployment.Namespace", deployment.Namespace, "Deployment.Name", deployment.Name)
			return ctrl.Result{}, err
		}
	}

	// Check if the Service already exists, if not create a new one
	service := &corev1.Service{}
	err = r.Get(ctx, types.NamespacedName{Name: webapp.Name, Namespace: webapp.Namespace}, service)
	if err != nil && errors.IsNotFound(err) {
		// Define a new Service
		svc := r.serviceForWebApp(webapp)
		log.Info("Creating a new Service", "Service.Namespace", svc.Namespace, "Service.Name", svc.Name)
		err = r.Create(ctx, svc)
		if err != nil {
			log.Error(err, "Failed to create new Service", "Service.Namespace", svc.Namespace, "Service.Name", svc.Name)
			return ctrl.Result{}, err
		}
		// Service created successfully
		return ctrl.Result{Requeue: true}, nil
	} else if err != nil {
		log.Error(err, "Failed to get Service")
		return ctrl.Result{}, err
	}

	// Update the WebApp status with the available replicas
	webapp.Status.AvailableReplicas = deployment.Status.AvailableReplicas
	err = r.Status().Update(ctx, webapp)
	if err != nil {
		log.Error(err, "Failed to update WebApp status")
		return ctrl.Result{}, err
	}

	return ctrl.Result{}, nil
}

// deploymentForWebApp returns a WebApp Deployment object
func (r *WebAppReconciler) deploymentForWebApp(w *webappv1alpha1.WebApp) *appsv1.Deployment {
	labels := labelsForWebApp(w.Name)
	replicas := w.Spec.Replicas

	dep := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      w.Name,
			Namespace: w.Namespace,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{
				MatchLabels: labels,
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: labels,
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Image: w.Spec.Image,
						Name:  "webapp",
						Ports: []corev1.ContainerPort{{
							ContainerPort: w.Spec.Port,
							Name:          "http",
						}},
					}},
				},
			},
		},
	}
	// Set WebApp instance as the owner and controller.
	// The returned error is ignored here for brevity; production code should return it to the caller.
	_ = controllerutil.SetControllerReference(w, dep, r.Scheme)
	return dep
}

// serviceForWebApp returns a WebApp Service object
func (r *WebAppReconciler) serviceForWebApp(w *webappv1alpha1.WebApp) *corev1.Service {
	labels := labelsForWebApp(w.Name)

	svc := &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Name:      w.Name,
			Namespace: w.Namespace,
		},
		Spec: corev1.ServiceSpec{
			Selector: labels,
			Ports: []corev1.ServicePort{
				{
					Port: w.Spec.Port,
					// NodePort is left unset so Kubernetes assigns a free port.
					// A fixed value such as 30000 would collide as soon as a second WebApp exists.
				},
			},
			Type: corev1.ServiceTypeNodePort,
		},
	}
	// Set WebApp instance as the owner and controller.
	// The returned error is ignored here for brevity; production code should return it to the caller.
	_ = controllerutil.SetControllerReference(w, svc, r.Scheme)
	return svc
}

// labelsForWebApp returns the labels for selecting the resources
func labelsForWebApp(name string) map[string]string {
	return map[string]string{"app": "webapp", "webapp_cr": name}
}

// SetupWithManager sets up the controller with the Manager
func (r *WebAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&webappv1alpha1.WebApp{}).
		Owns(&appsv1.Deployment{}).
		Owns(&corev1.Service{}).
		Complete(r)
}
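The `Owns(...)` calls in `SetupWithManager` rely on the controller owner reference that `SetControllerReference` writes onto each Deployment and Service: when an owned object changes, the owning WebApp is queued for reconciliation. The sketch below illustrates that lookup with simplified, hypothetical types (`OwnerRef`, `Object`, `requestFor`); it is not the controller-runtime implementation.

```go
package main

import "fmt"

// OwnerRef is a simplified, hypothetical version of metav1.OwnerReference.
type OwnerRef struct {
	Kind       string
	Name       string
	Controller bool // true for the single owner that manages the object
}

// Object is a simplified stand-in for a Deployment or Service.
type Object struct {
	Namespace string
	Name      string
	Owners    []OwnerRef
}

// requestFor returns the "namespace/name" key of the controlling owner of
// the given kind, or false when obj is not controlled by such an owner.
func requestFor(obj Object, ownerKind string) (string, bool) {
	for _, ref := range obj.Owners {
		if ref.Controller && ref.Kind == ownerKind {
			return obj.Namespace + "/" + ref.Name, true
		}
	}
	return "", false
}

func main() {
	owned := Object{
		Namespace: "default",
		Name:      "example-webapp",
		Owners:    []OwnerRef{{Kind: "WebApp", Name: "example-webapp", Controller: true}},
	}
	unrelated := Object{Namespace: "default", Name: "other"}

	for _, obj := range []Object{owned, unrelated} {
		key, ok := requestFor(obj, "WebApp")
		fmt.Printf("%s -> enqueue=%t key=%q\n", obj.Name, ok, key)
	}
}
```

The same owner reference also drives garbage collection: deleting the WebApp lets Kubernetes remove the Deployment and Service it controls.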
Step 4: Build and Deploy the Operator
Create a Dockerfile and build the operator image:
# Dockerfile
FROM golang:1.19 AS builder
WORKDIR /workspace
COPY go.mod go.mod
COPY go.sum go.sum
RUN go mod download
COPY . .
RUN make manager
FROM gcr.io/distroless/static:nonroot
WORKDIR /
COPY --from=builder /workspace/bin/manager .
USER 65532:65532
ENTRYPOINT ["/manager"]
Create deployment manifests:
# config/manager/manager.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: webapp-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller-manager
  namespace: webapp-system
spec:
  replicas: 1
  selector:
    matchLabels:
      control-plane: controller-manager
  template:
    metadata:
      labels:
        control-plane: controller-manager
    spec:
      containers:
      - command:
        - /manager
        args:
        - --leader-elect
        image: controller:latest
        name: manager
        resources:
          limits:
            cpu: 100m
            memory: 30Mi
          requests:
            cpu: 100m
            memory: 20Mi
      terminationGracePeriodSeconds: 10
Step 5: Deploy and Use the Operator
Deploy the operator and create a WebApp custom resource:
# Deploy the CRD
kubectl apply -f config/crd/bases/webapp.example.com_webapps.yaml
# Deploy the operator
kubectl apply -f config/manager/manager.yaml
# Create a WebApp instance
apiVersion: webapp.example.com/v1alpha1
kind: WebApp
metadata:
  name: example-webapp
spec:
  replicas: 3
  image: nginx:latest
  port: 80
Operator Best Practices
Design Considerations
- Make your operator idempotent - it should handle multiple reconciliations safely
- Implement proper error handling and backoff strategies
- Use finalizers for proper resource cleanup
- Provide comprehensive status information
- Support multiple versions of your custom resource with conversion webhooks
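To make the finalizer and idempotency points concrete, here is a minimal sketch of the finalizer protocol in plain Go. The `Resource` type, the `finalizerName` value, and the helper functions are hypothetical and only model the behavior; a real operator would work with the object's metadata through the Kubernetes client.

```go
package main

import "fmt"

// finalizerName is a hypothetical finalizer key chosen for this example.
const finalizerName = "webapp.example.com/cleanup"

// Resource is a simplified stand-in for a custom resource's metadata.
type Resource struct {
	Name       string
	Deleting   bool // true once a deletion timestamp has been set
	Finalizers []string
}

func hasFinalizer(r *Resource, name string) bool {
	for _, f := range r.Finalizers {
		if f == name {
			return true
		}
	}
	return false
}

func removeFinalizer(r *Resource, name string) {
	kept := r.Finalizers[:0]
	for _, f := range r.Finalizers {
		if f != name {
			kept = append(kept, f)
		}
	}
	r.Finalizers = kept
}

// handleFinalizer runs one reconcile pass of the finalizer protocol.
// cleanup runs before the finalizer is removed; if it fails, the finalizer
// stays in place so that a later pass can retry.
func handleFinalizer(r *Resource, cleanup func() error) error {
	if !r.Deleting {
		if !hasFinalizer(r, finalizerName) {
			r.Finalizers = append(r.Finalizers, finalizerName)
		}
		return nil
	}
	if hasFinalizer(r, finalizerName) {
		if err := cleanup(); err != nil {
			return err
		}
		removeFinalizer(r, finalizerName)
	}
	return nil
}

func main() {
	res := &Resource{Name: "example-webapp"}
	cleanup := func() error {
		fmt.Println("cleaning up external resources")
		return nil
	}
	_ = handleFinalizer(res, cleanup) // adds the finalizer
	res.Deleting = true               // the user deletes the resource
	_ = handleFinalizer(res, cleanup) // runs cleanup, removes the finalizer
	fmt.Println("remaining finalizers:", len(res.Finalizers))
}
```

Calling `handleFinalizer` repeatedly in the same state has no further effect, which is the property that makes a reconcile pass safe to repeat.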
Testing
- Write unit tests for your reconciliation logic
- Use envtest for integration testing with a real API server
- Implement end-to-end tests with kind or minikube
- Test upgrade paths and backward compatibility
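One way to make reconciliation logic easy to unit test is to isolate the decision in a pure function and check it against a table of cases. The sketch below assumes a hypothetical `decide` function and `Action` type; in a real project the table would live in a `_test.go` file and use the standard `testing` package rather than `main`.

```go
package main

import "fmt"

// Action is a hypothetical description of what the reconciler should do next.
type Action string

const (
	ActionCreate Action = "create"
	ActionScale  Action = "scale"
	ActionNone   Action = "none"
)

// decide returns the action for a Deployment given its current replica
// count (nil when the Deployment does not exist) and the desired count.
func decide(current *int32, desired int32) Action {
	switch {
	case current == nil:
		return ActionCreate
	case *current != desired:
		return ActionScale
	default:
		return ActionNone
	}
}

func int32Ptr(v int32) *int32 { return &v }

func main() {
	cases := []struct {
		name    string
		current *int32
		desired int32
		want    Action
	}{
		{"missing deployment", nil, 3, ActionCreate},
		{"wrong size", int32Ptr(1), 3, ActionScale},
		{"already converged", int32Ptr(3), 3, ActionNone},
	}
	for _, c := range cases {
		got := decide(c.current, c.desired)
		fmt.Printf("%-20s got=%s ok=%t\n", c.name, got, got == c.want)
	}
}
```

Because `decide` touches no cluster state, these cases run in milliseconds; envtest and end-to-end tests can then focus on the interaction with the API server.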
Security
- Follow the principle of least privilege for RBAC permissions
- Run your operator with a non-root user
- Secure your operator's communication with TLS
- Regularly update dependencies for security patches
Operator Frameworks Comparison
| Framework | Language | Learning Curve | Best For |
|---|---|---|---|
| Kubebuilder | Go | Moderate | Complex operators, full control |
| Operator SDK | Go, Ansible, Helm | Low to Moderate | Various skill levels, multiple approaches |
| KUDO | YAML/Declarative | Low | Simple operators, declarative approach |
| Metacontroller | Any language | Low | Simple webhook-based operators |
Operators encode operational knowledge into software. By pairing custom resources with a controller that continuously reconciles them, they can substantially reduce the operational burden of running stateful applications while making those applications more reliable and easier to manage.