Kubernetes Operators

Learn how to build and use Kubernetes Operators to automate complex application management. Understand the Operator pattern, Operator SDK, and how to create custom operators.

1. Operator Overview

A Kubernetes Operator is a method of packaging, deploying, and managing a Kubernetes application. It extends the Kubernetes API with custom resources and runs a controller that automates the management of complex, often stateful, applications.

1.1 What Operators Do

  • Automate application lifecycle management
  • Handle complex operational tasks
  • Encode domain knowledge
  • Provide self-healing capabilities
  • Manage application upgrades
  • Handle backup and restore

1.2 When to Use Operators

Use Operators when:

  • Application requires complex operational knowledge
  • You need to automate repetitive tasks
  • Application has stateful components
  • You want to provide a Kubernetes-native API
  • Application needs domain-specific logic

2. Operator Pattern

The Operator pattern combines:

  • Custom Resource Definitions (CRDs): Define new resource types, extending the Kubernetes API
  • Custom Resources (CRs): Instances of those types that declare the desired state
  • Controllers: Watch CRs and reconcile actual state with desired state

2.1 How Operators Work

  1. Define Custom Resource Definition (CRD)
  2. Create Custom Resource instances
  3. Operator watches for CR changes
  4. Operator reconciles actual state with desired state
  5. Operator creates/updates/deletes Kubernetes resources

2.2 Controller Pattern

Operators use the controller pattern:

  • Observe current state
  • Compare with desired state
  • Take action to reconcile differences
  • Repeat continuously
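
The loop above can be sketched in plain Go. This is a toy model under stated assumptions, not controller-runtime code: the Cluster type, the Action values, and reconcileReplicas are hypothetical names, and an in-memory replica count stands in for real cluster state.

```go
package main

import "fmt"

// Action describes what a single reconcile pass decided to do.
type Action string

const (
    ScaleUp   Action = "scale-up"
    ScaleDown Action = "scale-down"
    NoOp      Action = "no-op"
)

// Cluster is a hypothetical in-memory stand-in for the actual cluster state.
type Cluster struct {
    Replicas int
}

// reconcileReplicas observes the current state, compares it with the desired
// replica count, and takes one step toward it.
func reconcileReplicas(c *Cluster, desired int) Action {
    switch {
    case c.Replicas < desired:
        c.Replicas++
        return ScaleUp
    case c.Replicas > desired:
        c.Replicas--
        return ScaleDown
    default:
        return NoOp
    }
}

func main() {
    cluster := &Cluster{Replicas: 1}
    desired := 3

    // A real controller is re-triggered by watch events; this loop simply
    // repeats until no further action is needed.
    for {
        action := reconcileReplicas(cluster, desired)
        fmt.Printf("action=%s replicas=%d\n", action, cluster.Replicas)
        if action == NoOp {
            break
        }
    }
}
```

Each pass makes a small correction and reports what it did, so the loop converges on the desired state regardless of where it started.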

3. Custom Resources

Custom Resources extend the Kubernetes API with new resource types. A CRD registers the type with the API server; each CR is an instance of that type.

3.1 Custom Resource Definition (CRD)

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              databaseName:
                type: string
              replicas:
                type: integer
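          status:
            type: object
            properties:
              phase:
                type: string
    # Enable the status subresource so a controller can update status separately from spec
    subresources:
      status: {}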
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database

3.2 Custom Resource Instance

apiVersion: example.com/v1
kind: Database
metadata:
  name: my-database
spec:
  databaseName: myapp
  replicas: 3
# status is omitted here: it is written by the Operator, not set by the user

3.3 Working with CRs

# Create CRD
kubectl apply -f crd.yaml

# Create CR instance
kubectl apply -f database.yaml

# List CRs
kubectl get databases

# Describe CR
kubectl describe database my-database

# Delete CR
kubectl delete database my-database

4. Operator SDK

The Operator SDK provides tools to build, test, and package Operators.

4.1 Installing Operator SDK

# macOS
brew install operator-sdk

# Linux
export ARCH=$(case $(uname -m) in x86_64) echo -n amd64 ;; aarch64) echo -n arm64 ;; *) echo -n $(uname -m) ;; esac)
export OS=$(uname | awk '{print tolower($0)}')
export OPERATOR_SDK_DL_URL=https://github.com/operator-framework/operator-sdk/releases/download/v1.28.0
curl -LO ${OPERATOR_SDK_DL_URL}/operator-sdk_${OS}_${ARCH}
chmod +x operator-sdk_${OS}_${ARCH} && sudo mv operator-sdk_${OS}_${ARCH} /usr/local/bin/operator-sdk

# Verify installation
operator-sdk version

4.2 Operator Types

Operator SDK supports different types:

  • Go: Full control, most flexible
  • Ansible: Use Ansible playbooks
  • Helm: Wrap Helm charts

5. Building Operators

5.1 Create Go Operator

# Initialize operator project
operator-sdk init --domain example.com --repo github.com/example/database-operator

# Create API and controller
# Note: the API group is <group>.<domain>, so this scaffolds example.example.com/v1
operator-sdk create api --group example --version v1 --kind Database --resource --controller

# This creates:
# - api/v1/database_types.go (Go types from which the CRD is generated)
# - controllers/database_controller.go (controller logic)

5.2 Define CRD Spec

// api/v1/database_types.go
package v1

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

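// DatabaseSpec defines the desired state of a Database.
type DatabaseSpec struct {
    DatabaseName string `json:"databaseName"`
    Replicas     int32  `json:"replicas"`
    Image        string `json:"image,omitempty"`
}

// DatabaseStatus defines the observed state of a Database.
type DatabaseStatus struct {
    Phase   string `json:"phase,omitempty"`
    Message string `json:"message,omitempty"`
}

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status

// Database is the Schema for the databases API.
type Database struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec              DatabaseSpec   `json:"spec,omitempty"`
    Status            DatabaseStatus `json:"status,omitempty"`
}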
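
To see how the json struct tags map a CR's spec onto the Go type, here is a minimal standard-library sketch. DatabaseSpec is copied from the listing above so the example compiles without Kubernetes dependencies; decodeSpec and encodeSpec are hypothetical helpers introduced only for illustration.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// DatabaseSpec mirrors the spec type above, copied here so the example
// compiles on its own.
type DatabaseSpec struct {
    DatabaseName string `json:"databaseName"`
    Replicas     int32  `json:"replicas"`
    Image        string `json:"image,omitempty"`
}

// decodeSpec parses the JSON form of a spec into the Go type.
func decodeSpec(raw []byte) (DatabaseSpec, error) {
    var spec DatabaseSpec
    err := json.Unmarshal(raw, &spec)
    return spec, err
}

// encodeSpec serializes a spec back to JSON.
func encodeSpec(spec DatabaseSpec) (string, error) {
    out, err := json.Marshal(spec)
    return string(out), err
}

func main() {
    spec, err := decodeSpec([]byte(`{"databaseName":"myapp","replicas":3}`))
    if err != nil {
        panic(err)
    }
    fmt.Println(spec.DatabaseName, spec.Replicas)

    // Image is empty, so omitempty drops it from the encoded output.
    out, err := encodeSpec(spec)
    if err != nil {
        panic(err)
    }
    fmt.Println(out) // prints {"databaseName":"myapp","replicas":3}
}
```

Fields tagged omitempty, such as image, can be left out of a CR manifest; required fields such as databaseName always appear in the serialized form.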

5.3 Implement Controller

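// controllers/database_controller.go
package controllers

import (
    "context"

    appsv1 "k8s.io/api/apps/v1"
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    "k8s.io/apimachinery/pkg/types"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/log"

    examplev1 "github.com/example/database-operator/api/v1"
)

// Reconcile moves the cluster toward the state declared in a Database CR.
// DatabaseReconciler (scaffolded, not shown) embeds client.Client.
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    logger := log.FromContext(ctx)

    // Fetch the Database instance; if it was deleted, there is nothing to do.
    database := &examplev1.Database{}
    if err := r.Get(ctx, req.NamespacedName, database); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // Check if the Deployment exists
    deployment := &appsv1.Deployment{}
    err := r.Get(ctx, types.NamespacedName{
        Name:      database.Name,
        Namespace: database.Namespace,
    }, deployment)

    if apierrors.IsNotFound(err) {
        // Create the Deployment; deploymentForDatabase is a helper (not shown)
        // that builds it from the Database spec.
        dep := r.deploymentForDatabase(database)
        logger.Info("Creating Deployment", "name", dep.Name)
        if err := r.Create(ctx, dep); err != nil {
            return ctrl.Result{}, err
        }
        return ctrl.Result{Requeue: true}, nil
    }
    if err != nil {
        // Any other lookup error: return it so the request is retried.
        return ctrl.Result{}, err
    }

    // Update status through the status subresource
    database.Status.Phase = "Running"
    if err := r.Status().Update(ctx, database); err != nil {
        return ctrl.Result{}, err
    }

    return ctrl.Result{}, nil
}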

5.4 Build and Deploy

# Generate CRD manifests
make manifests

# Build operator image
make docker-build IMG=example/database-operator:latest

# Push image
make docker-push IMG=example/database-operator:latest

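# Deploy operator to the cluster (this also installs the CRDs)
make deploy IMG=example/database-operator:latest

# Alternatively, for local development: install only the CRDs
make install

# Then run the operator locally against the cluster in your kubeconfig
make run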

6. Operator Lifecycle

6.1 Operator Lifecycle Manager (OLM)

OLM helps manage Operators and their dependencies:

  • Install and upgrade Operators
  • Manage dependencies
  • Handle Operator versions
  • Provide Operator catalogs

6.2 Operator Bundle

Operators are packaged as bundles containing:

  • CRDs
  • RBAC manifests
  • Deployment manifests
  • Metadata (CSV - ClusterServiceVersion)

6.3 Testing Operators

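# Run unit tests (the scaffolded target uses envtest to provide a local API server)
make test

# Run integration tests (assumes you have added a test-integration target to the Makefile; it is not scaffolded by default)
make test-integration

# Run e2e tests (scaffolded only by newer project layouts; otherwise define this target yourself)
make test-e2e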

# Build bundle
make bundle

# Validate bundle
operator-sdk bundle validate ./bundle

8. Best Practices

8.1 Idempotency

Make reconciliation idempotent: running it repeatedly against the same desired state should produce the same result, without creating duplicate resources or issuing unnecessary updates.
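
One common way to get this property is an "ensure" function that creates, updates, or does nothing depending on what already exists. The sketch below is a toy model: Store, NewStore, and ensureDeployment are hypothetical names, and a map stands in for the Kubernetes API.

```go
package main

import "fmt"

// Store is a hypothetical in-memory stand-in for the Kubernetes API,
// mapping a resource name to its replica count.
type Store struct {
    objects map[string]int
    writes  int // number of create or update calls actually issued
}

// NewStore returns an empty store.
func NewStore() *Store {
    return &Store{objects: map[string]int{}}
}

// ensureDeployment creates the object if it is missing, updates it if it
// differs from the desired state, and otherwise does nothing.
func ensureDeployment(s *Store, name string, replicas int) {
    current, exists := s.objects[name]
    if exists && current == replicas {
        return // already in the desired state
    }
    s.objects[name] = replicas
    s.writes++
}

func main() {
    s := NewStore()
    // Reconciling three times with the same input writes only once.
    for i := 0; i < 3; i++ {
        ensureDeployment(s, "my-database", 3)
    }
    fmt.Println(len(s.objects), s.objects["my-database"], s.writes) // prints 1 3 1
}
```

Because the function compares before it writes, repeated or duplicate reconcile requests are harmless.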

8.2 Error Handling

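Return errors from the reconcile function rather than swallowing them, so that failed requests are retried. Use exponential backoff for transient errors so that a failing dependency is not hammered with retries.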
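
As an illustration of the backoff calculation, here is a minimal standard-library sketch. The backoff function and its parameters are hypothetical; controller-runtime's work queue already rate-limits retries of failed reconciles, so a helper like this is mainly relevant when the Operator calls external systems itself.

```go
package main

import (
    "fmt"
    "time"
)

// backoff returns the delay before retry number attempt (starting at 0).
// The delay doubles from base on each attempt and never exceeds max.
func backoff(attempt int, base, max time.Duration) time.Duration {
    d := base
    for i := 0; i < attempt && d < max; i++ {
        d *= 2
    }
    if d > max {
        return max
    }
    return d
}

func main() {
    for attempt := 0; attempt < 6; attempt++ {
        fmt.Println(attempt, backoff(attempt, 100*time.Millisecond, 2*time.Second))
    }
}
```

With a base of 100ms and a cap of 2s, the delays grow as 100ms, 200ms, 400ms, 800ms, 1.6s, and then stay at 2s.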

8.3 Status Updates

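Keep the status field current so that users and other tools can see the observed state of the resource. Write it through the status subresource (r.Status().Update in the controller above) and treat spec as owned by the user.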

8.4 Resource Ownership

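Set an owner reference on every resource the Operator creates for a CR, so that Kubernetes garbage-collects those resources when the CR is deleted. In controller-runtime, controllerutil.SetControllerReference sets this reference.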
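
The effect of owner references can be modeled in a few lines of plain Go. This is a simplified simulation, not the real garbage collector: Object and collectGarbage are hypothetical names, and a single OwnerUID field stands in for metadata.ownerReferences.

```go
package main

import "fmt"

// Object is a hypothetical, simplified resource. Real Kubernetes objects
// carry their owners in metadata.ownerReferences.
type Object struct {
    UID      string
    Name     string
    OwnerUID string // empty means the object has no owner
}

// collectGarbage removes every object whose owner no longer exists. It
// repeats until nothing changes, so chains of owners are handled too.
func collectGarbage(objects map[string]Object) {
    for {
        removed := false
        for uid, obj := range objects {
            if obj.OwnerUID == "" {
                continue
            }
            if _, ok := objects[obj.OwnerUID]; !ok {
                delete(objects, uid)
                removed = true
            }
        }
        if !removed {
            return
        }
    }
}

func main() {
    objects := map[string]Object{
        "db-1":     {UID: "db-1", Name: "my-database"},
        "deploy-1": {UID: "deploy-1", Name: "my-database", OwnerUID: "db-1"},
        "rs-1":     {UID: "rs-1", Name: "my-database-abc", OwnerUID: "deploy-1"},
        "cm-1":     {UID: "cm-1", Name: "unrelated-config"},
    }

    // Deleting the CR leaves its dependents without an owner.
    delete(objects, "db-1")
    collectGarbage(objects)

    for _, obj := range objects {
        fmt.Println("remaining:", obj.Name)
    }
}
```

Only the object without an owner survives; everything that descended from the deleted CR is removed without the Operator having to track it.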

8.5 Finalizers

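Use finalizers when cleanup must finish before the CR is removed, for example to release external resources that garbage collection cannot reach. Kubernetes keeps an object that has finalizers in a terminating state until the controller removes them.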
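
The finalizer flow can be sketched as a small state machine. This is a toy model under stated assumptions: Resource, reconcileFinalizer, and the finalizer name example.com/cleanup are hypothetical, and the Deleting flag stands in for a set metadata.deletionTimestamp.

```go
package main

import (
    "errors"
    "fmt"
)

// cleanupFinalizer is a hypothetical finalizer name.
const cleanupFinalizer = "example.com/cleanup"

var errCleanupFailed = errors.New("cleanup failed")

// Resource is a simplified stand-in for a custom resource's metadata.
type Resource struct {
    Finalizers []string
    Deleting   bool // stands in for a set metadata.deletionTimestamp
}

func hasFinalizer(r *Resource, name string) bool {
    for _, f := range r.Finalizers {
        if f == name {
            return true
        }
    }
    return false
}

func removeFinalizer(r *Resource, name string) {
    kept := make([]string, 0, len(r.Finalizers))
    for _, f := range r.Finalizers {
        if f != name {
            kept = append(kept, f)
        }
    }
    r.Finalizers = kept
}

// reconcileFinalizer returns true once the resource may be removed.
func reconcileFinalizer(r *Resource, cleanup func() error) (bool, error) {
    if !r.Deleting {
        // Normal path: make sure the finalizer is registered.
        if !hasFinalizer(r, cleanupFinalizer) {
            r.Finalizers = append(r.Finalizers, cleanupFinalizer)
        }
        return false, nil
    }
    // Deletion path: clean up external state, then release the finalizer.
    if hasFinalizer(r, cleanupFinalizer) {
        if err := cleanup(); err != nil {
            return false, err // keep the finalizer so cleanup is retried
        }
        removeFinalizer(r, cleanupFinalizer)
    }
    return len(r.Finalizers) == 0, nil
}

func main() {
    r := &Resource{}
    reconcileFinalizer(r, func() error { return nil }) // registers the finalizer

    r.Deleting = true
    gone, err := reconcileFinalizer(r, func() error { return errCleanupFailed })
    fmt.Println(gone, err) // prints false cleanup failed

    gone, err = reconcileFinalizer(r, func() error { return nil })
    fmt.Println(gone, err) // prints true <nil>
}
```

The finalizer is removed only after cleanup succeeds, so a failed cleanup leaves the resource in place for the next reconcile to retry.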

8.6 Testing

Write comprehensive tests: unit tests, integration tests, and e2e tests.

8.7 Documentation

Document your CRD schema, provide examples, and explain the Operator's behavior.

8.8 Versioning

Use versioning for your CRD and Operator. Support multiple CRD versions when needed.

Summary: Kubernetes Operators extend Kubernetes to automate complex application management. They use Custom Resources and Controllers to encode operational knowledge. Use Operator SDK to build Operators. Follow best practices for idempotency, error handling, and testing.
