DevOps · 4 months · Completed

Building a Real-Time Kubernetes Operations Dashboard

How I built FlexDeck, a full-stack operations dashboard with real-time K8s monitoring, GitLab CI/CD visualization, and AI model management using Go and SolidJS.

November 1, 2025·3 min read
Update latency: <100ms (real-time data refresh)
Clusters: 3 (monitored simultaneously)
Uptime: 99.9% (over 6 months)

Tech Stack

Backend: Go, WebSocket
Frontend: SolidJS, D3.js
Infrastructure: Kubernetes
Monitoring: Prometheus

Overview

Managing a homelab Kubernetes cluster with multiple workloads requires visibility into cluster state, CI/CD pipelines, and AI model deployments. I built FlexDeck to consolidate these views into a single real-time dashboard with interactive visualizations.

Diagram showing a GitOps-managed K3s cluster on Harvester with an ingress layer, a VM worker pool, and a dedicated GPU worker pool for inference workloads.
Figure 1. The baseline environment: GitOps-managed K3s with dedicated GPU workers for inference and media workloads.

The Challenge

Running a production-like homelab environment means dealing with:

  • Multiple clusters: K3s for applications, Harvester for infrastructure
  • GitLab CI/CD: Dozens of pipelines across 40+ repositories
  • AI workloads: GPU scheduling, model health, inference metrics
  • No commercial tooling budget: Datadog and similar tools are expensive for personal use

I needed a dashboard that would:

  1. Show real-time cluster state without manual refresh
  2. Visualize CI/CD pipelines with enough detail to debug failures
  3. Monitor GPU utilization and AI model health
  4. Work on both desktop and mobile for on-the-go checks

The Approach

Technology Choices

Backend: Go

  • Excellent Kubernetes client libraries
  • Low memory footprint for always-on service
  • Strong concurrency primitives for real-time streaming

Frontend: SolidJS

  • Fine-grained reactivity without Virtual DOM overhead
  • Excellent performance for frequent updates
  • Smaller bundle than React for faster mobile loads

Data streaming: WebSocket + Server-Sent Events

  • WebSocket for bidirectional communication (kubectl exec, logs)
  • SSE for unidirectional updates (cluster state, metrics)
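
To make the SSE half of that split concrete, here is a minimal sketch using only the Go standard library. It is not FlexDeck's actual handler: the ClusterEvent type, its fields, and the formatSSE and streamHandler names are hypothetical. It shows the text/event-stream framing and a handler that stops when the client disconnects.

```go
package main

import (
    "encoding/json"
    "fmt"
    "net/http"
)

// ClusterEvent is a hypothetical payload; FlexDeck's real schema is not shown.
type ClusterEvent struct {
    Type   string `json:"type"`
    Action string `json:"action"`
    Name   string `json:"name"`
}

// formatSSE renders one event in the text/event-stream wire format:
// an "event:" line, a "data:" line, and a terminating blank line.
func formatSSE(ev ClusterEvent) (string, error) {
    data, err := json.Marshal(ev)
    if err != nil {
        return "", err
    }
    return fmt.Sprintf("event: %s\ndata: %s\n\n", ev.Type, data), nil
}

// streamHandler pushes events from ch to one client until it disconnects.
func streamHandler(ch <-chan ClusterEvent) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        flusher, ok := w.(http.Flusher)
        if !ok {
            http.Error(w, "streaming unsupported", http.StatusInternalServerError)
            return
        }
        w.Header().Set("Content-Type", "text/event-stream")
        w.Header().Set("Cache-Control", "no-cache")

        for {
            select {
            case <-r.Context().Done():
                return // client went away
            case ev, open := <-ch:
                if !open {
                    return // producer closed the stream
                }
                msg, err := formatSSE(ev)
                if err != nil {
                    continue // skip events that cannot be encoded
                }
                fmt.Fprint(w, msg)
                flusher.Flush() // send the frame now instead of buffering
            }
        }
    }
}

func main() {
    msg, _ := formatSSE(ClusterEvent{Type: "pod", Action: "updated", Name: "api-0"})
    fmt.Print(msg)
}
```

With more than one client, each connection would need its own channel, since a single shared channel hands each event to only one reader.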

Architecture

Diagram showing SolidJS clients connecting via WebSocket/SSE to a Go backend, which watches Kubernetes clusters and polls the GitLab API for CI/CD state.
Figure 2. FlexDeck treats infrastructure state as a stream: clients subscribe to a Go backend that watches clusters and CI/CD.

Implementation Details

Kubernetes Watch Streams

The Go backend uses informers to watch cluster resources efficiently:

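import (
    "context"

    corev1 "k8s.io/api/core/v1"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/cache"
)

// ResourceUpdate and podToDTO are defined elsewhere in the backend.
type ClusterWatcher struct {
    clientset *kubernetes.Clientset
    informers informers.SharedInformerFactory
    updates   chan ResourceUpdate
}

// WatchPods registers pod event handlers and starts the informer factory.
// Handlers stop forwarding once ctx is cancelled.
func (w *ClusterWatcher) WatchPods(ctx context.Context) {
    informer := w.informers.Core().V1().Pods().Informer()

    // send forwards one update, giving up on shutdown so a stalled
    // consumer cannot block the informer's handler goroutine forever.
    send := func(action string, pod *corev1.Pod) {
        select {
        case w.updates <- ResourceUpdate{Type: "pod", Action: action, Resource: podToDTO(pod)}:
        case <-ctx.Done():
        }
    }

    informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        AddFunc: func(obj interface{}) {
            if pod, ok := obj.(*corev1.Pod); ok {
                send("added", pod)
            }
        },
        UpdateFunc: func(_, newObj interface{}) {
            if pod, ok := newObj.(*corev1.Pod); ok {
                send("updated", pod)
            }
        },
        DeleteFunc: func(obj interface{}) {
            // A delete missed by the watch arrives wrapped in a tombstone.
            if tombstone, ok := obj.(cache.DeletedFinalStateUnknown); ok {
                obj = tombstone.Obj
            }
            if pod, ok := obj.(*corev1.Pod); ok {
                send("deleted", pod)
            }
        },
    })

    // Start is non-blocking and skips informers that are already running.
    w.informers.Start(ctx.Done())
}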
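
The updates channel above has a single reader, so something has to fan those events out to every connected browser. The following is a minimal sketch of one way to do that, not FlexDeck's actual code: Hub, Update, and the method names are hypothetical, and Update stands in for ResourceUpdate. Each subscriber gets a buffered channel, and Publish never blocks on a slow client.

```go
package main

import (
    "fmt"
    "sync"
)

// Update is a hypothetical stand-in for the article's ResourceUpdate.
type Update struct {
    Type, Action, Name string
}

// Hub fans one stream of updates out to many subscribers.
type Hub struct {
    mu   sync.Mutex
    subs map[chan Update]struct{}
}

func NewHub() *Hub {
    return &Hub{subs: make(map[chan Update]struct{})}
}

// Subscribe registers a buffered channel for one client.
func (h *Hub) Subscribe(buffer int) chan Update {
    ch := make(chan Update, buffer)
    h.mu.Lock()
    h.subs[ch] = struct{}{}
    h.mu.Unlock()
    return ch
}

// Unsubscribe removes and closes a client's channel.
func (h *Hub) Unsubscribe(ch chan Update) {
    h.mu.Lock()
    if _, ok := h.subs[ch]; ok {
        delete(h.subs, ch)
        close(ch)
    }
    h.mu.Unlock()
}

// Publish delivers u to every subscriber without blocking. It returns the
// number of subscribers whose buffer was full and so missed the update.
func (h *Hub) Publish(u Update) (dropped int) {
    h.mu.Lock()
    defer h.mu.Unlock()
    for ch := range h.subs {
        select {
        case ch <- u:
        default:
            dropped++
        }
    }
    return dropped
}

func main() {
    hub := NewHub()
    fast := hub.Subscribe(2)
    slow := hub.Subscribe(1)
    hub.Publish(Update{"pod", "added", "api-0"})
    dropped := hub.Publish(Update{"pod", "updated", "api-0"})
    fmt.Println(len(fast), len(slow), dropped) // prints "2 1 1"
}
```

Dropping updates for a full buffer trades completeness for liveness. A dashboard can usually recover by requesting a fresh snapshot, but that policy is a design choice rather than something the article specifies.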

Fine-Grained Reactivity with SolidJS

SolidJS signals update only the specific DOM elements that need to change:

import { Show } from 'solid-js';

function PodCard(props: { pod: Pod }) {
  // The component body runs once; only the bindings that read
  // props.pod.status update when this pod's status changes.
  const statusColor = () => {
    switch (props.pod.status) {
      case 'Running':
        return 'text-green-400';
      case 'Pending':
        return 'text-yellow-400';
      case 'Failed':
        return 'text-red-400';
      default:
        return 'text-gray-400';
    }
  };

  return (
    <div class="pod-card">
      <span class="pod-name">{props.pod.name}</span>
      <span class={`pod-status ${statusColor()}`}>{props.pod.status}</span>
      <Show when={props.pod.restarts > 0}>
        <span class="restart-count">Restarts: {props.pod.restarts}</span>
      </Show>
    </div>
  );
}

GitLab CI/CD Visualization

For CI/CD pipelines, I built a custom visualization using D3.js with particle effects showing job flow:

  • Pipelines displayed as directed graphs
  • Jobs animate between stages as they progress
  • Failed jobs pulse red for visibility
  • Click-to-expand for job logs
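
Before D3.js can draw a pipeline as a directed graph, the backend has to turn a flat job list into edges. Below is a minimal sketch of that step, assuming purely stage-ordered pipelines where every job waits for the whole previous stage. The Job and Edge types and the BuildEdges function are hypothetical, and explicit job-to-job dependencies are not handled.

```go
package main

import "fmt"

// Job is a hypothetical, trimmed-down view of a CI job.
type Job struct {
    Name   string
    Stage  string
    Status string
}

// Edge points from a job to a job that runs after it.
type Edge struct {
    From, To string
}

// BuildEdges links every job in one stage to every job in the next
// non-empty stage, following the order given in stages.
func BuildEdges(stages []string, jobs []Job) []Edge {
    byStage := make(map[string][]string)
    for _, j := range jobs {
        byStage[j.Stage] = append(byStage[j.Stage], j.Name)
    }

    var edges []Edge
    var prev []string
    for _, s := range stages {
        cur := byStage[s]
        if len(cur) == 0 {
            continue // skip stages that have no jobs
        }
        for _, from := range prev {
            for _, to := range cur {
                edges = append(edges, Edge{From: from, To: to})
            }
        }
        prev = cur
    }
    return edges
}

func main() {
    stages := []string{"build", "test", "deploy"}
    jobs := []Job{
        {"compile", "build", "success"},
        {"unit", "test", "failed"},
        {"lint", "test", "success"},
        {"release", "deploy", "skipped"},
    }
    for _, e := range BuildEdges(stages, jobs) {
        fmt.Printf("%s -> %s\n", e.From, e.To)
    }
}
```

The frontend can then treat jobs as nodes and these edges as links, using each job's Status to drive colors and animations.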

GPU Monitoring

The dashboard integrates with DCGM (NVIDIA) and ROCm metrics for GPU health:

type GPUMetrics struct {
    DeviceID    string  `json:"device_id"`
    Utilization float64 `json:"utilization"`
    MemoryUsed  int64   `json:"memory_used"`
    MemoryTotal int64   `json:"memory_total"`
    Temperature int     `json:"temperature"`
    PowerDraw   float64 `json:"power_draw"`
    ActiveModel string  `json:"active_model,omitempty"`
}

Results

Performance Metrics

Metric                   Target    Achieved
Initial load time        <2s       1.2s
Update latency           <200ms    <100ms
Memory usage (backend)   <100MB    65MB
Bundle size (frontend)   <500KB    380KB

Operational Impact

Before FlexDeck:

  • Switching between kubectl, Grafana, and GitLab UI constantly
  • Missing pipeline failures until builds broke
  • No mobile access to cluster state

After FlexDeck:

  • Single pane of glass for all operations
  • Real-time notifications for failures
  • Check cluster health from phone during incidents

Visualizations Built

  1. Cluster topology: Interactive node and pod layout
  2. Pipeline DAG: Directed graph with job status animations
  3. Resource utilization: Real-time charts for CPU/memory/GPU
  4. Namespace overview: Grid view with health indicators
  5. Model registry: AI model versions and deployment status

Lessons Learned

SolidJS for Real-Time UIs

The choice of SolidJS over React paid off significantly:

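  • Synchronous updates: A signal write reaches the affected DOM nodes right away instead of waiting for a scheduled render pass
  • No re-render cascades: Components run once, so a parent update doesn't re-execute its children
  • Smaller runtime: Roughly 7KB vs 40KB+ for React with ReactDOM (minified and gzipped)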

The mental model is different (signals vs state), but for dashboards with many independent updating components, it's worth learning.

Kubernetes Informers vs Polling

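Initially, I polled the API server every 5 seconds. Informers instead list each resource once, then hold a watch open and serve reads from a local cache. Switching to informers: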

  • Reduced API server load by 95%
  • Eliminated "stale data" UX issues
  • Enabled instant status updates

WebSocket Reconnection

Real-time connections fail. Robust reconnection is essential:

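func (c *Client) maintainConnection(ctx context.Context) {
    const (
        initialBackoff = time.Second
        maxBackoff     = time.Minute
    )
    backoff := initialBackoff

    for {
        // Stop reconnecting once the caller cancels the context.
        if ctx.Err() != nil {
            return
        }

        if err := c.connect(); err != nil {
            log.Printf("Connection failed: %v, retrying in %v", err, backoff)
            // Wait out the backoff, but wake immediately on cancellation.
            select {
            case <-ctx.Done():
                return
            case <-time.After(backoff):
            }
            backoff = min(backoff*2, maxBackoff) // built-in min needs Go 1.21+
            continue
        }

        backoff = initialBackoff // Reset on success
        c.handleMessages(ctx)
    }
}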

Future Improvements

  • Multi-tenant support: Share dashboard with team members
  • Alert integration: PagerDuty/Slack notifications from dashboard
  • Cost tracking: Integrate with cloud billing APIs
  • Capacity planning: Predictive scaling recommendations

Conclusion

Building a custom operations dashboard was more work than using off-the-shelf tools, but the result is exactly what I need: fast, focused, and free. The combination of Go's efficiency and SolidJS's reactivity creates a dashboard that feels instant, even monitoring multiple clusters simultaneously.

For homelabbers and small teams, this approach provides commercial-grade visibility without commercial-grade costs.
