DevOps · 4 months · Completed

Building a Real-Time Kubernetes Operations Dashboard

How I built FlexDeck, a full-stack operations dashboard with real-time K8s monitoring, GitLab CI/CD visualization, and AI model management using Go and SolidJS.

November 1, 2025 · 3 min read

<100ms

Update Latency

real-time data refresh

Clusters

monitored simultaneously

99.9%

Uptime

over 6 months

Tech Stack

Backend: Go, WebSocket
Frontend: SolidJS, D3.js
Infrastructure: Kubernetes
Monitoring: Prometheus

Overview

Managing a homelab Kubernetes cluster with multiple workloads requires visibility into cluster state, CI/CD pipelines, and AI model deployments. I built FlexDeck to consolidate these views into a single real-time dashboard with interactive visualizations.

Figure 1. The baseline environment: GitOps-managed K3s with dedicated GPU workers for inference and media workloads. (Diagram: a GitOps-managed K3s cluster on Harvester with an ingress layer, a VM worker pool, and a dedicated GPU worker pool for inference workloads.)

The Challenge

Running a production-like homelab environment means dealing with:

Multiple clusters: K3s for applications, Harvester for infrastructure
GitLab CI/CD: Dozens of pipelines across 40+ repositories
AI workloads: GPU scheduling, model health, inference metrics
No commercial tooling budget: Datadog and similar tools are expensive for personal use

I needed a dashboard that would:

Show real-time cluster state without manual refresh
Visualize CI/CD pipelines with enough detail to debug failures
Monitor GPU utilization and AI model health
Work on both desktop and mobile for on-the-go checks

The Approach

Technology Choices

Backend: Go

Excellent Kubernetes client libraries
Low memory footprint for always-on service
Strong concurrency primitives for real-time streaming

Frontend: SolidJS

Fine-grained reactivity without Virtual DOM overhead
Excellent performance for frequent updates
Smaller bundle than React for faster mobile loads

Data streaming: WebSocket + Server-Sent Events

WebSocket for bidirectional communication (kubectl exec, logs)
SSE for unidirectional updates (cluster state, metrics)
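To make the SSE half concrete, here is a minimal sketch of a streaming handler built only on the standard library. The names `formatSSE` and `sseHandler`, the `cluster` event name, and the channel of pre-encoded JSON strings are assumptions for illustration, not FlexDeck's actual code.

```go
package main

import (
    "fmt"
    "net/http"
)

// formatSSE builds one Server-Sent Event frame: an "event:" line, a "data:"
// line, and a terminating blank line. It assumes data contains no newlines
// (for example, compact JSON).
func formatSSE(event, data string) string {
    return fmt.Sprintf("event: %s\ndata: %s\n\n", event, data)
}

// sseHandler streams every message from updates to the client until the
// client disconnects or the channel is closed. Hypothetical helper: a real
// server would give each client its own channel instead of sharing one.
func sseHandler(updates <-chan string) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        flusher, ok := w.(http.Flusher)
        if !ok {
            http.Error(w, "streaming unsupported", http.StatusInternalServerError)
            return
        }
        w.Header().Set("Content-Type", "text/event-stream")
        w.Header().Set("Cache-Control", "no-cache")

        for {
            select {
            case <-r.Context().Done(): // client went away
                return
            case msg, open := <-updates:
                if !open {
                    return
                }
                fmt.Fprint(w, formatSSE("cluster", msg))
                flusher.Flush() // push the frame now instead of buffering
            }
        }
    }
}
```

Registering it could look like `http.Handle("/events", sseHandler(updates))`, where the path is also an assumption. On the browser side, `EventSource` reconnects automatically, which is one reason SSE suits one-way state updates.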

Architecture

Diagram showing SolidJS clients connecting via WebSocket/SSE to a Go backend, which watches Kubernetes clusters and polls the GitLab API for CI/CD state. — **Figure 2.** FlexDeck treats infrastructure state as a stream: clients subscribe to a Go backend that watches clusters and CI/CD.

Implementation Details

Kubernetes Watch Streams

The Go backend uses informers to watch cluster resources efficiently:

type ClusterWatcher struct {
    clientset *kubernetes.Clientset
    informers informers.SharedInformerFactory
    updates   chan ResourceUpdate
}

func (w *ClusterWatcher) WatchPods(ctx context.Context) {
    informer := w.informers.Core().V1().Pods().Informer()

    informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        AddFunc: func(obj interface{}) {
            pod := obj.(*corev1.Pod)
            w.updates <- ResourceUpdate{
                Type:     "pod",
                Action:   "added",
                Resource: podToDTO(pod),
            }
        },
        UpdateFunc: func(_, newObj interface{}) {
            pod := newObj.(*corev1.Pod)
            w.updates <- ResourceUpdate{
                Type:     "pod",
                Action:   "updated",
                Resource: podToDTO(pod),
            }
        },
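        DeleteFunc: func(obj interface{}) {
            // A delete missed during a watch gap arrives wrapped in a tombstone,
            // so unwrap it before asserting the pod type.
            if tombstone, ok := obj.(cache.DeletedFinalStateUnknown); ok {
                obj = tombstone.Obj
            }
            pod, ok := obj.(*corev1.Pod)
            if !ok {
                return
            }
            w.updates <- ResourceUpdate{
                Type:     "pod",
                Action:   "deleted",
                Resource: podToDTO(pod),
            }
        },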
    })
}
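The handlers above push every event onto a single `updates` channel, so something has to fan those events out to each connected client. The following is a rough sketch of one way to do that, assuming a hypothetical `Hub` type and a simplified `ResourceUpdate`. It drops updates for subscribers whose buffer is full rather than blocking the informer, which is a design choice of this sketch and not necessarily what FlexDeck does.

```go
package main

import "sync"

// ResourceUpdate mirrors the shape used by the watcher; the Resource field
// type is an assumption.
type ResourceUpdate struct {
    Type     string
    Action   string
    Resource interface{}
}

// Hub fans one stream of updates out to many subscribers (hypothetical type).
type Hub struct {
    mu   sync.Mutex
    subs map[chan ResourceUpdate]struct{}
}

// NewHub returns an empty hub ready for subscribers.
func NewHub() *Hub {
    return &Hub{subs: make(map[chan ResourceUpdate]struct{})}
}

// Subscribe registers a buffered channel and returns it together with a
// cancel function that unregisters and closes it.
func (h *Hub) Subscribe(buffer int) (<-chan ResourceUpdate, func()) {
    ch := make(chan ResourceUpdate, buffer)
    h.mu.Lock()
    h.subs[ch] = struct{}{}
    h.mu.Unlock()

    cancel := func() {
        h.mu.Lock()
        defer h.mu.Unlock()
        if _, ok := h.subs[ch]; ok {
            delete(h.subs, ch)
            close(ch)
        }
    }
    return ch, cancel
}

// Publish delivers u to every subscriber without blocking and reports how
// many subscribers had a full buffer and therefore missed the update.
func (h *Hub) Publish(u ResourceUpdate) (dropped int) {
    h.mu.Lock()
    defer h.mu.Unlock()
    for ch := range h.subs {
        select {
        case ch <- u:
        default:
            dropped++
        }
    }
    return dropped
}
```

A goroutine that ranges over the watcher's `updates` channel and calls `Publish` for each item would connect the two pieces. A client that misses updates can be resynchronized from the informer's cache.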

Fine-Grained Reactivity with SolidJS

SolidJS signals update only the specific DOM elements that need to change:

function PodCard(props: { pod: Pod }) {
  // The component body runs once; only expressions that read props.pod.status re-run when it changes
  const statusColor = () => {
    switch (props.pod.status) {
      case 'Running':
        return 'text-green-400';
      case 'Pending':
        return 'text-yellow-400';
      case 'Failed':
        return 'text-red-400';
      default:
        return 'text-gray-400';
    }
  };

  return (
    <div class="pod-card">
      <span class="pod-name">{props.pod.name}</span>
      <span class={`pod-status ${statusColor()}`}>{props.pod.status}</span>
      <Show when={props.pod.restarts > 0}>
        <span class="restart-count">Restarts: {props.pod.restarts}</span>
      </Show>
    </div>
  );
}

GitLab CI/CD Visualization

For CI/CD pipelines, I built a custom visualization using D3.js with particle effects showing job flow:

Pipelines displayed as directed graphs
Jobs animate between stages as they progress
Failed jobs pulse red for visibility
Click-to-expand for job logs
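Before a pipeline can be drawn as a directed graph, the backend has to turn a flat job list into edges. The sketch below shows one simple way to do that for stage-ordered pipelines: every job in a stage links to every job in the next non-empty stage. The `Job` and `Edge` types and the `stageEdges` function are hypothetical, and pipelines that use explicit `needs:` dependencies would need those edges instead.

```go
package main

// Job is a hypothetical, trimmed-down view of a GitLab CI job.
type Job struct {
    Name   string
    Stage  string
    Status string
}

// Edge is a directed link between two job names.
type Edge struct {
    From, To string
}

// stageEdges links every job in one stage to every job in the next
// non-empty stage, following the stage order given in stages.
func stageEdges(stages []string, jobs []Job) []Edge {
    byStage := make(map[string][]string)
    for _, j := range jobs {
        byStage[j.Stage] = append(byStage[j.Stage], j.Name)
    }

    var edges []Edge
    var prev []string // jobs in the most recent non-empty stage
    for _, s := range stages {
        cur := byStage[s]
        if len(cur) == 0 {
            continue // skip stages with no jobs in this pipeline
        }
        for _, from := range prev {
            for _, to := range cur {
                edges = append(edges, Edge{From: from, To: to})
            }
        }
        prev = cur
    }
    return edges
}
```

The resulting edge list, serialized as JSON, is the kind of input a D3.js graph layout can consume directly on the frontend.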

GPU Monitoring

The dashboard integrates with DCGM (NVIDIA) and ROCm (AMD) metrics for GPU health:

type GPUMetrics struct {
    DeviceID    string  `json:"device_id"`
    Utilization float64 `json:"utilization"`
    MemoryUsed  int64   `json:"memory_used"`
    MemoryTotal int64   `json:"memory_total"`
    Temperature int     `json:"temperature"`
    PowerDraw   float64 `json:"power_draw"`
    ActiveModel string  `json:"active_model,omitempty"`
}
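As a small illustration of how this struct behaves on the wire, the sketch below repeats the type, adds a derived memory percentage, and encodes a sample to JSON. The `MemoryPercent` and `encode` helpers are hypothetical, and the sketch only assumes that `MemoryUsed` and `MemoryTotal` share the same unit.

```go
package main

import "encoding/json"

// GPUMetrics repeats the struct from the article so the sketch compiles on
// its own.
type GPUMetrics struct {
    DeviceID    string  `json:"device_id"`
    Utilization float64 `json:"utilization"`
    MemoryUsed  int64   `json:"memory_used"`
    MemoryTotal int64   `json:"memory_total"`
    Temperature int     `json:"temperature"`
    PowerDraw   float64 `json:"power_draw"`
    ActiveModel string  `json:"active_model,omitempty"`
}

// MemoryPercent returns used memory as a percentage of the total, or 0 when
// the total is unknown (hypothetical helper).
func (m GPUMetrics) MemoryPercent() float64 {
    if m.MemoryTotal <= 0 {
        return 0
    }
    return float64(m.MemoryUsed) / float64(m.MemoryTotal) * 100
}

// encode returns the JSON payload for one sample. Because of omitempty, the
// active_model key is left out when no model is loaded.
func encode(m GPUMetrics) (string, error) {
    b, err := json.Marshal(m)
    return string(b), err
}
```

Leaving `active_model` out for idle GPUs lets the frontend distinguish "no model loaded" from an empty name with a simple key check.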

Results

Performance Metrics

Metric	Target	Achieved
Initial load time	<2s	1.2s
Update latency	<200ms	<100ms
Memory usage (backend)	<100MB	65MB
Bundle size (frontend)	<500KB	380KB

Operational Impact

Before FlexDeck:

Switching between kubectl, Grafana, and GitLab UI constantly
Missing pipeline failures until builds broke
No mobile access to cluster state

After FlexDeck:

Single pane of glass for all operations
Real-time notifications for failures
Check cluster health from phone during incidents

Visualizations Built

Cluster topology: Interactive node and pod layout
Pipeline DAG: Directed graph with job status animations
Resource utilization: Real-time charts for CPU/memory/GPU
Namespace overview: Grid view with health indicators
Model registry: AI model versions and deployment status

Lessons Learned

SolidJS for Real-Time UIs

The choice of SolidJS over React paid off significantly:

Synchronous updates: Signal writes propagate to the DOM right away by default, not on a later scheduled render
No re-render cascades: Parent updates don't re-render children
Smaller runtime: roughly 7KB vs 40KB+ for React with ReactDOM

The mental model is different (signals vs state), but for dashboards with many independent updating components, it's worth learning.

Kubernetes Informers vs Polling

Initially I polled the API server every 5 seconds. Switching to informers:

Reduced API server load by 95%
Eliminated "stale data" UX issues
Enabled instant status updates

WebSocket Reconnection

Real-time connections fail. Robust reconnection is essential:

func (c *Client) maintainConnection(ctx context.Context) {
    backoff := time.Second
    maxBackoff := time.Minute

    for {
        select {
        case <-ctx.Done():
            return
        default:
        }

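        if err := c.connect(); err != nil {
            log.Printf("Connection failed: %v, retrying in %v", err, backoff)
            // Wait out the backoff, but stop immediately if ctx is cancelled
            select {
            case <-ctx.Done():
                return
            case <-time.After(backoff):
            }
            backoff = min(backoff*2, maxBackoff)
            continue
        }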

        backoff = time.Second // Reset on success
        c.handleMessages(ctx)
    }
}

Future Improvements

Multi-tenant support: Share dashboard with team members
Alert integration: PagerDuty/Slack notifications from dashboard
Cost tracking: Integrate with cloud billing APIs
Capacity planning: Predictive scaling recommendations

Conclusion

Building a custom operations dashboard was more work than using off-the-shelf tools, but the result is exactly what I need: fast, focused, and free. The combination of Go's efficiency and SolidJS's reactivity creates a dashboard that feels instant, even when monitoring multiple clusters simultaneously.

For homelabbers and small teams, this approach provides commercial-grade visibility without commercial-grade costs.
