Writing

Writing on healthcare interoperability, API ecosystems, and operational execution, plus occasional lab notes. For deeper implementation dives, see Case Studies.

Showing all 10 posts

Finding the Real Context Ceiling: Needle-Benchmarking Forced RoPE Extrapolation

June 25, 2026 · 5 min read

Lab

A served model can load at 96K context and still be useless past 64K. Loading is not the same as staying coherent. Here is how we mapped the exact cliff with a progressive needle-in-haystack bench — and why the limit was the model, not the GPU.

flexinfer, vllm, long-context, rope, +2 more

The First 90 Days: Introducing AI-Assisted Dev to a New Team

April 20, 2026 · 9 min read

Professional

How I would roll out AI-assisted development on a team that has not standardized: what to do in week one, what to earn the right to argue about later, and what almost always goes wrong.

agents, ai-assisted-dev, adoption, team-practice, +1 more

A One-Page AI Usage Policy That Actually Works

April 20, 2026 · 8 min read

Professional

A short, adoptable AI usage policy for engineering teams: what to put on the page, what to leave off, and why the policy matters less than the habits it makes explicit.

ai-assisted-dev, policy, team-practice, governance, +1 more

Getting Gemma 4 Running on a Radeon 7900 XTX (with and without TurboQuant)

April 4, 2026 · 8 min read

Lab

What it took to get Gemma 4 E4B serving cleanly on Radeon through FlexInfer: a stable TRITON lane on a 7900 XTX, an experimental TurboQuant long-context lane on a second node, and the GPTQ pipeline work still underway.

gemma4, amd, radeon, 7900xtx, +6 more

Build Your Own Legs Before the Crutches Fail

March 9, 2026 · 14 min read

Professional

AI-assisted development is useful leverage, but only if you convert borrowed competence into real judgment before the support becomes a dependency.

ai-assisted-dev, engineering, agents, developer-workflows

Standing Up a GPU-Ready Private AI Platform (Harvester + K3s + Flux + GitLab)

December 29, 2025 · 6 min read

Professional

Field notes from building and operating a small private GPU platform with Harvester, K3s, and a GitLab -> Flux delivery loop.

case-study, platform-engineering, kubernetes, k3s, +10 more

Optimizing Real-Time Kubernetes Visualizations: From 25ms to 12ms Per Frame

December 25, 2025 · 7 min read

Lab

A deep dive into optimizing Canvas 2D and Three.js visualizations for Kubernetes dashboards, covering algorithmic complexity, memory management, and GPU-efficient rendering patterns.

performance, three.js, canvas, d3, +3 more
