GIE 1.4: the framework release (and what it means for llm-d)

Gateway API Inference Extension v1.4 landed with 101 commits from 54 contributors. The headline isn’t a single feature; it’s that GIE became a real plugin framework. Here’s what changed and why it matters if you’re building on top of it.

March 21, 2026 · 7 min · Sam Batschelet

Disaggregated Prefill/Decode on Consumer GPUs

Running llm-d’s disaggregated prefill/decode architecture across an RTX 3060 and a Tesla T4 connected by 25GbE RDMA. What worked, what broke, and what I learned about KV cache transfer at the edge of what consumer hardware can do.

March 14, 2026 · 10 min · Sam Batschelet