DRANet: the fix for bare metal RDMA in Kubernetes
hostNetwork is the default recommendation for RDMA in Kubernetes. It breaks disaggregated inference. DRANet replaces it with DRA-based NIC assignment and fixes the problem cleanly.
hostNetwork is the default recommendation for RDMA in Kubernetes. It breaks disaggregated inference. DRANet replaces it with DRA-based NIC assignment and fixes the problem cleanly.
Running llm-d’s disaggregated prefill/decode architecture across an RTX 3060 and a Tesla T4 connected by 25GbE RDMA. What worked, what broke, and what I learned about KV cache transfer at the edge of what consumer hardware can do.