ByteDance Jakiro: Enabling RDMA and TCP over Virtual Private Cloud

Authors: Yirui Liu (ByteDance); Lidong Jiang; Deguo Li (ByteDance); Daxiang Kang; Zhaoyang Wei; Yuqi Chai; Bin Niu; Ke Lin; Xiaoning Ding; Jianwen Pi; Hao Luo

Introduction
This paper studies how to enable both RDMA (Remote Direct Memory Access) and TCP in a unified Virtual Private Cloud (VPC). The problem is important because tenants often require both high-performance RDMA and flexible TCP communication, but today’s cloud systems typically support only TCP in VPCs or require separate overlays for RDMA, which is costly and hard to manage. Existing RDMA virtualization solutions either add high latency by intercepting verbs or restrict application compatibility, thus falling short of cloud-scale needs.

Key idea and contribution:
The authors propose Jakiro, a new vNIC design that integrates RDMA (RoCEv2) and TCP into one unified VPC using VxLAN tunneling. Jakiro avoids intercepting RDMA verbs, thus maintaining application compatibility and enabling seamless co-deployment with optimization libraries. The key innovations include:

1. Bandwidth isolation via ECN-based signaling and a Distributed Hierarchical Token Bucket (DHTB) to ensure fair QoS between RDMA and TCP.

2. RC Queue Pair (QP) quota control using a flow-based table (RCFT) to limit the number of active QPs per vNIC without software interception.

3. An efficient offloading model that uses multiple synchronized match tables (NFT, RCFT, VLT) to reduce hardware overhead and synchronization latency.
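The QP quota idea in item 2 can be illustrated with a small software model. This is only a sketch under assumptions, not Jakiro's actual data path: the key layout (`FlowKey`), the idle-timeout eviction, and every name below are hypothetical, and the real RCFT is a hardware match table. The point it shows is that counting distinct RC flows seen on the wire is enough to enforce a per-vNIC limit without touching the verbs API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    """One RC connection as seen on the wire (hypothetical key layout)."""
    src_ip: str
    dst_ip: str
    dest_qpn: int  # destination QP number from the RoCEv2 Base Transport Header


class RcFlowTable:
    """Toy per-vNIC flow table that admits at most `quota` active RC flows."""

    def __init__(self, quota: int, idle_timeout: float = 30.0) -> None:
        self.quota = quota
        self.idle_timeout = idle_timeout  # assumption: idle flows free their slot
        self._last_seen: dict[FlowKey, float] = {}

    def _expire(self, now: float) -> None:
        # Remove flows that have been silent for longer than the idle timeout.
        stale = [k for k, t in self._last_seen.items()
                 if now - t > self.idle_timeout]
        for k in stale:
            del self._last_seen[k]

    def admit(self, key: FlowKey, now: float) -> bool:
        """Return True if a packet of this flow may pass, False if over quota."""
        self._expire(now)
        if key in self._last_seen or len(self._last_seen) < self.quota:
            self._last_seen[key] = now  # install the entry or refresh its timestamp
            return True
        return False  # a new flow arrived while the table is already at quota

    def active_flows(self) -> int:
        return len(self._last_seen)
```

With `quota=2`, a third distinct flow is rejected until one of the first two goes idle, while packets of already-admitted flows always pass. Because enforcement happens per flow rather than per verb call, the tenant's RDMA library needs no modification, which matches the compatibility goal described above.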

Evaluation
Jakiro is implemented on commodity RNICs (NVIDIA BlueField-3 and ConnectX-7) and evaluated with benchmarks and real workloads. Results show that overlay RoCEv2 achieves latency within 1.28μs of physical RDMA and throughput nearly identical to bare metal. Jakiro also ensures weighted fairness between TCP and RDMA, and improves HPC and ML training performance significantly compared to TCP-only VPCs. Jakiro has been deployed in ByteDance Cloud for one year, powering HPC, ML training, and LLM inference. This result is significant because it demonstrates that cloud providers can deliver both performance and flexibility without costly dual-network infrastructures.
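To make "weighted fairness between TCP and RDMA" concrete, here is a minimal sketch of one common interpretation: weighted max-min fair sharing of a link. This summary does not describe DHTB's internals, so the function below is not the paper's algorithm; it only computes the target allocation that a weight-based scheme would plausibly aim for. The function name, the class names, and the 3:1 weights are illustrative assumptions.

```python
def weighted_fair_share(capacity: float,
                        weights: dict[str, float],
                        demands: dict[str, float]) -> dict[str, float]:
    """Weighted max-min allocation of `capacity` among traffic classes.

    Each class gets bandwidth in proportion to its weight, but never more
    than it asks for; bandwidth a class leaves unused is redistributed to
    the remaining classes, again in proportion to their weights.
    """
    alloc = {k: 0.0 for k in demands}
    active = {k for k in demands if demands[k] > 0}
    remaining = capacity
    while active and remaining > 1e-9:
        total_w = sum(weights[k] for k in active)
        # Classes whose leftover demand fits inside their weighted share.
        satisfied = {k for k in active
                     if demands[k] - alloc[k] <= remaining * weights[k] / total_w}
        if not satisfied:
            # Everyone is backlogged: split what is left by weight and stop.
            for k in active:
                alloc[k] += remaining * weights[k] / total_w
            break
        for k in satisfied:
            remaining -= demands[k] - alloc[k]
            alloc[k] = demands[k]
        active -= satisfied
    return alloc


# Hypothetical example: a 100 Gbps link shared with a 3:1 RDMA:TCP weight.
weights = {"rdma": 3.0, "tcp": 1.0}
both_busy = weighted_fair_share(100.0, weights, {"rdma": 100.0, "tcp": 100.0})
tcp_light = weighted_fair_share(100.0, weights, {"rdma": 100.0, "tcp": 10.0})
```

When both classes are backlogged, the split follows the weights (75 and 25 in this example). When TCP only needs 10, RDMA can use the remaining 90, so the link stays fully used rather than being statically partitioned.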

Q: In the case where there are two tanks/turnouts, should one QP be divided into two QPs, or can they simply be merged into a single QP? Additionally, when the QP is separated into VIP and desktop VIP, how should the situation be handled if there are two turnouts?

A: No clear conclusion can be given at this stage. This situation may require further discussion. It is suggested to try it first during the testing phase and then decide later if a more formal approach is necessary.

Personal thoughts
I like that Jakiro takes a pragmatic approach: it builds on VxLAN, which is already widely used in VPCs, instead of requiring specialized hardware or protocols. This makes it deployable at scale. Another strength is its compatibility-first philosophy, which eases tenant adoption. However, I see potential challenges: (1) its RC QP quota control is still best-effort and may not fully guarantee isolation under noisy-neighbor scenarios, and (2) some hardware limitations (such as PFC storms or GPUDirect RDMA (GDR) performance degradation) suggest that deploying Jakiro could be complex.