Understanding the Host Network

Authors: Midhul Vuppalapati, Saksham Agarwal (Cornell University), Henry Schuch, Baris Kasikci, Arvind Krishnamurthy (University of Washington), Rachit Agarwal (Cornell University)

Scribe: Haohao Song (Xiamen University, China)

Introduction
As data centers scale and technologies evolve, the performance of peripheral interconnects has been improving at a much faster pace than that of processor and memory interconnects. This disparity leads to a growing imbalance of resources within the host network, resulting in contention that degrades end-to-end application performance. Three factors motivate this research:
1. Performance bottlenecks. The host network is emerging as a prominent bottleneck due to unfavorable technology trends. As peripheral devices become faster, the slower interconnects between processors and memory become the limiting factor, causing contention that impacts overall system performance.
2. Contention impact. Recent studies from production data centers have shown that contention within the host network can cause significant throughput degradation, increased tail latency, and isolation violations for networked applications. These performance penalties are not well understood and require deeper investigation.
3. Implications for design and architecture. A deeper understanding of the host network's behavior under contention is essential for designing future systems, including protocols and architectures that mitigate the negative impacts of contention and ensure scalable, efficient data center operation.

Key idea and contribution:
The authors introduce a conceptual model called "domain-by-domain credit-based flow control" to study the host network. The model enables analysis of how flow control operates across the different domains (sub-networks) within the host network, each with its own credit count and latency. The study reveals that different applications may experience different degrees of performance degradation under contention, depending on the specific domains their traffic traverses.
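The idea of domain-by-domain credits can be made concrete with a small sketch. This is an illustrative toy model, not the paper's implementation: a request must acquire a credit in every domain along its path and stalls (backpressuring upstream) if any domain's credit pool is empty. The domain names, credit counts, and latencies below are hypothetical.

```python
class Domain:
    """A sub-network with a fixed credit pool and a traversal latency."""
    def __init__(self, name, credits, latency_ns):
        self.name = name
        self.credits = credits
        self.latency_ns = latency_ns

    def try_acquire(self):
        if self.credits > 0:
            self.credits -= 1
            return True
        return False  # no credit: the request stalls in the upstream domain

    def release(self):
        self.credits += 1


def send_request(path):
    """Acquire one credit per domain along the path, or stall (return None)."""
    acquired = []
    for dom in path:
        if not dom.try_acquire():
            for d in acquired:  # roll back credits taken so far
                d.release()
            return None
        acquired.append(dom)
    # Unloaded traversal latency; credits stay held until the response returns.
    return sum(d.latency_ns for d in acquired)


# Hypothetical P2M path: peripheral interconnect domain -> memory interconnect domain
pcie = Domain("PCIe", credits=2, latency_ns=300)
mem = Domain("MemIC", credits=4, latency_ns=80)
lat = send_request([pcie, mem])  # succeeds while credits remain
```

With few credits in one domain (here, the peripheral interconnect), requests stall there even if downstream domains have capacity, which is the kind of per-domain backpressure the model captures.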

The paper presents a detailed examination of the host network’s impact on two types of applications: those generating peripheral-to-memory (P2M) traffic and those generating compute-to-memory (C2M) traffic. The authors use real-world applications like Redis (an in-memory database) and FIO (a storage benchmarking tool) to simulate these scenarios. They discover that C2M applications can suffer performance degradation with minimal impact on P2M applications, even when memory bandwidth is not saturated. This finding contradicts previous studies and uncovers new contention regimes within the host network.
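To make the workload setup concrete, P2M traffic can be driven by a storage benchmark that DMAs data from an NVMe device into memory. The job file below is a hypothetical sketch of how FIO might be configured for this, not the paper's actual configuration; the device path and all parameters are assumptions.

```ini
; Hypothetical FIO job: random reads from an NVMe device via DMA,
; producing peripheral-to-memory (P2M) traffic on the host network.
[p2m-randread]
ioengine=libaio
direct=1              ; bypass the page cache so I/O actually hits the device
rw=randread
bs=4k
iodepth=32
numjobs=4
runtime=30
time_based=1
filename=/dev/nvme0n1 ; assumed device path
```

A C2M workload such as Redis would run alongside this on the same host, so its cache-miss traffic shares the memory interconnect with the DMA traffic.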

The authors further explore the architecture of the host network, detailing the paths of C2M and P2M requests and the role of various components such as the Line Fill Buffer (LFB), Caching and Home Agent (CHA), Integrated IO controller (IIO), and Memory Controller (MC). They also discuss the flow control mechanisms at play in the peripheral and memory interconnects and how they contribute to contention.

Evaluation
The paper provides a quantitative analysis of the host network's contention regimes, identifying two key regimes: the "blue regime," where C2M applications experience performance degradation while P2M applications do not, even though memory bandwidth is not saturated; and the "red regime," where both C2M and P2M applications suffer performance degradation once memory bandwidth is saturated. The authors validate these findings with an analytical formula, demonstrating that queueing delays at the memory controller and the CHA are the primary contributors to latency inflation.
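The qualitative effect of queueing at the memory controller and CHA can be illustrated with a simple model. This is my own sketch using an M/M/1 waiting-time approximation, not the paper's analytical formula; the service times and utilizations are hypothetical, and the point is only that queueing delay inflates sharply as a stage nears saturation.

```python
def mm1_wait(service_ns, utilization):
    """Mean M/M/1 queueing delay: W_q = rho / (1 - rho) * service time."""
    assert 0 <= utilization < 1
    return utilization / (1 - utilization) * service_ns


def c2m_latency(base_ns, mc_util, cha_util,
                mc_service_ns=20.0, cha_service_ns=10.0):
    """Unloaded latency plus queueing delays at the MC and CHA (toy numbers)."""
    return (base_ns
            + mm1_wait(mc_service_ns, mc_util)
            + mm1_wait(cha_service_ns, cha_util))


# Latency inflates sharply as the memory controller approaches saturation,
# even though the CHA stays lightly loaded.
low = c2m_latency(90.0, mc_util=0.3, cha_util=0.2)   # ~101 ns
high = c2m_latency(90.0, mc_util=0.9, cha_util=0.2)  # ~272 ns
```

The nonlinearity is the key point: going from 30% to 90% memory-controller utilization multiplies the queueing component many times over, which is consistent with latency inflation appearing well before bandwidth is fully saturated.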

Q1: What do you think is the fundamental difference between inter-host network flow control and intra-host flow control?
A1: For inter-host networks, two kinds of flow control mechanisms come to mind. The first is end-to-end flow control, as in TCP. The second is hop-by-hop flow control, as in PFC networks. The flow control I described in the host network is different: you can think of it as a generalization of these two. In hop-by-hop flow control, each hop is a single domain, whereas end-to-end flow control treats the entire path as one big domain. It turns out that in the host network, flow control is a bit more nuanced.

Q2: It seems that many of the problems are caused by per-priority flow control. It handles unblocking, right? What other techniques could solve this problem?
A2: We do see multiple queues. So one possible solution is to expose multiple queues for different kinds of traffic flows, which might help achieve better performance isolation. But I also think that is challenging, because hardware resources are very limited, and it might be very hard to provide per-application buffers at all of these nodes in the host network.

Q3: The paper takes a memory-centric modeling approach. Can it be generalized to anything else?
A3: I think the analysis would become much more complicated, because here we only had two types of traffic. I think that is a very interesting future direction: to see how we could extend this to those kinds of scenarios.

Q4: Do you see these trends holding for different access patterns of C2M or P2M apps? If the access patterns of these apps change, do you see the trends changing?
A4: I think we do see the trends generalize across random-access and sequential-access patterns, at least at those two extremes. The difference between a sequential and a random access pattern, from the memory's perspective, is essentially that sequential access has better row locality in DRAM. What we see with random-access workloads is that some of these problems get worse, because you have more processing overheads inside the DRAM, and you can start seeing queueing even before your bandwidth is bottlenecked.

Personal thoughts
This paper studies the host network using domain-by-domain credit-based flow control. My interest also centers on the memory-centric model used in the research. Specifically, the paper analyzes the two traffic types, C2M and P2M. However, is this research method applicable in data centers dominated by AI workloads? How would the conclusions change? I look forward to more discussion.