An exabyte a day: throughput-oriented, large scale, managed data transfers with Effingo

Title : An exabyte a day: throughput-oriented, large scale, managed data transfers with Effingo
Authors : Ladislav Pápay, Jan Pustelnik (Google); Krzysztof Rzadca (Google and University of Warsaw); Beata Strack, Paweł Stradomski, Bartłomiej Wołowiec, Michal Zasadzinski (Google)
Scribe : Huan Shen (Xiamen University)

Introduction
This experience paper from Google addresses data transfers in globally distributed systems. Existing systems and tools do not fully address the complexities of large-scale, production-grade, throughput-oriented data transfer. The paper describes Effingo, a data transfer management system that has run in production for years and is widely adopted at Google, transferring over an exabyte per day across billions of files.

Key idea and contribution
Large-scale file copying is essential for globally distributed systems: it minimizes end-user data access latency, increases service reliability, and supports disaster recovery, among other uses. Effingo is designed to deliver high-throughput transfers through an interface similar to SCP, while optimizing network cost and keeping the impact on data centers minimal. Its key ideas include copy tree optimization, dynamic adaptation to changing network conditions, and fairness between competing transfers. According to the paper, Effingo can sustain a scale of operations several times larger than previously reported.
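To make the copy tree idea concrete, here is a minimal sketch in Python. It is not Effingo's actual algorithm, which this summary does not detail. It assumes that a destination which has already received a file can serve as a source for other destinations, and that each link has a known scalar cost. Under those assumptions, a minimum-cost tree rooted at the source (built here with Prim's algorithm) gives a cheaper plan than copying from the source to every destination directly. The cluster names, the cost values, and the functions `build_copy_tree` and `tree_cost` are all hypothetical.

```python
import heapq


def build_copy_tree(source, destinations, cost):
    """Return {child: parent} edges of a minimum-cost tree rooted at `source`.

    Uses Prim's algorithm. `cost[a][b]` is an assumed, symmetric per-link
    transfer cost; a real system would derive it from network topology.
    """
    nodes = {source, *destinations}
    visited = {source}
    parent = {}
    # Candidate edges leaving the set of clusters that already hold the data.
    heap = [(cost[source][d], source, d) for d in destinations]
    heapq.heapify(heap)
    while heap and len(visited) < len(nodes):
        _, holder, target = heapq.heappop(heap)
        if target in visited:
            continue  # target already receives the data over a cheaper edge
        visited.add(target)
        parent[target] = holder
        # The new holder can now act as a source for the remaining clusters.
        for other in nodes - visited:
            heapq.heappush(heap, (cost[target][other], target, other))
    return parent


def tree_cost(parent, cost):
    """Total cost of all edges in the copy tree."""
    return sum(cost[p][c] for c, p in parent.items())


# Hypothetical example: one source in "us", two nearby destinations in Europe.
cost = {
    "us": {"eu1": 10, "eu2": 12},
    "eu1": {"us": 10, "eu2": 1},
    "eu2": {"us": 12, "eu1": 1},
}
tree = build_copy_tree("us", ["eu1", "eu2"], cost)
direct = sum(cost["us"][d] for d in ["eu1", "eu2"])
print(tree, tree_cost(tree, cost), direct)
```

In this toy setup, the tree sends the data across the expensive long-haul link once and then fans out over the cheap local link, instead of crossing the long-haul link twice.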

Evaluation
Effingo is evaluated through a series of experiments and observational studies within Google’s production infrastructure. The results demonstrate the system’s ability to manage resources efficiently, maintain fairness among competing transfers, and adjust dynamically to network conditions.

Q&A

Personal thoughts

The paper shares Google's design and deployment experience with Effingo, a throughput-oriented file copy system. Its significance lies in tackling a real-world problem and in successfully deploying a solution to that problem at scale.