Bandwidth and Latency Analysis of MPI Collectives

March 1, 2025

The MPI interface is arguably one of the most successful abstractions in computer science: most communication patterns in HPC and AI can be fully described with it. In this article, I introduce common algorithms for implementing MPI collectives and briefly compare their bandwidth and latency.

Suppose each node has a bandwidth of $B$ in each direction (i.e., it can send and receive at rate $B$ simultaneously), and the interconnect latency is $l$.
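To make these two parameters concrete, here is a minimal sketch of how they are commonly combined into a per-message cost. It assumes the usual linear model, in which sending n bytes takes the latency l plus n divided by the bandwidth B; this formula, the function name `transfer_time`, and the example numbers are illustrative assumptions, not something stated in this article.

```python
def transfer_time(n_bytes: float, bandwidth: float, latency: float) -> float:
    """Estimate the time in seconds for one point-to-point message.

    Assumed model (not taken from the article): time = latency + n_bytes / bandwidth,
    with bandwidth in bytes per second and latency in seconds.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if n_bytes < 0:
        raise ValueError("n_bytes must be non-negative")
    return latency + n_bytes / bandwidth


# Hypothetical numbers for illustration only:
# a 1 MB message over a 10 GB/s link with 2 microseconds of latency.
t = transfer_time(1e6, 10e9, 2e-6)
print(f"{t * 1e6:.1f} us")
```

With these made-up numbers the bandwidth term (100 microseconds) dominates the latency term (2 microseconds); for very small messages the balance flips, which is why both quantities matter when comparing collective algorithms.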

References