Foreword.. . . . . . . . . . . . . . . . xv
Preface.. . . . . . . . . . . . . . . . . xvii
Acknowledgments.. . . . . . . . . . . . . . xix
About the Authors.. . . . . . . . . . . . . . xxi
1 Wonders in the Workload. . . . . . . . . . . . 1
What’s New in AI Data Center Workloads.. . . . . . . . 1
The Life Cycle of an AI Model.. . . . . . . . . . . 2
Training an AI Model. . . . . . . . . . . . 3
Parallelism. . . . . . . . . . . . . . 4
Job Completion Time (JCT). . . . . . . . . . . 6
Tail Latency.. . . . . . . . . . . . . . 7
Summary. . . . . . . . . . . . . . 16
Test Your Knowledge. . . . . . . . . . . . 17
2 “The Common-Man View” of AI Data Center Fabrics.. . . . . 19
Training vs. Inference AI Data Centers. . . . . . . . . 19
InfiniBand vs. Ethernet for AI Training Data Centers.. . . . . . 21
Ethernet Hardware Switches and Advanced Software Features.. . . . 22
Handling Elephant Flows.. . . . . . . . . . . 24
Load-Balancing Techniques. . . . . . . . . . . 25
Congestion Management and Mitigation Techniques.. . . . . . 26
Summary. . . . . . . . . . . . . . 28
Test Your Knowledge. . . . . . . . . . . . 29
3 Network Design Considerations. . . . . . . . . . 31
Background Introduction.. . . . . . . . . . . 31
Training Data Center Architecture. . . . . . . . . . 33
Rail-Optimized Design (ROD).. . . . . . . . . . 34
Rail-Unified Design (RUD).. . . . . . . . . . . 42
Rack Design. . . . . . . . . . . . . . 45
Scheduled Fabric. . . . . . . . . . . . . 49
Topologies. . . . . . . . . . . . . . 50
Inference Data Center Architecture. . . . . . . . . 56
Multi-Planar Scale-Out Architectures.. . . . . . . . . 56
Summary. . . . . . . . . . . . . . 63
Test Your Knowledge. . . . . . . . . . . . 64
References. . . . . . . . . . . . . . 66
4 Optics and Cable Management.. . . . . . . . . . 67
Scaling Optics for AI Clusters.. . . . . . . . . . 67
Challenges in Optical Innovation.. . . . . . . . . . 70
Packet Flow. . . . . . . . . . . . . . 70
Transmission Modes.. . . . . . . . . . . . 73
Transceiver Types.. . . . . . . . . . . . . 76
Cable and Connector Types. . . . . . . . . . . 78
Standards.. . . . . . . . . . . . . . 79
Further Innovations in Optics.. . . . . . . . . . 82
Summary. . . . . . . . . . . . . . 83
Test Your Knowledge. . . . . . . . . . . . 85
References. . . . . . . . . . . . . . 86
5 Thermal and Power Efficiency Considerations. . . . . . . 87
Thermal Footprints in AI Data Centers.. . . . . . . . . 87
Airflow Options. . . . . . . . . . . . . 88
Liquid Cooling. . . . . . . . . . . . . 89
Summary. . . . . . . . . . . . . . 93
Test Your Knowledge. . . . . . . . . . . . 94
References. . . . . . . . . . . . . . 95
6 Efficient Load Balancing. . . . . . . . . . . . 97
Per-Flow Load Balancing. . . . . . . . . . . 99
Per-Packet Load Balancing.. . . . . . . . . . . 115
Load-Balancing Mechanism Comparison.. . . . . . . . 117
Summary. . . . . . . . . . . . . . 118
Test Your Knowledge. . . . . . . . . . . . 119
7 RoCEv2 Transport and Congestion Management.. . . . . . 123
Congestion Points. . . . . . . . . . . . 123
Explicit Congestion Notification (ECN).. . . . . . . . 127
Data Center Quantized Congestion Notification (DCQCN).. . . . . 134
Source Flow Control (SFC). . . . . . . . . . . 136
Congestion Signaling.. . . . . . . . . . . . 137
Summary. . . . . . . . . . . . . . 139
Test Your Knowledge. . . . . . . . . . . . 140
8 IP Routing for AI/ML Fabrics.. . . . . . . . . . 143
Dynamic IP Routing Options. . . . . . . . . . 144
eBGP Underlay for Three-Stage/Five-Stage Fabric for an AI Data Center.. . 145
Multi-tenancy for an AI/ML Cluster Data Center Network. . . . . 171
Microsegmentation and Multi-tenancy for an AI/ML Data Center.. . . 177
Extending IP Routing to the Server. . . . . . . . . 177
Traffic Engineering in the AI Data Center Fabric.. . . . . . . 178
Segment Routing and SRv6 for AI/ML Fabrics. . . . . . . 179
Summary. . . . . . . . . . . . . . 184
Test Your Knowledge. . . . . . . . . . . . 185
References. . . . . . . . . . . . . . 187
9 Storage Network Design and Technologies.. . . . . . . 189
The AI Data Center Life Cycle and Storage Networks.. . . . . . 191
Storage Network Design Types. . . . . . . . . . 193
Block, Object, and File Storage Systems.. . . . . . . . 198
NVMe-oF for Block-Level Access.. . . . . . . . . . 199
NVMe-o-RDMA/RoCEv2 State Machine. . . . . . . . 206
High-Performance File Systems. . . . . . . . . . 208
GPUDirect Storage.. . . . . . . . . . . . 211
Summary. . . . . . . . . . . . . . 217
Test Your Knowledge. . . . . . . . . . . . 218
References. . . . . . . . . . . . . . 219
10 AI Network Performance KPIs. . . . . . . . . . 221
Significance of Performance Benchmarking. . . . . . . 221
MLCommons for AI Data Centers.. . . . . . . . . 223
MLCommons Initiatives. . . . . . . . . . . 224
MLCommons Benchmarking Suites.. . . . . . . . . 224
Benchmarking a Data Center for Machine Learning. . . . . . 225
Summary. . . . . . . . . . . . . . 226
Test Your Knowledge. . . . . . . . . . . . 227
References. . . . . . . . . . . . . . 228
11 Monitoring and Telemetry.. . . . . . . . . . . 229
Exploring Monitoring Options.. . . . . . . . . . 229
Network Monitoring in an AI/ML Data Center Network.. . . . . 231
In-Band Flow Analyzer (IFA). . . . . . . . . . . 234
Corrective Actions. . . . . . . . . . . . 237
Summary. . . . . . . . . . . . . . 238
Reference.. . . . . . . . . . . . . . 238
12 Ultra Ethernet Consortium (UEC). . . . . . . . . 239
UEC Developments and Working Groups.. . . . . . . . 241
UEC Key Terminology.. . . . . . . . . . . . 244
The UEC and Network Architectures. . . . . . . . . 246
A New Protocol Stack.. . . . . . . . . . . . 247
Data Plane: Packet Forwarding Options.. . . . . . . . 252
Packet Delivery Modes.. . . . . . . . . . . 257
Congestion Management (CM) in the UEC Specification.. . . . . 261
Packet Trimming and Fast Retransmissions. . . . . . . . 264
Link Layer Reliability (LLR) Mechanism.. . . . . . . . 265
In-Network Collectives (INC) and xCCL.. . . . . . . . 266
Management and Orchestration. . . . . . . . . . 268
Interoperability and Backward Compatibility.. . . . . . . 269
Compliance and Certification.. . . . . . . . . . 269
UEC Challenges and Future Directions.. . . . . . . . 269
Comparing UEC to InfiniBand and RoCEv2. . . . . . . . 270
Summary. . . . . . . . . . . . . . 271
Test Your Knowledge. . . . . . . . . . . . 272
References. . . . . . . . . . . . . . 273
13 Scale-Up Systems.. . . . . . . . . . . . . 275
Key Building Blocks of Scale-Up Systems.. . . . . . . . 278
Scale-Up Ethernet Transport (SUE-T). . . . . . . . . 281
Ultra Accelerator Link (UALink).. . . . . . . . . . 286
Memory Coherence in Scale-Up Systems.. . . . . . . . 291
Scale-Up Systems: Key Differences and Similarities.. . . . . . 292
Summary. . . . . . . . . . . . . . 294
Test Your Knowledge. . . . . . . . . . . . 295
References. . . . . . . . . . . . . . 297
14 Conclusion.. . . . . . . . . . . . . . 299
DC Network Role for AI.. . . . . . . . . . . 299
Caveats and Challenges.. . . . . . . . . . . 300
Future Developments.. . . . . . . . . . . . 302
Final Remarks.. . . . . . . . . . . . . 304
References. . . . . . . . . . . . . . 305
Appendix A Questions and Answers.. . . . . . . . . . 307
Appendix B Acronyms.. . . . . . . . . . . . . 329
9780135436288, TOC, 1/8/2026