Systematic CXL Memory Characterization and Performance Analysis at Scale
- Jinshu Liu ,
- Hamid Hadian ,
- Yuyue Wang ,
- Daniel S. Berger ,
- Marie Nguyen ,
- Xun Jian ,
- Sam H. Noh ,
- Huaicheng Li
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 |
Compute Express Link (CXL) has emerged as a pivotal interconnect for memory expansion. Despite its potential, the performance implications of CXL across devices, latency regimes, processors, and workloads remain underexplored. We present Melody, a framework for systematic characterization and analysis of CXL memory performance. Melody builds on an extensive evaluation spanning 265 workloads, 4 real CXL devices, 7 latency levels, and 5 CPU platforms. Melody yields many insights: workload sensitivity to sub-μs CXL latencies (140-410ns), the first disclosure of CXL tail latencies, CPU tolerance to CXL latencies, a novel approach (SPA) for pinpointing CXL bottlenecks, and CPU prefetcher inefficiencies under CXL.