Base-Delta-Immediate Compression: Practical Data Compression for On-Chip Caches

  • Gennady Pekhimenko | Carnegie Mellon University

Cache compression is a promising technique to increase on-chip cache capacity and to decrease on-chip and off-chip bandwidth usage. Unfortunately, directly applying well-known compression algorithms (usually implemented in software) leads to high hardware
complexity and unacceptable decompression/compression latencies, which in turn can negatively affect performance. Hence, there is a need for a simple yet efficient compression technique that
can effectively compress common in-cache data patterns, and has minimal effect on cache access latency.

In this work, we introduce a new compression algorithm called Base-Delta-Immediate (B∆I) compression, a practical technique for compressing data in on-chip caches. The key idea is that, for many cache lines, the values within the cache line have a low dynamic range – i.e., the differences between values stored within the cache line are small. As a result, a cache line can be represented using a base value and an array of differences whose combined size is much smaller than the original cache line (we call this the base+delta encoding). Moreover, many cache lines intersperse such base+delta values with small values – our B∆I technique efficiently incorporates such immediate values into its encoding.

Compared to prior cache compression approaches, our studies show that B∆I strikes a sweet-spot in the tradeoff between compression ratio, decompression/compression latencies, and hardware complexity. Our results show that B∆I compression improves performance for both single-core (8.1% improvement) and multi-core workloads (9.5% / 11.2%) improvement for two/four cores). For many applications, B∆I provides the performance benefit of doubling the cache size of the baseline system, effectively increasing average cache capacity by 1.53X.

Speaker Details

Gennady Pekhimenko is a third year PhD student in the Computer Science Department at Carnegie Mellon University, working with Professor Todd C. Mowry and Professor Onur Mutlu. Before that (2008-2010), he worked at IBM Toronto Lab (Compilers Group) under supervision of Dr. Yaoqing Gao. He received his MSc. in Computer Science in 2008 from the University of Toronto, where he was advised by Professor Angela Demke Brown; and his B.S. in Applied Mathematics and Computer Science from Moscow State University in 2004. His research interests are in computer architecture with emphasis on efficient memory hierarchy designs. Gennady’s work is funded by Microsoft Research, Qualcomm Innovation and NSERC CGS-D fellowships.

    • Portrait of Jeff Running

      Jeff Running

Series: Microsoft Research Talks