In my Ph.D. thesis, I propose mechanisms to address the inefficiency caused by existing memory abstractions, which manage and access memory at different granularities in different resources. In this talk, I will focus on our mechanisms to accelerate certain memory operations using DRAM technology.
In today’s systems, DRAM is used only as a storage device: off-chip DRAM interfaces allow the memory controller only to read and write data. In this line of work, we observe that this model is very inefficient for certain key primitives in modern systems, and we ask the question, “Can DRAM do more than just store data?” In response, we propose three techniques that exploit the DRAM architecture to improve the efficiency of three important operations. First, we propose RowClone, a mechanism to perform bulk copy and initialization (specifically zeroing) operations completely within DRAM. Second, we propose Gather-Scatter DRAM (GS-DRAM), a mechanism to improve the efficiency of non-unit strided access patterns, with a specific focus on power-of-2 strides. Finally, we propose Buddy, a new substrate that exploits the existing operation of DRAM to perform bulk bitwise operations completely within DRAM. Our mechanisms enable a significant improvement in the performance and energy efficiency of the respective operations. In this talk, I will describe GS-DRAM and Buddy (the two most recent works) in detail and briefly describe RowClone. I will then summarize my other works and my plans for future research.