In this paper, we investigate the relative merits of GPGPUs and multicores in the context of sparse matrix-vector multiplication (SpMV). While GPGPUs possess impressive capabilities in terms of raw compute throughput and memory bandwidth, their performance varies significantly with application tuning and with the characteristics of the sparse input and its storage format. Furthermore, several emerging technological and workload trends may shift the balance in favor of multicores, especially for bandwidth-limited workloads. In light of these trends, our paper investigates the performance, power, and cost of state-of-the-art GPGPUs and multicores across a spectrum of sparse inputs. Our results reinforce some current understandings of GPGPUs and multicores while offering new insights into their relative merits in the context of future systems.