The Trinity Graph Engine

Bin Shao, Haixun Wang, Yatao Li

MSR-TR-2012-30 |

Computations performed by graph algorithms are data driven,
and require a high degree of random data access. Despite
the great progresses made in disk technology, it still cannot
provide the level of efficient random access required by graph
computation. On the other hand, memory-based approaches
usually do not scale due to the capacity limit of single machines.
In this paper, we introduce Trinity, a general purpose
graph engine over a distributed memory cloud. Through optimized
memory management and network communication,
Trinity supports fast graph exploration as well as efficient
parallel computing. In particular, Trinity leverages graph
access patterns in both online and offline computation to
optimize memory and communication for best performance.
These enable Trinity to support efficient online query processing
and offline analytics on large graphs with just a few
commodity machines. Furthermore, Trinity provides a high
level specification language called TSL for users to declare
data schema and communication protocols, which brings
great ease-of-use for general purpose graph management and
computing. Our experiments show Trinity’s performance in
both low latency graph queries as well as high throughput
graph analytics on web-scale, billion-node graphs.