Graph processing is data-driven, and general-purpose graph computation requires a high degree of random data access. Despite great progress in disk technology, disks still cannot provide the level of random access that graph computation requires. On the other hand, memory-based approaches usually do not scale well because of the capacity limit of a single machine. In this paper, we introduce Trinity, a general-purpose graph engine over a distributed memory cloud. Trinity supports both online query processing and offline analytics on large graphs. For online query processing, it leverages the fast graph exploration capability provided by its memory-based storage infrastructure. For offline graph analytics, it leverages the parallelism provided by the underlying scale-out distributed architecture. Furthermore, Trinity exploits graph access patterns in both online and offline computation to optimize memory usage and communication in order to deliver the best performance. We demonstrate that Trinity can perform low-latency graph queries as well as high-throughput graph analytics on web-scale, billion-node graphs using just a few commodity machines.