This paper introduces a novel document annotation system that aims to enable the kinds of rich communication that usually occur only in face-to-face meetings. Our system, RichReview, lets users create annotations on top of digital documents using three main modalities: freeform inking, voice for narration, and deictic gestures in support of voice. RichReview uses novel visual representations and time-synchronization between modalities to simplify annotation access and navigation. Moreover, RichReview’s versatile support for multi-modal annotations enables users to mix and interweave different modalities in threaded conversations. A formative evaluation demonstrates early promise for the system, finding voice, pointing, and the combination of both to be especially valuable. In addition, initial findings point to the ways in which both content and social context affect modality choice.