Rich text tasks are increasingly common on mobile devices, requiring the user to interleave typing and selection to produce the text and formatting she desires. However, mobile devices offer a rich input space in which input need not be limited to a keyboard and touch. In this paper, we present two complementary studies evaluating four input modalities for performing selection tasks in support of text entry on a mobile device. The modalities studied were screen touch (Touch), device tilt (Tilt), voice recognition (Speech), and foot tap (Foot). The results show that Tilt is the fastest method for making a selection, but that Touch allows the highest overall text throughput. The Tilt and Foot methods, although fast, resulted in users making and subsequently correcting a high number of text entry errors, whereas the number of errors for Touch was significantly lower. Users experienced significant difficulty coordinating format selections in parallel with text entry when using Tilt and Foot. This difficulty resulted in more errors and therefore lower text throughput. Touching the screen to perform a selection was slower than tilting the device or tapping the foot, but the action of moving the fingers off the keyboard to make a selection ensured high precision when interleaving selection and text entry. Additionally, our results highlight the importance of studying new input methods in the context of real user tasks.