Visions of multimodal interaction with computers are as old as the field of HCI itself: the hope is that adding voice, gesture, gaze and other forms of input will make engaging with computers more efficient, expressive and natural. Yet it is only in the last decade that the dominance of multi-touch and the rise of gesture-based interaction have begun to radically alter the ways we interact with computers. On the one hand, these changes are inspirational and open up the design space; on the other, they have caused fragmentation in interface design and added complexity for users. Many of these complexities arise from layering new forms of input on top of existing systems and practices. I will discuss our own recent adventures in trying to design and implement these hybrid forms of input, and highlight the challenges and opportunities for future input paradigms. In particular, I conclude that the acid test for any of these new techniques is how they fare in the wild, but that we need to start with human-centred design principles.