This paper presents two vision-based interface systems. The first, Visual Screen, is an inexpensive technique that turns an ordinary screen into a touch screen using an ordinary camera. Setup is simple: position the camera so that it sees the whole screen. Calibration consists of detecting the screen region in the image, which determines the projective mapping between the image plane and the screen. At runtime, the system locates the tip pointer (a fingertip in our current implementation) in the image and converts its image position to a cursor position on the screen. The second system, Visual Panel, extends Visual Screen to mobile applications. It employs an arbitrary quadrangle-shaped panel (e.g., an ordinary piece of paper) and a tip pointer as an intuitive, wireless, and mobile input device. The system tracks the panel and the tip pointer accurately and reliably. Panel tracking continuously determines the projective mapping between the panel at its current position and the display, which in turn maps the tip position to the corresponding position on the display. The system can perform many tasks, such as controlling a remote large display and simulating a physical keyboard. Users can naturally use their fingers or other tip pointers to issue commands and type text. Furthermore, by tracking the 3D position and orientation of the visual panel, the system can also provide 3D information, serving as a virtual joystick to control 3D virtual objects.
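
The projective mapping used by both systems can be illustrated with a minimal sketch. It assumes that the four corners of the screen (or panel) have already been located in the image, and it estimates a 3x3 homography from those four correspondences by solving the standard 8-unknown linear system with the last entry fixed to 1. The function names (`solve_linear`, `homography_from_corners`, `map_point`) and the corner ordering are hypothetical choices made for illustration; this is not necessarily the estimation procedure used in the systems described here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations apply to both sides at once.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences (e.g. collinear corners)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_corners(src: Sequence[Point],
                            dst: Sequence[Point]) -> List[List[float]]:
    """Return a 3x3 matrix H (with H[2][2] = 1) mapping each src corner to its dst corner.

    Assumption: the true mapping has a nonzero bottom-right entry, which holds
    unless the image origin is mapped to a point at infinity.
    """
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four point correspondences are required")
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), rearranged to be linear in h.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        # v = (h21 x + h22 y + h23) / (h31 x + h32 y + 1), rearranged likewise.
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def map_point(H: List[List[float]], p: Point) -> Point:
    """Apply H to an image point and dehomogenize to get display coordinates."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)
```

In a Visual Screen-style setup, `src` would hold the screen corners detected in the camera image during calibration and `dst` the corresponding display corners in pixels; `map_point` would then convert each detected fingertip position to a cursor position. In a Visual Panel-style setup, the same estimate would have to be recomputed for every frame, since the panel moves relative to the camera.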