The recently developed depth sensors, e.g., the Kinect sensor, have provided new opportunities for human-computer interaction (HCI). Although great progress has been made by leveraging the Kinect sensor, e.g. in human body tracking and body gesture recognition, robust hand gesture recognition remains an open problem. Compared to the entire human body, the hand is a smaller object with more complex articulations and more easily affected by segmentation errors. It is thus a very challenging problem to recognize hand gestures. This paper focuses on building a robust hand gesture recognition system using the Kinect sensor. To handle the noisy hand shape obtained from the Kinect sensor, we propose a novel distance metric for hand dissimilarity measure, called Finger-Earth Mover’s Distance (FEMD). As it only matches fingers while not the whole hand shape, it can better distinguish hand gestures of slight differences. The extensive experiments demonstrate the accuracy, efficiency, and robustness of our hand gesture recognition system.