Applications of human action recognition in interactive systems such as games require robust, real-time recognition of human actions at low latency from a stream of observations. Current paradigms of action recognition either treat a pre-segmented sequence as a whole unit to be classified, or classify a range of frames as an action and evaluate performance using a frame-by-frame measure. We argue that both paradigms are limited when addressing latency requirements. Instead, we propose the notion of “action points” to serve as natural temporal anchors of simple human actions. Action points enable latency-aware training and evaluation of online recognition systems. To demonstrate the usefulness of action points, we show how two different systems, a Hidden Markov Model and a direct classification approach, can be used with action point annotations. We evaluate our approach on two data sets with different input modalities and show that the action point abstraction is useful in settings where human action recognition has to be performed online and at low latency.