Sports videos have special characteristics such as a well-defined video structure, specialized sports syntax, and a number of canonical view types. In this paper, we propose an online learning framework for sports video structure analysis, using baseball as an example. The framework requires only a very small number of pre-labeled training samples at the initial stage, and builds an optimal local positive model by thoroughly exploiting the local statistical characteristics of the video currently under test. To avoid adaptive threshold selection, a set of negative models is combined with the local positive model during classification. Furthermore, the proposed framework is suitable for real-time applications. Preliminary experimental results on a set of baseball videos demonstrate that the proposed system is effective and efficient.
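
To illustrate how negative models can stand in for a threshold, the following minimal sketch accepts a shot as positive only when it lies closer to the local positive model than to every negative model, and then folds accepted shots back into the positive model online. It is an illustration under stated assumptions, not the method of this paper: the centroid models, the Euclidean distance, the two-dimensional feature vectors, and all function names are hypothetical.

```python
import math


def centroid(samples):
    """Mean feature vector of a small set of pre-labeled seed samples."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[d] for s in samples) / n for d in range(dim)]


def distance(a, b):
    """Euclidean distance (assumed here; the abstract names no metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_positive(feature, positive_model, negative_models):
    """Threshold-free decision: positive only if the positive model is
    strictly closer than every negative model."""
    d_pos = distance(feature, positive_model)
    return all(d_pos < distance(feature, neg) for neg in negative_models)


def update_model(model, count, feature):
    """Online running-mean update of the local positive model."""
    new_count = count + 1
    new_model = [m + (f - m) / new_count for m, f in zip(model, feature)]
    return new_model, new_count


# Hypothetical seed data: two labeled positive shots and two negative models.
seeds = [[0.9, 0.1], [0.8, 0.2]]
positive_model = centroid(seeds)
negative_models = [[0.1, 0.9], [0.5, 0.5]]

shot = [0.7, 0.3]
if is_positive(shot, positive_model, negative_models):
    positive_model, n_seen = update_model(positive_model, len(seeds), shot)
```

Because the decision compares distances between competing models rather than against a fixed cutoff, no per-video threshold has to be tuned in this sketch.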