Pattern recognition in audio, speech, and language processing requires estimating the parameters of statistical models by optimizing a training criterion. Formulating these optimization problems is far from straightforward. For example, likelihood criteria are usually inadequate when the training data do not represent all possible variations in the patterns. Significant progress in pattern recognition has been achieved by introducing discriminative training criteria, but overtraining remains a danger. An important formulation device is a regularization term in the optimization objective that captures the prior information available about parameter values and their relationships.