Models such as pairwise conditional random fields (CRFs) are extremely popular in computer vision and various other machine learning disciplines. However, they have limited expressive power and often cannot represent the posterior distribution correctly. While learning the parameters of such models which have insufficient expressivity, researchers use loss functions to penalize certain misrepresentations of the solution space. Till now, researchers have used only simplistic loss functions such as the Hamming loss, to enable efficient inference. The paper shows how sophisticated and useful higher order loss functions can be incorporated in the learning process. These loss functions ensure that the MAP solution does not deviate much from the ground truth in terms of certain higher order statistics. We propose a learning algorithm which uses the recently proposed lowerenvelop representation of higher order functions to transform them to pairwise functions, which allow efficient inference. We test the efficacy of our method on the problem of foreground-background image segmentation. Experimental results show that the incorporation of higher order loss functions in the learning formulation using our method leads to much better results compared to those obtained by using the traditional Hamming loss.