The ranking models of existing image search engines are generally based on the text associated with images, while the visual content itself is neglected. Imperfect search results frequently appear because of the mismatch between these textual features and the actual image content. Visual reranking, in which visual information is applied to refine text-based search results, has proven effective. However, the improvement brought by visual reranking is limited, mainly because errors in the text-based results propagate to the refinement stage. In this paper, we propose a Content-Aware Ranking model based on the learning-to-rank framework, in which textual and visual information are leveraged simultaneously during ranking learning. We formulate Content-Aware Ranking as a large-margin structured output learning problem, modeling the visual information as a regularization term. Direct optimization of the learning problem is nearly infeasible since the number of constraints is huge, so we adopt the efficient cutting-plane algorithm, which learns the model by iteratively adding the most violated constraints. Extensive experimental results on a large-scale dataset collected from a commercial Web image search engine demonstrate that the proposed ranking model significantly outperforms state-of-the-art ranking and reranking methods.
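To make the cutting-plane idea concrete, the following is a minimal toy sketch (not the paper's actual formulation) of large-margin learning with a working set of most-violated constraints. It assumes a small, enumerable candidate set so the separation oracle can be exhaustive, substitutes simple subgradient steps for the inner QP solve, and uses a hypothetical `vis_reg` matrix to stand in for the visual regularization term.

```python
import numpy as np

def most_violated(w, feats, y_true, loss):
    """Separation oracle over a small candidate set: return the candidate
    maximizing the margin-rescaled violation loss(y) + w.f(y) - w.f(y_true)."""
    scores = loss + feats @ w - feats[y_true] @ w
    y = int(np.argmax(scores))
    return y, float(scores[y])

def cutting_plane_train(feats, y_true, loss, C=1.0, vis_reg=None, lam=0.1,
                        eps=0.01, max_outer=50, inner_steps=200):
    """Toy cutting-plane learner: repeatedly add the most violated constraint
    to a working set, then re-fit w on that set (subgradient steps stand in
    for the inner QP solve). `vis_reg` is a hypothetical visual-consistency
    regularization matrix, None to disable."""
    w = np.zeros(feats.shape[1])
    working = []
    for _ in range(max_outer):
        y, viol = most_violated(w, feats, y_true, loss)
        if viol <= eps:            # all constraints satisfied within tolerance
            break
        if y not in working:
            working.append(y)
        for t in range(inner_steps):
            g = w.copy()                       # gradient of 0.5 * ||w||^2
            if vis_reg is not None:
                g = g + lam * (vis_reg @ w)    # visual regularization term
            for yb in working:                 # hinge subgradients
                if w @ (feats[y_true] - feats[yb]) < loss[yb]:
                    g = g + C * (feats[yb] - feats[y_true])
            w = w - g / (t + 1.0)              # decaying step size
    return w
```

Because each outer iteration only adds the single most violated constraint, the working set stays small even when the full constraint set is exponentially large, which is what makes this family of methods tractable.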