Parsers have been shown to be helpful in information retrieval tasks because they are able to model long-span word dependencies efficiently. While previous work focused on using traditional syntactic parse trees, this paper proposes a new approach where, unlike previous work, the parser parameters are discriminatively trained to directly optimize a non-convex and non-smooth IR measure. The relevance between a document and a query is then modeled by the weighted tree edit distance between their parses. We evaluate our method on a large scale web search task consisting of a real world query set. Results show that the new parser is more effective for document retrieval than using traditional syntactic parse trees. It gives significant improvement, especially for long queries where proper modeling of long-span dependencies is crucial.