This folder contains the questions and answers used in our paper "Web-based Question Answering: Revisiting AskMSR". It also includes the output files of several variants of our system and the evaluation script. The files are described briefly below; more details can be found in the paper.

data/
  trec.train -- 1,751 questions used for training
  trec.test -- 380 questions used for testing
  patterns.train -- answer patterns for training questions
  patterns.test -- answer patterns for testing questions
  trec.train.snippets.csv -- search snippets returned by Bing when issuing training questions as queries
  trec.test.snippets.csv -- search snippets returned by Bing when issuing testing questions as queries

output/
  trec.test.output.1 -- system "AskMSR" output on testing questions (Table 3, Row #1)
  trec.test.output.2 -- system "Direct Query" output on testing questions (Table 3, Row #2)
  trec.test.output.3 -- system "Direct Query + Question Type" output on testing questions (Table 3, Row #3)
  trec.test.output.4 -- system "WordNet Beginners Matching" output on testing questions (Table 3, Row #4)
  trec.test.output.5 -- system "Wikipedia Introduction Matching" output on testing questions (Table 3, Row #5)
  trec.test.output.6 -- system "AskMSR+" output on testing questions (Table 3, Row #6)
  
eval.pl -- evaluation script. For example:
  $ ./eval.pl data/patterns.test output/trec.test.output.6
  Mean reciprocal rank over 380 questions is 0.6211
  98 questions had no answers found in top 5 responses.
  1 questions had no answers returned

ReadMe.txt -- this file
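
For readers unfamiliar with the metric reported by eval.pl, the following is a minimal Python sketch of mean reciprocal rank over the top 5 responses. It is not a port of eval.pl and does not read the files in this folder. The function names, the dictionary layout, the toy data, and the case-insensitive regular-expression matching are all assumptions made for illustration; only the top-5 cutoff is taken from the sample output above.

```python
import re


def reciprocal_rank(responses, patterns, top_k=5):
    """Return 1/rank of the first response matching any pattern, else 0.0.

    Assumption: answer patterns are regular expressions and matching is
    case-insensitive. eval.pl may behave differently.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    for rank, response in enumerate(responses[:top_k], start=1):
        if any(rx.search(response) for rx in compiled):
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(all_responses, all_patterns, top_k=5):
    """Average the reciprocal rank over every question that has patterns.

    Questions with no returned responses contribute a score of 0.
    """
    scores = [
        reciprocal_rank(all_responses.get(qid, []), patterns, top_k)
        for qid, patterns in all_patterns.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy, made-up data (hypothetical question ids, patterns, and responses).
patterns = {"q1": [r"Paris"], "q2": [r"18\d\d"], "q3": [r"Nile"]}
responses = {
    "q1": ["Paris", "Lyon"],    # correct at rank 1 -> 1.0
    "q2": ["London", "1865"],   # correct at rank 2 -> 0.5
    "q3": ["Amazon"],           # no correct answer -> 0.0
}

mrr = mean_reciprocal_rank(responses, patterns)
print(f"Mean reciprocal rank over {len(patterns)} questions is {mrr:.4f}")
```

On the toy data the three scores are 1.0, 0.5, and 0.0, so the sketch prints a mean reciprocal rank of 0.5000.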


If you use this dataset, please cite the following paper:

@TechReport {TsaiYihBurges2015,
author       = {Chen-Tse Tsai and Wen-tau Yih and Christopher J.C. Burges},
month        = {April},
number       = {MSR-TR-2015-20},
title        = {{Web}-based Question Answering: Revisiting {AskMSR}},
url          = {http://research.microsoft.com/apps/pubs/default.aspx?id=241143},
year         = {2015},
}

Please contact Chen-Tse Tsai <ctse.tsai@gmail.com> or Scott Wen-tau Yih <scottyih@microsoft.com> if you have any questions.