CONTENT:
-------
This folder contains the following files:
>> README.txt
>> MSR-LA_24Jun2013.pdf (the license agreement for using this data)
>> 14 data files in JSON format.

LICENSE AGREEMENT:
-----------------
This dataset can be used for non-commercial research purposes. By downloading this data or using it in any form, you agree to be bound by the terms of the MSR-LA. Please read the license agreement provided in <MSR-LA_24Jun2013.pdf>.  If you do not agree, do not install, copy or use the Data. The Data is protected by copyright and other intellectual property laws and is licensed, not sold. 

DATA FILES:
----------
There are 14 datasets, each of which contains annotations in JSON format. See the supplementary material (http://research.microsoft.com/apps/pubs/default.aspx?id=192002) and the following papers for details on how the data is formatted and how the annotations were created.

[1] Rohan Ramanath, Monojit Choudhury, Kalika Bali and Rishiraj Saha Roy, Crowd Prefers the Middle Path: Why Crowdsourcing is Unfit for Query Segmentation, in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL '13), Sofia, Bulgaria, 4-9 August 2013.

[2] Rohan Ramanath, Monojit Choudhury and Kalika Bali, Entailment: An Effective Metric for Comparing and Evaluating Hierarchical and Non-hierarchical Annotation Schemes, in Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse (LAW VII), Sofia, Bulgaria, 8-9 August 2013.
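
Since the data files are in JSON format, they can be read with any standard JSON parser. The following is a minimal Python sketch for a first look at the files, not a description of the annotation schema (see the supplementary material and the papers above for that). It assumes the data files carry a .json extension and are UTF-8 encoded; neither is stated in this README. The function names load_json_files and summarize are hypothetical helpers introduced only for this illustration.

```python
import json
from pathlib import Path


def load_json_files(folder, pattern="*.json"):
    """Parse every file in `folder` matching `pattern`.

    Returns a dict mapping file name -> parsed JSON value.
    The "*.json" pattern is an assumption about how the data files are named.
    """
    datasets = {}
    for path in sorted(Path(folder).glob(pattern)):
        # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
        with path.open(encoding="utf-8-sig") as handle:
            datasets[path.name] = json.load(handle)
    return datasets


def summarize(datasets):
    """Return one (file name, top-level JSON type, top-level entry count) tuple per file."""
    rows = []
    for name, data in datasets.items():
        # Lists and dicts report their length; any other JSON value counts as one entry.
        size = len(data) if isinstance(data, (list, dict)) else 1
        rows.append((name, type(data).__name__, size))
    return rows
```

For example, calling summarize(load_json_files(".")) from inside this folder would list each matching file with its top-level JSON type, which is a quick way to check that all 14 files parse before working with the annotations.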


CONTACT:
--------
If you have any questions, please contact <monojitc@microsoft.com> or any of the other authors of the papers listed above.