Data Set of English-Spanish Term Vectors from Wikipedia

This data set consists of the term vectors extracted from 60,730 Wikipedia English articles and their comparable Spanish articles, sampled in 2009. Last published: August 8, 2011.
  • Version: 1.0.0
    File Name: EN-ES_Wiki.zip
    Date Published: 5/11/2016
    File Size: 218.4 MB

      We used this data set to test various models for creating translingual document representations; that work was published in [Platt et al. EMNLP-2010] and [Yih et al. CoNLL-2011]. More details about this data set can be found in the ReadMe file.
  • Supported Operating System

    Windows 7, Windows 8, Windows 10

      • Click Download and follow the instructions.