{"id":703531,"date":"2020-12-15T14:29:28","date_gmt":"2020-12-15T22:29:28","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=703531"},"modified":"2020-12-15T14:29:28","modified_gmt":"2020-12-15T22:29:28","slug":"open-academic-graph","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/open-academic-graph\/","title":{"rendered":"Open Academic Graph"},"content":{"rendered":"<header class=\"main entry-header left\"><\/header>\n<article id=\"post-82\" class=\"post-82 page type-page status-publish hentry\">\n<div class=\"body-wrap\">\n<div class=\"entry-content\">\n<div>\n<p>Open Academic Graph (OAG) is a large knowledge graph unifying two billion-scale academic graphs:\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academic.microsoft.com\/\">Microsoft Academic Graph<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0(MAG) and\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/www.aminer.cn\/\">AMiner<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. In mid-2017, we published OAG v1, which contains 166,192,182 papers from MAG and 154,771,162 papers from AMiner (see below), along with 64,639,608 linking (matching) relations between the two graphs. OAG v2 adds author, venue, and newer publication data, along with the corresponding matchings.<\/p>\n<h2>Overview of OAG v2<\/h2>\n<p>The statistics of OAG v2 are listed in the three tables below. 
Both graphs are still evolving; this version uses the MAG November 2018 snapshot and the AMiner July 2018 (venues and authors) and January 2019 (papers) snapshots.<\/p>\n<h4>Table 1: Statistics of OAG venue data<\/h4>\n<table width=\"50%\">\n<tbody>\n<tr>\n<td>\n<div align=\"center\"><strong>Data Set<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>#Pairs\/Venues<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Date<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">Linking relations<\/div>\n<\/td>\n<td>\n<div align=\"center\">29,841<\/div>\n<\/td>\n<td>\n<div align=\"center\">2018.12<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">AMiner venues<\/div>\n<\/td>\n<td>\n<div align=\"center\">69,397<\/div>\n<\/td>\n<td>\n<div align=\"center\">2018.07<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">MAG venues<\/div>\n<\/td>\n<td>\n<div align=\"center\">52,678<\/div>\n<\/td>\n<td>\n<div align=\"center\">2018.11<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<h4>Table 2: Statistics of OAG paper data<\/h4>\n<table style=\"border-spacing: inherit;border-collapse: collapse\" width=\"50%\">\n<tbody>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>Data Set<\/strong><\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>#Pairs\/Papers<\/strong><\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>Date<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">Linking relations<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">91,137,597<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">2018.12<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">AMiner papers<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div 
align=\"center\">172,209,563<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">2019.01<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">MAG papers<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">208,915,369<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">2018.11<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<h4>Table 3: Statistics of OAG author data<\/h4>\n<table style=\"border-spacing: inherit;border-collapse: collapse\" width=\"50%\">\n<tbody>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>Data Set<\/strong><\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>#Pairs\/Authors<\/strong><\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>Date<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">Linking relations<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">1,717,680<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">2019.01<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">AMiner authors<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">113,171,945<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">2018.07<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">MAG authors<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">253,144,301<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">2018.11<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<h2>Downloads<\/h2>\n<ul>\n<li><a 
class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/linkage\/venue_linking_pairs.zip\">venue_linking_pairs.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/linkage\/paper_linking_pairs.zip\">paper_linking_pairs.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/linkage\/author_linking_pairs.zip\">author_linking_pairs.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/venue\/aminer_venues.zip\">aminer_venues.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/mag\/venue\/mag_venues.zip\">mag_venues.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/mag\/paper\/mag_papers_0.zip\">mag_papers_0.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab 
glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/mag\/paper\/mag_papers_1.zip\">mag_papers_1.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/mag\/paper\/mag_papers_2.zip\">mag_papers_2.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/paper\/aminer_papers_0.zip\">aminer_papers_0.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/paper\/aminer_papers_1.zip\">aminer_papers_1.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/paper\/aminer_papers_2.zip\">aminer_papers_2.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/paper\/aminer_papers_3.zip\">aminer_papers_3.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" 
href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/mag\/author\/mag_authors_0.zip\">mag_authors_0.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/mag\/author\/mag_authors_1.zip\">mag_authors_1.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/mag\/author\/mag_authors_2.zip\">mag_authors_2.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/author\/aminer_authors_0.zip\">aminer_authors_0.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/author\/aminer_authors_1.zip\">aminer_authors_1.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/author\/aminer_authors_2.zip\">aminer_authors_2.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" 
href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag\/aminer\/author\/aminer_authors_3.zip\">aminer_authors_3.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<\/ul>\n<p>Please note that for author matching, we only consider authors with at least 5 papers. After filtering out authors with fewer papers, 6,855,193 authors remain in AMiner and 13,173,936 in MAG.<\/p>\n<h3>Data description<\/h3>\n<p>For linking relations, each pair is an \u201cID to ID\u201d pair. More specifically, its JSON schema is:<br \/>\n<code><br \/>\n{<br \/>\n\"mid\": \"xxxx\",<br \/>\n\"aid\": \"yyyy\"<br \/>\n}<br \/>\n<\/code><\/p>\n<p>where \u201cmid\u201d is the MAG entity ID and \u201caid\u201d is the AMiner entity ID.<\/p>\n<p>Other entity attributes are also provided to support different types of research. The data schemas of venues, papers and authors are described below.<\/p>\n<h4>Table 4: Venue schema<\/h4>\n<table width=\"50%\">\n<tbody>\n<tr>\n<td>\n<div align=\"center\"><strong>Field Name<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Field Type<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Description<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Example<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">id<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">venue id<\/div>\n<\/td>\n<td>\n<div align=\"center\">5bf574641c5a1dcdd96f817b<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">JournalId<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">journal id<\/div>\n<\/td>\n<td>\n<div align=\"center\">137773608<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">ConferenceId<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">conference 
id<\/div>\n<\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">DisplayName<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">venue name<\/div>\n<\/td>\n<td>\n<div align=\"center\">Nature<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">NormalizedName<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">normalized venue name<\/div>\n<\/td>\n<td>\n<div align=\"center\">nature<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<h4>Table 5: Paper schema<\/h4>\n<table width=\"50%\">\n<tbody>\n<tr>\n<td>\n<div align=\"center\"><strong>Field Name<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Field Type<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Description<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Example<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">id<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">paper ID<\/div>\n<\/td>\n<td>\n<div align=\"center\">53e9ab9eb7602d970354a97e<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">title<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">paper title<\/div>\n<\/td>\n<td>\n<div align=\"center\">Data mining: concepts and techniques<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">authors.name<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">author name<\/div>\n<\/td>\n<td>\n<div align=\"center\">Jiawei Han<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">author.org<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">author affiliation<\/div>\n<\/td>\n<td>\n<div align=\"center\">Department of Computer Science, University of Illinois at Urbana-Champaign<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div 
align=\"center\">author.id<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">author ID<\/div>\n<\/td>\n<td>\n<div align=\"center\">53f42f36dabfaedce54dcd0c<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">venue.id<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">paper venue ID<\/div>\n<\/td>\n<td>\n<div align=\"center\">53e17f5b20f7dfbc07e8ac6e<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">venue.raw<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">paper venue name<\/div>\n<\/td>\n<td>\n<div align=\"center\">Inteligencia Artificial, Revista Iberoamericana de Inteligencia Artificial<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">year<\/div>\n<\/td>\n<td>\n<div align=\"center\">int<\/div>\n<\/td>\n<td>\n<div align=\"center\">published year<\/div>\n<\/td>\n<td>\n<div align=\"center\">2000<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">keywords<\/div>\n<\/td>\n<td>\n<div align=\"center\">list of strings<\/div>\n<\/td>\n<td>\n<div align=\"center\">keywords<\/div>\n<\/td>\n<td>\n<div align=\"center\">[\u201cdata mining\u201d, \u201cstructured data\u201d, \u201cworld wide web\u201d, \u201csocial network\u201d, \u201crelational data\u201d]<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">n_citation<\/div>\n<\/td>\n<td>\n<div align=\"center\">int<\/div>\n<\/td>\n<td>\n<div align=\"center\">citation number<\/div>\n<\/td>\n<td>\n<div align=\"center\">40829<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">page_start<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">page start<\/div>\n<\/td>\n<td>\n<div align=\"center\">11<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">page_end<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">page end<\/div>\n<\/td>\n<td>\n<div 
align=\"center\">18<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">doc_type<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">paper type: journal, book title\u2026<\/div>\n<\/td>\n<td>\n<div align=\"center\">book<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">lang<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">detected language<\/div>\n<\/td>\n<td>\n<div align=\"center\">en<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">publisher<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">publisher<\/div>\n<\/td>\n<td>\n<div align=\"center\">Elsevier<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">volume<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">volume<\/div>\n<\/td>\n<td>\n<div align=\"center\">10<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">issue<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">issue<\/div>\n<\/td>\n<td>\n<div align=\"center\">29<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">issn<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">issn<\/div>\n<\/td>\n<td>\n<div align=\"center\">0020-7136<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">isbn<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">isbn<\/div>\n<\/td>\n<td>\n<div align=\"center\">1-55860-489-8<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">doi<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">doi<\/div>\n<\/td>\n<td>\n<div align=\"center\">10.4114\/ia.v10i29.873<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">pdf<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">pdf 
URL<\/div>\n<\/td>\n<td>\n<div align=\"center\">\/\/static.aminer.org\/upload\/pdf\/1254\/ 370\/239\/53e9ab9eb7602d970354a97e.pdf<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">url<\/div>\n<\/td>\n<td>\n<div align=\"center\">list<\/div>\n<\/td>\n<td>\n<div align=\"center\">external links<\/div>\n<\/td>\n<td>\n<div align=\"center\">[\u201chttp:\/\/dx.doi.org\/10.4114\/ia.v10i29.873\u201d, \u201chttp:\/\/polar.lsi.uned.es\/revista\/index.php\/ia\/ article\/view\/479\u201d]<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">abstract<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">abstract<\/div>\n<\/td>\n<td>\n<div align=\"center\">Our ability to generate\u2026<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>(Note: MAG paper attributes are incomplete. If you would like more attributes, please refer to this\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/docs.microsoft.com\/en-us\/academic-services\/graph\/get-started-setup-provisioning\">link<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0to get MAG on Azure.)<\/p>\n<h4>Table 6: Author schema<\/h4>\n<table width=\"50%\">\n<tbody>\n<tr>\n<td>\n<div align=\"center\"><strong>Field Name<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Field Type<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Description<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Example<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">id<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">author id<\/div>\n<\/td>\n<td>\n<div align=\"center\">53f42f36dabfaedce54dcd0c<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">name<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">author name<\/div>\n<\/td>\n<td>\n<div 
align=\"center\">Jiawei Han<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">normalized_name<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">normalized author name<\/div>\n<\/td>\n<td>\n<div align=\"center\">jiawei han<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">orgs<\/div>\n<\/td>\n<td>\n<div align=\"center\">list of strings<\/div>\n<\/td>\n<td>\n<div align=\"center\">author affiliations<\/div>\n<\/td>\n<td>\n<div align=\"center\">[\u201cDepartment of Computer Science, University of Illinois at Urbana-Champaign\u201d]<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">org<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">last known affiliation<\/div>\n<\/td>\n<td>\n<div align=\"center\">Department of Computer Science, University of Illinois at Urbana-Champaign<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">position<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">author position<\/div>\n<\/td>\n<td>\n<div align=\"center\">professor<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">n_pubs<\/div>\n<\/td>\n<td>\n<div align=\"center\">int<\/div>\n<\/td>\n<td>\n<div align=\"center\">the number of author publications<\/div>\n<\/td>\n<td>\n<div align=\"center\">1217<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">n_citation<\/div>\n<\/td>\n<td>\n<div align=\"center\">int<\/div>\n<\/td>\n<td>\n<div align=\"center\">author citation count<\/div>\n<\/td>\n<td>\n<div align=\"center\">191526<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">h_index<\/div>\n<\/td>\n<td>\n<div align=\"center\">int<\/div>\n<\/td>\n<td>\n<div align=\"center\">author h-index<\/div>\n<\/td>\n<td>\n<div align=\"center\">175<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">tags.t<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">research 
interests<\/div>\n<\/td>\n<td>\n<div align=\"center\">\u201cdata mining\u201d<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">tags.w<\/div>\n<\/td>\n<td>\n<div align=\"center\">int<\/div>\n<\/td>\n<td>\n<div align=\"center\">weight of interests<\/div>\n<\/td>\n<td>\n<div align=\"center\">243<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">pubs.i<\/div>\n<\/td>\n<td>\n<div align=\"center\">string<\/div>\n<\/td>\n<td>\n<div align=\"center\">author paper id<\/div>\n<\/td>\n<td>\n<div align=\"center\">53e9b9fbb7602d97045f7bb8<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">pubs.r<\/div>\n<\/td>\n<td>\n<div align=\"center\">int<\/div>\n<\/td>\n<td>\n<div align=\"center\">author order in the paper<\/div>\n<\/td>\n<td>\n<div align=\"center\">0<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Evaluation<\/h3>\n<p>We evaluated a small subset of matchings (around one thousand venue\/paper\/author pairs). The estimated accuracy is shown in Table 7.<\/p>\n<h4>Table 7: Accuracy of entity matching<\/h4>\n<table width=\"50%\">\n<tbody>\n<tr>\n<td>\n<div align=\"center\"><strong>Entity type<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Venue<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Paper (newly matched)<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Author<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">accuracy<\/div>\n<\/td>\n<td>\n<div align=\"center\">99.26%<\/div>\n<\/td>\n<td>\n<div align=\"center\">99.10%<\/div>\n<\/td>\n<td>\n<div align=\"center\">97.41%<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>Please continue to use the\u00a0<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/groups.google.com\/forum\/#!forum\/open-academic-graph\">OAG group<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0to discuss OAG related problems. 
If you have any issues or want to contribute to Open Academic Graph, feel free to share your ideas in the OAG group.<\/p>\n<div>\n<h2>Overview of OAG v1<\/h2>\n<p>This data set is generated by\u00a0linking two large academic graphs:\u00a0<strong><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/academic.microsoft.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Microsoft Academic Graph<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>\u00a0(MAG)<\/strong>\u00a0and\u00a0<strong><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/aminer.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">AMiner<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/strong>, and it\u00a0is intended for research purposes only.\u00a0This\u00a0version\u00a0includes\u00a0<strong>166,192,182<\/strong>\u00a0papers from MAG and\u00a0<strong>154,771,162<\/strong>\u00a0papers from AMiner. We generated\u00a0<strong>64,639,608<\/strong>\u00a0linking (matching) relations between the two graphs.\u00a0More linking results, such as authors, will be published in the future. 
It can be used as a unified large academic graph for studying citation networks, paper content, and other topics, and can also be used to study the integration of multiple academic graphs.<\/p>\n<p>The overall data set includes three\u00a0parts, which are\u00a0described in the table below:<\/p>\n<table width=\"50%\">\n<tbody>\n<tr>\n<td>\n<div align=\"center\"><strong>Data Set<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>#Papers<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>#Files<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Total Size<\/strong><\/div>\n<\/td>\n<td>\n<div align=\"center\"><strong>Date<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\"><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/linking\/linking_relations.txt.zip\">Linking relations\u00a0<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>(matching)<\/div>\n<\/td>\n<td>\n<div align=\"center\">64,639,608<\/div>\n<\/td>\n<td>\n<div align=\"center\">1<\/div>\n<\/td>\n<td>\n<div align=\"center\">1.6GB<\/div>\n<\/td>\n<td>\n<div align=\"center\">2017-06-22<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">MAG papers<\/div>\n<\/td>\n<td>\n<div align=\"center\">166,192,182<\/div>\n<\/td>\n<td>\n<div align=\"center\">9<\/div>\n<\/td>\n<td>\n<div align=\"center\">104GB<\/div>\n<\/td>\n<td>\n<div align=\"center\">2017-06-09<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<div align=\"center\">AMiner papers<\/div>\n<\/td>\n<td>\n<div align=\"center\">154,771,162<\/div>\n<\/td>\n<td>\n<div align=\"center\">3<\/div>\n<\/td>\n<td>\n<div align=\"center\">39GB<\/div>\n<\/td>\n<td>\n<div align=\"center\">2017-03-22<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<h3>Downloads<\/h3>\n<h4>AMiner Papers:<\/h4>\n<ul>\n<li><a class=\"msr-external-link 
glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/aminer\/aminer_papers_0.zip\">aminer_papers_0.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/aminer\/aminer_papers_1.zip\">aminer_papers_1.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/aminer\/aminer_papers_2.zip\">aminer_papers_2.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<\/ul>\n<h4>MAG Papers:<\/h4>\n<ul>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_0.zip\">mag_papers_0.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_1.zip\">mag_papers_1.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_2.zip\">mag_papers_2.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" 
rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_3.zip\">mag_papers_3.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_4.zip\">mag_papers_4.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_5.zip\">mag_papers_5.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_6.zip\">mag_papers_6.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_7.zip\">mag_papers_7.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<li><a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academicgraphv2.blob.core.windows.net\/oag-v1\/mag\/mag_papers_8.zip\">mag_papers_8.zip<span class=\"sr-only\"> (opens in new tab)<\/span><\/a><\/li>\n<\/ul>\n<h3>Data Description<\/h3>\n<p>The detailed description of data is presented in this section.<br \/>\nFor\u00a0<strong>Linking relations<\/strong>, each linking pair is an \u201cID to ID\u201d pair. 
More specifically, each pair is a JSON object of the form:<br \/>\n<code><br \/>\n{<br \/>\n\"mid\": \"xxxx\",<br \/>\n\"aid\": \"yyyy\"<br \/>\n}<br \/>\n<\/code><br \/>\nwhere \u201cmid\u201d is the MAG paper ID and \u201caid\u201d is the AMiner paper ID.<\/p>\n<p>For the\u00a0<strong>MAG papers<\/strong>\u00a0and\u00a0<strong>AMiner papers<\/strong>\u00a0data sets, each paper is a JSON object with the following schema:<\/p>\n<table style=\"border-spacing: inherit;border-collapse: collapse\" width=\"50%\">\n<tbody>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>Field Name<\/strong><\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>Field Type<\/strong><\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>Description<\/strong><\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\"><strong>Example<\/strong><\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">id<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">MAG or AMiner ID<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">53e9ab9eb7602d970354a97e<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">title<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">paper title<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">Data mining: concepts and techniques<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">authors.name<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td 
style=\"padding: inherit;border: inherit\">\n<div align=\"center\">author name<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">Jiawei Han<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">authors.org<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">author affiliation<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">department of computer science university of illinois at urbana champaign<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">venue<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">paper venue<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">Inteligencia Artificial, Revista Iberoamericana de Inteligencia Artificial<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">year<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">int<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">publication year<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">2000<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">keywords<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">list of strings<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">keywords<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">[\u201cdata mining\u201d, \u201cstructured data\u201d, \u201cworld wide web\u201d, \u201csocial network\u201d, 
\u201crelational data\u201d]<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">fos<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">list of strings<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">fields of study<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">[\u201crelational database\u201d, \u201cdata model\u201d, \u201csocial network\u201d]<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">n_citation<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">int<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">number of citations<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">29790<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">references<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">list of strings<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">IDs of cited papers<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">[\u201c53e99ef4b7602d97027c2346\u201d, \u201c53e9aa23b7602d970338fb5e\u201d, \u201c53e99cf5b7602d97025aac75\u201d]<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">page_start<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">start of page<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">11<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">page_end<\/div>\n<\/td>\n<td style=\"padding: inherit;border: 
inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">end of page<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">18<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">doc_type<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">paper type: journal, book title\u2026<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">book<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">lang<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">detected language<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">en<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">publisher<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">publisher<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">Elsevier<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">volume<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">volume<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">10<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">issue<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div 
align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">issue<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">29<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">issn<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">ISSN<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">0020-7136<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">isbn<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">ISBN<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">1-55860-489-8<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">doi<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">DOI<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">10.4114\/ia.v10i29.873<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">pdf<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">PDF URL<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">\/\/static.aminer.org\/upload\/pdf\/1254\/370\/239\/53e9ab9eb7602d970354a97e.pdf<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">url<\/div>\n<\/td>\n<td style=\"padding: inherit;border: 
inherit\">\n<div align=\"center\">list of strings<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">external links<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">[\u201chttp:\/\/dx.doi.org\/10.4114\/ia.v10i29.873\u201d, \u201chttp:\/\/polar.lsi.uned.es\/revista\/index.php\/ia\/article\/view\/479\u201d]<\/div>\n<\/td>\n<\/tr>\n<tr>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">abstract<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">string<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"center\">abstract<\/div>\n<\/td>\n<td style=\"padding: inherit;border: inherit\">\n<div align=\"left\">Our ability to generate\u2026<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>For example:<br \/>\n<code><br \/>\n{<br \/>\n\"id\": \"53e9ab9eb7602d970354a97e\",<br \/>\n\"title\": \"Data mining: concepts and techniques\",<br \/>\n\"authors\": [<br \/>\n{<br \/>\n\"name\": \"jiawei han\",<br \/>\n\"org\": \"department of computer science university of illinois at urbana champaign\"<br \/>\n},<br \/>\n{<br \/>\n\"name\": \"micheline kamber\",<br \/>\n\"org\": \"department of computer science university of illinois at urbana champaign\"<br \/>\n},<br \/>\n{<br \/>\n\"name\": \"jian pei\",<br \/>\n\"org\": \"department of computer science university of illinois at urbana champaign\"<br \/>\n}<br \/>\n],<br \/>\n\"year\": 2000,<br \/>\n\"keywords\": [<br \/>\n\"data mining\",<br \/>\n\"structured data\",<br \/>\n\"world wide web\",<br \/>\n\"social network\",<br \/>\n\"relational data\"<br \/>\n],<br \/>\n\"fos\": [<br \/>\n\"relational database\",<br \/>\n\"data model\",<br \/>\n\"social network\"<br \/>\n],<br \/>\n\"n_citation\": 29790,<br \/>\n\"references\": [<br \/>\n\"53e99ef4b7602d97027c2346\",<br \/>\n\"53e9aa23b7602d970338fb5e\",<br \/>\n\"53e99cf5b7602d97025aac75\"<br \/>\n],<br \/>\n\"doc_type\": \"book\",<br 
\/>\n\"lang\": \"en\",<br \/>\n\"publisher\": \"Elsevier\",<br \/>\n\"isbn\": \"1-55860-489-8\",<br \/>\n\"doi\": \"10.4114\/ia.v10i29.873\",<br \/>\n\"pdf\": \"\/\/static.aminer.org\/upload\/pdf\/1254\/370\/239\/53e9ab9eb7602d970354a97e.pdf\",<br \/>\n\"url\": [<br \/>\n\"http:\/\/dx.doi.org\/10.4114\/ia.v10i29.873\",<br \/>\n\"http:\/\/polar.lsi.uned.es\/revista\/index.php\/ia\/article\/view\/479\"<br \/>\n],<br \/>\n\"abstract\": \"Our ability to generate and collect data has been increasing rapidly. Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generate data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge. Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data, 
including stream data, sequence data, graph structured data, social network data, and multi-relational data.\"<br \/>\n}<br \/>\n<\/code><\/p>\n<h3>Method and Evaluation<\/h3>\n<h4>Method<\/h4>\n<p>We obtain the linking relations between the two publication graphs in two steps:<\/p>\n<ol>\n<li>Use the Microsoft Graph Search API to query each AMiner paper\u2019s title and obtain candidate matching papers for it.<\/li>\n<li>Match two papers if they have\n<ul>\n<li>very similar titles,<\/li>\n<li>similar author names, and<\/li>\n<li>the same publication year.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<h4>Evaluation<\/h4>\n<p>We randomly sampled\u00a0<strong>100,000<\/strong>\u00a0linking pairs and evaluated the matching accuracy. Of these,\u00a0<strong>99,699<\/strong>\u00a0pairs are true matches, a matching accuracy of\u00a0<strong>99.70<\/strong>%.<\/p>\n<h2>References<\/h2>\n<p><strong>We kindly request that any published research that makes use of this data cite the following papers.<\/strong><\/p>\n<ul>\n<li>Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of Academic Social Networks. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD\u20192008). pp. 990-998. 
[<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"http:\/\/keg.cs.tsinghua.edu.cn\/jietang\/publications\/KDD08-Tang-et-al-ArnetMiner.pdf\">PDF<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>] [<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"http:\/\/keg.cs.tsinghua.edu.cn\/jietang\/publications\/KDD08-Tang-et-al-Arnetminer.ppt\">Slides<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>] [<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"http:\/\/aminer.org\/\">System<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>] [<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"http:\/\/aminer.org\/RESTful_service\">API<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>]<\/li>\n<li>Arnab Sinha, Zhihong Shen, Yang Song, Hao Ma, Darrin Eide, Bo-June (Paul) Hsu, and Kuansan Wang. 2015. An Overview of Microsoft Academic Service (MAS) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW \u201915 Companion). ACM, New York, NY, USA, 243-246. 
[<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/an-overview-of-microsoft-academic-service-mas-and-applications-2\/\">PDF<\/a>][<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/academic.microsoft.com\/\">System<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>][<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" rel=\"noopener noreferrer\" target=\"_blank\" href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/cognitive-services\/academic-knowledge\/home\">API<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>]<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n<\/article>\n","protected":false},"excerpt":{"rendered":"<p>Open Academic Graph (OAG) is a large knowledge graph unifying two billion-scale academic graphs:\u00a0Microsoft Academic Graph\u00a0(MAG) and\u00a0AMiner. In mid 2017, we published OAG v1, which contains 166,192,182 papers from MAG and 154,771,162 papers from AMiner (see below) and generated 64,639,608 linking (matching) relations between the two graphs. 
This time, in OAG v2, author, venue and [&hellip;]<\/p>\n","protected":false},"featured_media":241295,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556,13563,13555],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-703531","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-research-area-data-platform-analytics","msr-research-area-search-information-retrieval","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[390014,712918],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/703531","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":21,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/703531\/revisions"}],"predecessor-version":[{"id":712936,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/703531\/revisions\/712936"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/241295"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=703531"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/resear
ch-area?post=703531"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=703531"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=703531"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=703531"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}