Identifying Enrichment Candidates in Textbooks

  • Rakesh Agrawal
  • Sreenivas Gollapudi
  • Anitha Kannan
  • Krishnaram Kenthapadi

International World Wide Web Conference (WWW)

Published by ACM

Many textbooks written in emerging countries lack clear and adequate coverage of important concepts. We propose a technological solution for algorithmically identifying those sections of a book that are not well written and could benefit from better exposition. We provide a decision model based on the syntactic complexity of writing and the dispersion of key concepts. The model parameters are learned using a tune set that is algorithmically generated using a versioned authoritative web resource as a proxy. We evaluate the proposed methodology on a corpus of Indian textbooks, and the results demonstrate its effectiveness in identifying enrichment candidates.
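
As a rough illustration of the kind of decision model the abstract describes, the sketch below flags a section when either of two scores exceeds a threshold, and picks the thresholds by grid search over a labeled tune set. The abstract does not specify the form of the model, so everything here is an assumption made for illustration: the either-score rule, the field names `complexity`, `dispersion`, and `needs_enrichment`, the function names, the grid, and the toy values are all hypothetical and are not taken from the paper.

```python
from itertools import product


def flag(section, t_complexity, t_dispersion):
    """Assumed rule: flag a section if either score exceeds its threshold."""
    return (section["complexity"] > t_complexity
            or section["dispersion"] > t_dispersion)


def learn_thresholds(tune_set, grid):
    """Grid-search the threshold pair with the highest tune-set accuracy.

    Ties are broken in favor of the first pair encountered.
    """
    best, best_acc = None, -1.0
    for t_c, t_d in product(grid, grid):
        correct = sum(
            flag(s, t_c, t_d) == s["needs_enrichment"] for s in tune_set
        )
        acc = correct / len(tune_set)
        if acc > best_acc:
            best, best_acc = (t_c, t_d), acc
    return best, best_acc


# Toy, made-up tune set (hypothetical scores and labels, not real data).
tune_set = [
    {"complexity": 0.9, "dispersion": 0.2, "needs_enrichment": True},
    {"complexity": 0.3, "dispersion": 0.8, "needs_enrichment": True},
    {"complexity": 0.2, "dispersion": 0.1, "needs_enrichment": False},
    {"complexity": 0.4, "dispersion": 0.3, "needs_enrichment": False},
]
grid = [0.1, 0.3, 0.5, 0.7]

thresholds, accuracy = learn_thresholds(tune_set, grid)

# Apply the learned thresholds to a new (hypothetical) section.
new_section = {"complexity": 0.6, "dispersion": 0.1}
is_candidate = flag(new_section, *thresholds)
```

In the setting the abstract describes, the tune-set labels would come from the versioned web resource used as a proxy rather than from hand labeling, and the two scores would come from measures of syntactic complexity and key-concept dispersion; how those are computed is outside what this sketch covers.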