Understanding Data Through Analysis of Structured Markup and Usage

  • Steven J. Altschuler,
  • Edward Jung,
  • Lani F. Wu

MSR-TR-99-101

Data and documents with semantic markup (such as XML) allow the construction of usage models that are human-interpretable. This report describes a framework for analyzing semantically marked-up data in conjunction with usage information, along with applications to automated clustering and “find-similar”. The analytical techniques place a strong emphasis on the human interpretability of their results.