Abstract

XML is a desirable format for modeling and storing clinical data in EMR (electronic medical record) applications because of its extensibility; however, existing EMR systems are either built on top of RDBMS or file systems, or lack support for complex, large-scale healthcare applications such as treatment effectiveness analysis and procedure optimization. SAP Technology Lab, China is developing Xbase, a cloud-enabled information appliance built on top of Hadoop, which is the first XML-based information appliance designed specifically for large-scale and complex healthcare applications. XML presents a distinct set of challenges for query processing, indexing, parallelism, and distributed computing when using Hadoop’s existing APIs, its HDFS storage infrastructure, and its MapReduce framework. In this paper, we describe the system architecture and internal design of Xbase, as well as how its indexing is mapped to RDBMS and Hadoop. We also discuss why we selected Hadoop over other candidates, such as HBase, Google’s Bigtable, and Hive.
