WikiBABEL: Community Creation of Multilingual Data

  • A Kumaran
  • K Saravanan
  • Sandor Maurice

WikiSYM 2008 Conference, Porto, Portugal

Published by Association for Computing Machinery, Inc.

In this paper, we present wikiBABEL, a collaborative framework for the efficient and effective creation of multilingual content by a community of users. The wikiBABEL framework leverages two resources: fairly stable content in a source language (typically English), and a reasonable, though not necessarily perfect, machine translation system between the source language and a given target language. Together these produce rough initial content in the target language, which is published on a collaborative platform. The platform provides an intuitive user interface and a set of linguistic tools for collaborative correction of the rough content by a community of users, aiding the creation of clean content in the target language. We describe the architectural components that implement the wikiBABEL framework, namely the systems for source- and target-language content management, the mechanisms for coordination and collaboration, and the intuitive user interface for multilingual editing and review. Importantly, we discuss the integrated linguistic resources and tools, such as bilingual dictionaries and machine translation and transliteration systems, that help users during the content correction and creation process. In addition, we analyze and present the prime factors, whether user-interface features or linguistic tools and resources, that significantly influence the user experience in multilingual content creation.
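
The workflow the abstract describes (stable source content, a rough machine-translated draft, then community correction) can be illustrated with a minimal sketch. This is not the wikiBABEL implementation: the names `Segment`, `seed_target_page`, `correct`, and `toy_translate` are hypothetical, the "latest correction wins" rule is an assumption made for brevity, and the translation function is a stand-in for a real machine translation system.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Segment:
    """One source sentence, its rough machine translation, and community fixes."""
    source: str
    draft: str
    corrections: List[str] = field(default_factory=list)

    def current(self) -> str:
        # Assumed policy: the most recent correction wins; otherwise use the draft.
        return self.corrections[-1] if self.corrections else self.draft


def seed_target_page(sentences: List[str],
                     translate: Callable[[str], str]) -> List[Segment]:
    """Create rough target-language content from stable source sentences."""
    return [Segment(source=s, draft=translate(s)) for s in sentences]


def correct(page: List[Segment], index: int, text: str) -> None:
    """Record one user's correction for the segment at `index`."""
    page[index].corrections.append(text)


def toy_translate(sentence: str) -> str:
    # Hypothetical stand-in for an imperfect MT system: it only tags the input.
    return f"[rough] {sentence}"


page = seed_target_page(["Hello world.", "Wikis are collaborative."], toy_translate)
correct(page, 0, "Hola mundo.")
clean_text = [seg.current() for seg in page]
```

Keeping the source sentence alongside every draft and correction mirrors the abstract's emphasis on side-by-side multilingual editing and review; a real system would also need the coordination mechanisms and linguistic tools the paper discusses.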