In this paper, we address two related problems in multimodal local search applications on mobile devices: first, correctly displaying business names, and second, harvesting language model training data from an inconsistently labeled corpus. We present a quantitative investigation of the impact of common text normalization and language model training procedures. Our proposed language modeling framework eliminates the need for inverse text normalization, or "pretty print," while achieving superior accuracy. We also demonstrate that the same framework automatically salvages, or cleans up, dirty language model training data. Our new language model is 25% more accurate and 25% smaller.