The rapid urbanization has motivated extensive research on urban computing. It is critical for urban computing tasks to unlock the power of the diversity of data modalities generated by different sources in urban spaces, such as vehicles and humans. However, we are more likely to encounter the label scarcity problem and the data insufficiency problem when solving an urban computing task in a city where services and infrastructures are not ready or just built. In this paper, we propose a FLexible multimOdal tRAnsfer Learning (FLORAL) method to transfer knowledge from a city where there exist sufficient multimodal data and labels, to this kind of cities to fully alleviate the two problems. FLORAL learns semantically related dictionaries for multiple modalities from a source domain, and simultaneously transfers the dictionaries and labelled instances from the source into a target domain. We evaluate the proposed method with a case study of air quality prediction.