Information about urban air quality, e.g., the concentration of PM2.5, is of great importance to protect human health and control air pollution. While there are limited air-quality-monitor-stations in a city, air quality varies in urban spaces non-linearly and depends on multiple factors, such as meteorology, traffic volume, and land uses. In this paper, we infer the real-time and fine-grained air quality information throughout a city, based on the (historical and real-time) air quality data reported by existing monitor stations and a variety of data sources we observed in the city, such as meteorology, traffic flow, human mobility, structure of road networks, and point of interests (POIs). We propose a semi-supervised learning approach based on a co-training framework that consists of two separated classifiers. One is a spatial classifier based on an artificial neural network (ANN), which takes spatially-related features (e.g., the density of POIs and length of highways) as input to model the spatial correlation between air qualities of different locations. The other is a temporal classifier based on a linear-chain conditional random field (CRF), involving temporally-related features (e.g., traffic and meteorology) to model the temporal dependency of air quality in a location. We evaluated our approach with extensive experiments based on five real data sources obtained in Beijing and Shanghai. The results show the advantages of our method over four categories of baselines, including linear/Gaussian interpolations, classical dispersion models, well-known classification models like decision tree and CRF, and ANN.