Research on Spatialization of Urban Area Based on Deep Learning

This paper takes Zhengzhou City, Henan Province, as the research area and uses 2017 night light data (NLD) high-resolution remote sensing imagery as the data source. Two supervised approaches, a support vector machine (SVM) and deep learning, were used for classification. For deep learning, two semantic segmentation network models were selected, the fully convolutional network (FCN) and U-Net, to classify the source data and to analyze the effect of different semantic segmentation networks on classification accuracy. The calculated urban areas are 460.34 square kilometers for the SVM algorithm, 447.28 square kilometers for the U-Net model, and 452.57 square kilometers for the FCN model, while the Zhengzhou Municipal Bureau of Statistics announced an urban area of 437.60 square kilometers in 2018. The results showed that, under the same conditions and with similar spectral information, the classification accuracy was 95.06% for the SVM algorithm, 97.83% for the U-Net model, and 96.69% for the FCN model. The U-Net model gave better classification results in areas with heavily mixed features, and the areas from both deep learning semantic segmentation models were closer than the SVM result to the figure released by the Zhengzhou Municipal Bureau of Statistics.


Introduction
Domestic and international studies of night light data mainly focus on socio-economic estimation and urban research. Night light data lend themselves well to interdisciplinary work and make it possible to explore new research fields [1].
To address the salt-and-pepper noise problem of traditional pixel-based classification of high-resolution images, object-oriented classification has in recent years become the main approach to high-resolution remote sensing image classification [2]. Xie et al. enhanced the time series of DMSP/OLS night-time light (NTL) data and used object-based thresholding to update and trace urban changes and detect large-scale urban change (Xie Y et al., 2017). For night light data, this project selects the support vector machine (SVM) algorithm to extract the urban built-up area. The extracted built-up areas are compared and validated against Landsat 8 high-resolution data, and the built-up area obtained by the SVM algorithm is corrected and analyzed (Chen Z et al., 2017). Chen et al. used the night-time light (NTL) intensity recorded by satellite sensors to identify city centers by developing a local contour tree method, and delineated the corresponding boundaries to determine their spatial relationship with the Shanghai metropolitan area [3].
On urban expansion, Zhang Shaonan et al. used GIS to analyze the expansion characteristics of Zhengzhou from 1996 to 2016 with ISO-fan analysis, compactness, the expansion intensity index, and a quantitative model of gravity-center migration [4].
Deep learning has become a focus in the development of big data and artificial intelligence on the Internet. The biggest difference between deep learning and traditional machine learning methods is that it learns features automatically from large sample data sets rather than relying on manual design. In addition, the deep network structure gives it strong learning and representation ability [5]. The convolutional neural network (CNN) is a representative deep learning algorithm; because it extracts features and classifies automatically, it has significant advantages in remote sensing [6][7][8]. At present, the fully convolutional network (FCN), a very important network model in semantic segmentation research, is widely used on high-resolution remote sensing images [9][10][11][12][13][14][15][16]. FCN improves on traditional convolutional neural network models (AlexNet, VGG-Net, GoogLeNet, etc.) by replacing the fully connected layers, which contain a huge number of parameters, with convolution layers. It accepts input images of any size and produces output of the same size as the input, making it a true end-to-end network [17]. Many scholars have modified the FCN model. Badrinarayanan et al. proposed the SegNet network, which follows the FCN approach to image semantic segmentation. The network combines an encoder-decoder structure with skip connections, which lets the model obtain a more accurate output feature map and more accurate classification results with limited training samples [18]. Ronneberger et al. proposed the U-Net network based on FCN. The most important modification in this network is the large number of feature channels in the upsampling part, which allows the network to propagate context information to higher-resolution layers. The expanding path and contracting path are symmetrical, forming a U-shaped structure, which enables the network to train on fewer images and to segment more accurately [19]. However, most of these studies consider a single network model. Which model gives better results for a given set of image samples remains an open question. For this reason, this paper analyzes the influence of different network models on the classification of high-resolution remote sensing images.

Research Area
Zhengzhou lies between 34°16′ and 34°58′ north latitude and between 112°42′ and 114°14′ east longitude, on the plain of the Yellow River, and has a typical temperate monsoon climate with four distinct seasons. It is the provincial capital and economic center of Henan, has convenient transportation, and is a hub of China's railway network [20]. Since 1992, Zhengzhou's industry has developed rapidly. In particular, as a large number of enterprises such as Foxconn settled in Zhengzhou, a large number of migrants have moved to the city. Frequent human activity has led to an increasingly significant heat island effect. The population of Zhengzhou reached 9.51 million in 2017, with a population density of 1327 people per square kilometer.

Data
NPP night light data: NPP night light data for 2017 were used, obtained from NGDC. The NPP satellite has been in official use since April 2012. NPP nighttime light data are not filtered for noise interference other than flare, but the sensor's main advantage is its wide-angle radiation detector [21]: because the detector can eliminate light oversaturation, image clarity is much improved. NPP data are released monthly, starting in April 2012. When processing the data, the NPP nighttime light data must be reprojected to the Albers equal-area projection. This study uses NPP-VIIRS nighttime remote sensing data to extract the approximate extent of Zhengzhou City, which can then be fitted against the built-up area of Zhengzhou to obtain a correlation. The 2017 MODIS data can be used as a mask to clip the 2017 NPP data and better eliminate its background noise. The nighttime data for 2017 are shown in Figure 2.

Auxiliary Data
The auxiliary data in this paper are population, population density, urbanization rate, and similar statistics from the Zhengzhou Municipal Bureau of Statistics. The relationship between nighttime light and population is obtained by computing the average DN value of the nighttime light data [22]. The Zhengzhou Municipal Bureau of Statistics published the relationship between Zhengzhou's population, built-up area, urbanization rate, and population density and the average DN value of DMSP/OLS nighttime light data.
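As a minimal sketch of the average DN computation mentioned above, the following averages DN values over the pixels selected by a mask. The function name `mean_dn`, the 3x3 grid, and the mask are illustrative assumptions, not values or code from this study.

```python
def mean_dn(dn_grid, mask):
    """Average DN over the pixels where the mask is truthy."""
    values = [
        dn
        for dn_row, mask_row in zip(dn_grid, mask)
        for dn, keep in zip(dn_row, mask_row)
        if keep
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)


# Toy 3x3 grid of DN values (illustrative numbers, not from the study).
dn_grid = [
    [0.0, 2.0, 4.0],
    [1.0, 8.0, 6.0],
    [0.0, 3.0, 0.0],
]
# Hypothetical mask: 1 marks pixels inside the region of interest.
mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]

city_mean_dn = mean_dn(dn_grid, mask)
print(city_mean_dn)
```

In practice the grid would be the clipped night light raster and the mask the administrative boundary or built-up extent; how those are loaded is outside this sketch.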

Methodology
First, the 2017 night light data for Zhengzhou City are obtained. After clipping, denoising, and correction, two supervised classification approaches (SVM and deep learning) are applied to the data to obtain classified results. For deep learning, two semantic segmentation models (U-Net and FCN) are used. The classified results from all algorithms are then compared with the annual built-up area to determine which is most accurate.
The technical roadmap for this research is shown in Figure 3.
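The final comparison step needs an area figure for each classified image. One way to derive it, sketched here under assumptions, is to count the pixels labelled urban and multiply by the ground area of one pixel, which is meaningful when the raster is in an equal-area projection. The function name `urban_area_km2`, the label convention (1 = urban), and the 500 m pixel size are hypothetical and not taken from this paper.

```python
def urban_area_km2(class_map, pixel_size_m, urban_label=1):
    """Area covered by urban pixels, assuming square pixels in an equal-area projection."""
    urban_pixels = sum(row.count(urban_label) for row in class_map)
    pixel_area_km2 = (pixel_size_m / 1000.0) ** 2
    return urban_pixels * pixel_area_km2


# Toy classification result: 1 = urban, 0 = non-urban (illustrative only).
class_map = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

# Hypothetical 500 m pixels: 6 urban pixels * 0.25 km2 each.
area = urban_area_km2(class_map, pixel_size_m=500.0)
print(f"{area:.2f} km2")
```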

SVM Algorithm
A support vector machine (SVM) is a supervised machine learning algorithm used for classification and regression analysis. In binary classification, it assigns each sample to one of two categories by finding the decision boundary that separates the two classes with the widest possible margin. The classification process of the SVM algorithm is shown in Figure 5.
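To illustrate the margin idea, here is a minimal sketch of a linear soft-margin SVM trained by subgradient descent on the hinge loss. It is a simplification for illustration: the paper does not state which kernel, solver, or software was used, and the function names, hyperparameters, and toy feature values below are all assumptions.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Soft-margin linear SVM trained by subgradient descent on the hinge loss."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization step: a smaller ||w|| corresponds to a wider margin.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Sample lies inside the margin or is misclassified:
                # move the boundary away from it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (e.g. urban) or -1 (e.g. non-urban) for one feature vector."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy standardized two-feature samples (hypothetical, not from the study).
points = [(-2.0, -1.5), (-1.5, -2.0), (-1.0, -1.0),
          (1.0, 1.0), (2.0, 1.5), (1.5, 2.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
print(predict(w, b, (3.0, 3.0)), predict(w, b, (-3.0, -3.0)))
```

For real imagery a library implementation with a tuned kernel would normally be preferred; the sketch only shows how the hinge loss pushes samples outside the margin.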

Semantic Segmentation Network Classification Method
Remote sensing image classification based on a deep semantic segmentation network consists of two main parts: training the network model and predicting the classification. The network model is trained on the training data set, and the hyperparameters are adjusted to optimize the model until it converges. Finally, the trained model is applied to the test set to obtain the classification results. The classification process is shown in Figure 6.

FCN (Fully Convolutional Network) Model
In the FCN classification model, all the fully connected layers of a traditional CNN are replaced with convolution layers, and the feature maps are upsampled by deconvolution. The model mainly consists of convolution layers, activation layers, and deconvolution layers. A convolution layer slides a convolution kernel over the image as a template and computes the value at the point corresponding to the center of the template. An activation layer applies an activation function to turn the linear output of the previous layer into a non-linear one, which strengthens the network's representation ability. A deconvolution layer performs bilinear upsampling of a feature map, implemented as a convolution, and restores it to the size of the original image. FCN uses a skip architecture that combines deep semantic information with shallow appearance information, assigns a semantic label to each pixel in the image, and obtains accurate and precise classification results. The FCN architecture for 2D segmentation is shown in Figure 7.
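The three layer types described above can be sketched on a single-channel toy image. This is an illustration under assumptions, not the network used in this paper: the kernel and image values are made up, ReLU stands in for the unspecified activation function, and direct bilinear interpolation stands in for the learned deconvolution layer.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + c] * kernel[a][c]
                for a in range(kh) for c in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Activation layer: keep positive responses, zero out the rest."""
    return [[max(0.0, v) for v in row] for row in fmap]


def bilinear_upsample(fmap, out_h, out_w):
    """Resize a feature map by bilinear interpolation (corners aligned)."""
    in_h, in_w = len(fmap), len(fmap[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        fy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            fx = x - x0
            top = fmap[y0][x0] * (1 - fx) + fmap[y0][x1] * fx
            bottom = fmap[y1][x0] * (1 - fx) + fmap[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# Toy 4x4 image and 3x3 kernel (illustrative values only).
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 4.0, 8.0, 0.0],
    [0.0, 8.0, 4.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
kernel = [
    [1.0, 0.0, -1.0],
    [0.0, 0.0, 0.0],
    [-1.0, 0.0, 1.0],
]

features = conv2d_valid(image, kernel)        # 2x2 feature map
activated = relu(features)                    # negative responses removed
restored = bilinear_upsample(activated, 4, 4) # back to the input size
```

The output of the last step has the same height and width as the input, which is what lets a network of this kind assign a label to every pixel.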

U-Net Model
For accurate localization, the U-Net model combines the high-resolution features extracted along the contracting path with the new feature maps produced during upsampling, preserving as much as possible of the important feature information that would otherwise be lost in downsampling. To make the network run more efficiently, the structure drops the fully connected layers, which greatly reduces the number of training parameters, and its U-shaped structure retains the information in the image very well. The structure of the U-Net network model is shown in Figure 8.
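The combination of contracting-path features with upsampled features can be sketched as a channel-wise concatenation. This is a simplified illustration: nearest-neighbour upsampling stands in for the learned up-convolution of the original U-Net, no cropping is performed, and the function names and feature values are hypothetical. Feature maps are stored as `[channel][row][column]`.

```python
def upsample_nearest_2x(channels):
    """Double the height and width of every channel by repeating pixels."""
    return [
        [[v for v in row for _ in range(2)] for row in ch for _ in range(2)]
        for ch in channels
    ]


def concat_channels(encoder_feats, decoder_feats):
    """Skip connection: stack encoder and decoder channels of equal spatial size."""
    enc_shape = (len(encoder_feats[0]), len(encoder_feats[0][0]))
    dec_shape = (len(decoder_feats[0]), len(decoder_feats[0][0]))
    if enc_shape != dec_shape:
        raise ValueError(f"spatial sizes differ: {enc_shape} vs {dec_shape}")
    return encoder_feats + decoder_feats


# Toy features: 2 encoder channels at 4x4, 3 decoder channels at 2x2.
encoder = [[[float(c)] * 4 for _ in range(4)] for c in range(2)]
decoder = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(3)]

upsampled = upsample_nearest_2x(decoder)       # 3 channels at 4x4
merged = concat_channels(encoder, upsampled)   # 5 channels at 4x4
print(len(merged), len(merged[0]), len(merged[0][0]))
```

The convolutions that follow the concatenation then see both the fine spatial detail from the encoder and the context carried up from the deeper layers.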

Results
Training results for 2017 from the two supervised approaches are shown in Figure 9. Table 1 lists the statistical area from the yearbook and the areas calculated by the different algorithms. Table 2 shows the classification accuracy of each category.
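The area comparison can be made concrete by computing how far each estimate lies from the published figure. The sketch below uses only the four area values stated in the abstract; the function name `deviation` and the choice of relative deviation as the measure are assumptions for illustration.

```python
# Areas in square kilometers, as stated in the abstract.
reference_km2 = 437.60
estimates_km2 = {"SVM": 460.34, "U-Net": 447.28, "FCN": 452.57}


def deviation(estimate, reference):
    """Return (absolute difference, difference as a percentage of the reference)."""
    diff = estimate - reference
    return diff, 100.0 * diff / reference


results = {name: deviation(area, reference_km2)
           for name, area in estimates_km2.items()}

for name, (diff, pct) in results.items():
    print(f"{name}: {diff:+.2f} km2 ({pct:+.2f}%)")

closest = min(results, key=lambda name: abs(results[name][0]))
```

All three estimates are above the published figure: by about 22.74 square kilometers (5.20%) for SVM, 9.68 square kilometers (2.21%) for U-Net, and 14.97 square kilometers (3.42%) for FCN, so U-Net is the closest.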

Discussion
According to Table I and Table II

Conclusion
Based on commonly used supervised classification algorithms and semantic segmentation models, this paper classifies high-resolution night-time remote sensing images. Taking Zhengzhou City, Henan Province, as the research area, two supervised classification approaches, SVM and deep learning, are used for classification and prediction, and the corresponding results are analyzed. Under the same conditions, the overall classification accuracy of U-Net is the highest, followed by FCN and then SVM. The areas calculated by both deep learning semantic segmentation models are closer than the SVM result to the data released by the Zhengzhou Municipal Bureau of Statistics. We therefore conclude that the deep learning results are more accurate than the SVM result; in particular, the U-Net model gave better classification results in areas with heavily mixed features. The results also indicate that further work is needed to strengthen the correlation between adjacent pixels, to consider the influence of different algorithms and network models on classification accuracy for remote sensing images of different sizes, and to provide an accurate and widely applicable image pre-processing workflow for remote sensing image classification based on deep learning.