Research and Application of Machine Learning on Geographic Information System

In the big data era, an information system that can scale out flexibly, store massive volumes of data, and respond quickly to concurrent requests is particularly important. Although mining technologies for structured data are mature, unstructured data is still underused, which wastes valuable data sources. Against this background, this paper adopts machine-learning technologies to build a scalable information system by analyzing geographical landform data.


Introduction
The advent of the "Internet Plus" concept offers a new way of thinking about how to apply Internet technologies to research on a DANXIA information system. The "DANXIA landform" is a special landform that appeared in a specific historical period and might help to reveal the features of crustal evolution. Currently, research on the DANXIA landform mainly includes the collection of geological data, geomorphological data, and physical geographic environment data, together with the analysis of the collected data. When the volume of data reaches a certain scale, the efficiency of data management declines severely as the number of files grows [1][2][3][4]. Meanwhile, the overheads of maintenance and retrieval increase tremendously.
In this paper, we propose a new way of applying machine-learning technologies to model the landform data, and we establish an information system in which information is easy to store and which is easy to extend.

Device Virtualization.
A neural network model designed for visual pattern recognition was proposed and named the Neocognitron. A typical neural network model is composed of a large number of neurons.
The neural network considered here consists of three layers. The first layer takes in the input signals. The second layer is composed of three neurons, each of which accepts all the signals from the first layer and outputs one signal. The third layer consists of one neuron, which accepts all the signals from the second layer and outputs the final result. The outputs of the neurons on the second layer are given by formulas (1), (2), and (3). The final result is given by formula (4).
$a_i^{(L)}$ -- The output signal of the i-th neuron on the L-th layer.
$\Theta_{ji}^{(L)}$ -- The influence factor with which the output signal of the i-th neuron on the L-th layer acts on the input signal of the j-th neuron on the (L+1)-th layer.
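
Formulas (1) to (4) are not reproduced in this copy of the text. As a sketch of the form they plausibly take, the block below writes out the forward pass of the three-layer network described above. It assumes that $a_i^{(L)}$ denotes the output of the i-th neuron on layer L, that $\Theta_{ji}^{(L)}$ denotes the influence factor from neuron i on layer L to neuron j on layer L+1, that the first layer has n input signals, that $g$ is an activation function such as the sigmoid, and that bias terms are omitted. None of these choices is confirmed by the source.

```latex
\begin{align}
a_1^{(2)} &= g\Big(\sum_{i=1}^{n} \Theta_{1i}^{(1)} a_i^{(1)}\Big) \tag{1}\\
a_2^{(2)} &= g\Big(\sum_{i=1}^{n} \Theta_{2i}^{(1)} a_i^{(1)}\Big) \tag{2}\\
a_3^{(2)} &= g\Big(\sum_{i=1}^{n} \Theta_{3i}^{(1)} a_i^{(1)}\Big) \tag{3}\\
h_\Theta(x) = a_1^{(3)} &= g\Big(\sum_{i=1}^{3} \Theta_{1i}^{(2)} a_i^{(2)}\Big) \tag{4}
\end{align}
```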

Cost Function.
The cost function is another crucial concept in neural network training [5]. It measures the gap between a specific solution and the best solution to a given problem. Essentially, training a neural network means minimizing the cost function. The cost function of the neural network is given by formula (5).
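
To make the idea of a cost function concrete, the following minimal Python sketch evaluates one possible cost for the three-layer network described earlier. It assumes a sigmoid activation and a cross-entropy cost averaged over m training samples; the source does not state which cost is used, so this form, the function names, and the toy data are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Logistic activation (assumed; the source does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, theta1, theta2):
    """Forward pass: inputs -> three hidden neurons -> one output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in theta1]
    return sigmoid(sum(w * h for w, h in zip(theta2, hidden)))

def cost(samples, theta1, theta2):
    """Cross-entropy cost averaged over the m samples (assumed form)."""
    m = len(samples)
    total = 0.0
    for x, y in samples:
        h = forward(x, theta1, theta2)
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / m

# Toy data: two samples with two input signals each (illustrative only).
samples = [([0.0, 1.0], 1), ([1.0, 0.0], 0)]
theta1 = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # weights into the 3 hidden neurons
theta2 = [0.0, 0.0, 0.0]                       # weights into the output neuron
zero_weight_cost = cost(samples, theta1, theta2)
```

With all weights set to zero the network outputs 0.5 for every input, so the cost equals ln 2 (about 0.693). Training would adjust theta1 and theta2 to push this value down.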

System Design
The DANXIA landform information system is composed of several subsystems. The first problem to solve when developing the system is landform data modeling. DANXIA landform data has the following features: large data volume, few connections among data items, low consistency requirements, and low security requirements. For these reasons, a NoSQL database is adopted for this system.
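
The fit between these data features and a document-oriented store can be illustrated with a short Python sketch. The record below is hypothetical: every field name and value is a placeholder chosen for illustration, not taken from the actual system. The point is that a new attribute can be attached to one record without a schema change, which suits loosely connected landform data.

```python
import json

# Hypothetical document for one scenic spot; all fields are illustrative.
spot = {
    "name": "Example Danxia Spot",
    "location": {"lat": 0.0, "lon": 0.0},  # placeholder coordinates
    "geology": {"rock_type": "placeholder"},
    "images": [{"file": "spot_001.jpg", "label": "example-label"}],
}

def add_field(document, key, value):
    """Return a copy of the document with one extra field added."""
    updated = dict(document)
    updated[key] = value
    return updated

# A new attribute is added to this record only; other records are unaffected.
extended = add_field(spot, "visitor_notes", ["steep trail"])
serialized = json.dumps(extended, sort_keys=True)
```

The serialized form is the kind of self-contained JSON-like document that a document-oriented database stores as one unit.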
The second problem is recognizing the DANXIA scenic spot to which a given picture belongs. The system adopts a convolutional neural network to categorize the pictures, and thus recognizes different places from the given pictures.
The last part is recommending appropriate scenic spots to users. For this purpose, an optimized collaborative filtering algorithm is used, which takes into account influence factors other than the scenic spot itself when decomposing the scenic spot factors.

Implementation
This part focuses on the implementation details of the data persistence layer and elaborates on the implementation of the automatic recognition and scenic spot recommendation functions. The guest OS configures the remote PCI devices using a remote mounting strategy, transparent to user applications. We have also demonstrated the workflow of the whole system to show the feasibility of our design. The implementation of the system prototype is currently in progress, and the experiments needed to test the performance and practicality of our design will be carried out in the near future.

Information System Architecture
Node.js is used as the server. Node.js is a platform created to enable fast and scalable web applications.
Express is the web application framework used on top of Node.js. Its functions include a powerful request routing service, a reliable Internet environment, and flexible control over the state of the web application through environment parameters.
AngularJS is an MVC framework on the client side. It offers two-way data binding, which automatically synchronizes data between the View and the Model. It also provides dependency injection, which reduces coupling in the code.
MongoDB is an open-source, document-oriented NoSQL database. In a document-oriented database, all data is stored as independent units in the form of documents, which makes data storage more flexible.

Data acquisition and Pre-processing
The View Spot Image Recognition System loads all the view spot images from the MongoDB database into a set of original images. Each picture in the set is tagged with the DANXIA landform it belongs to. The original images are then scaled up or down to fit a specified resolution. Finally, the RGB color pictures are converted to grayscale to reduce the dimensionality of the input data without losing much of its information content.
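
The two pre-processing steps can be sketched in plain Python as follows. The sketch assumes nearest-neighbor resampling and the common luminance weights 0.299, 0.587, and 0.114 for grayscale conversion; the source specifies neither, and the function names are illustrative. Images are represented as rows of (R, G, B) tuples.

```python
def resize_nearest(image, new_h, new_w):
    """Nearest-neighbor resize of an image stored as rows of pixels."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

def to_gray(image):
    """Convert rows of (R, G, B) tuples to rows of luminance values."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in image
    ]

# A 2x2 toy image scaled up to 4x4, then converted to one channel.
rgb = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
gray = to_gray(resize_nearest(rgb, 4, 4))
```

Converting three color channels to one cuts the number of input values per image by a factor of three, which is the dimensionality reduction referred to above.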

Model architecture and relevant parameters
The View Spot Image Recognition System recognizes images automatically by training a convolutional neural network that categorizes the scenic images. The figure below shows the model architecture that will be used.
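
As a reminder of what the building blocks of such a network compute, the sketch below implements one convolution, one ReLU activation, and one 2x2 max-pooling step in plain Python. It is not the architecture of the system: the kernel, the layer sizes, and the choice of ReLU and max pooling are assumptions made for illustration.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(fmap):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2(fmap):
    """2x2 max pooling with stride 2; odd trailing rows or columns are dropped."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]

# A 5x5 grayscale toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges
pooled = max_pool2(relu(conv2d(image, edge_kernel)))
```

The pooled feature map is non-zero only in the column where the edge lies, which shows how a learned kernel turns raw pixels into a compact feature that later layers can classify.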

Optimized View Spot recommendation System
Because the traditional collaborative filtering algorithm cannot represent a user's degree of preference for a specific scenic spot, an optimized collaborative filtering algorithm is proposed.
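
The details of the optimized algorithm are not given here. One common way of bringing additional influence factors into collaborative filtering is matrix factorization with bias terms, in which a predicted rating is the sum of a global mean, a user bias, a scenic spot bias, and the inner product of latent factor vectors. The Python sketch below illustrates that general technique with stochastic gradient descent; it is not necessarily the algorithm used in the system, and the function name, hyperparameters, and ratings are all hypothetical.

```python
import random

def train_biased_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02,
                    epochs=500, seed=0):
    """Fit rating ~ mu + bu[u] + bi[i] + p[u].q[i] by SGD; return a predictor."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    bu = [0.0] * n_users                               # user biases
    bi = [0.0] * n_items                               # scenic spot biases
    p = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]

    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(a * b for a, b in zip(p[u], q[i]))

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(k):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict

# Hypothetical (user, spot, rating) triples on a 1-to-5 scale.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 1)]
predict = train_biased_mf(ratings, n_users=3, n_items=3)
unseen = predict(2, 0)  # estimated rating of user 2 for a spot not yet rated
```

The bias terms absorb effects that do not depend on the interaction between a particular user and a particular spot, such as a user who rates everything generously, so the latent factors are left to capture preference itself.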

Evaluation
The experiments are conducted in two parts:
In formula (5):
$J(\Theta)$ -- The cost function, which takes all the parameters of the neural network as variables;
$m$ -- The number of training data sets;
$y^{(i)}$ -- The result of the i-th set of training data;
$h_\Theta$ -- The corresponding compound function of the input and output of the neural network;
$x^{(i)}$ -- The input of the i-th set of training data.
[...] has been constructed based on ExpressEther.