Automatic Identification System (AIS) based Ship Heading Prediction using Artificial Neural Network and Wide Genetic Algorithm

The International Maritime Organization (IMO) requires ships of more than 300 gross tonnage to carry an Automatic Identification System (AIS) onboard. AIS is used as a navigational aid to support safe ship operation, including collision avoidance, vessel tracking, and accident investigation. An AIS transponder is installed onboard the ship, and the data it transmits are received by a base station acting as the AIS receiver. The transponder provides information such as a unique identification, position, course, and speed, which can then be used in many applications. Ship movement information obtained from AIS, such as position and heading, can serve as an early warning to avoid collisions. In some cases, however, the ship heading is missing because of transponder incompatibility or an incorrect transponder setup. An extrapolation method can be used to predict the ship's heading, but it has the disadvantage that it considers only the location of the ship and ignores the distinctive areas near the port. This paper aims to develop a method that predicts the ship heading from AIS historical data of the ship's movement using Artificial Intelligence (AI). An Artificial Neural Network (ANN) is chosen as the prediction model. Two preprocessing steps, down sampling and sliding window, are applied before training. The Wide Genetic Algorithm (WGA) is used to train the ANN. The resulting WGA-ANN computes the ship heading by predicting the next geolocation of the ship.


Introduction
As an archipelagic country, Indonesia consists of large and small islands, and sea transportation carries a larger share of the load than other modes of transport. To help assure safe operation, ships are monitored by the Vessel Tracking System (VTS) available in ports. To monitor and identify the location of ships, the International Maritime Organization (IMO) requires ships of more than 300 gross tonnage to be fitted with an Automatic Identification System (AIS).
AIS, as a digital navigation system, has been used in many applications to solve problems that arise from ship operation. Beyond assuring safe navigation, AIS data can be used to predict the dispersion of emissions from a ship's main and auxiliary engines [1]. To support the Port State Control Officer (PSCO), who must decide whether a ship entering the port is subject to inspection, AIS data have been used to develop inspection scores [2]. The inspection scores help the PSCO prioritize which ships should be considered for further inspection instead of selecting them randomly. Several researchers have also used AIS data to protect floating and subsea facilities in the water area. Mulyadi et al. used AIS data in a risk assessment for subsea facilities in the Madura Strait by defining the safe distance for a ship dropping anchor [3]. AIS historical data from the area around Tanjung Perak harbor, Surabaya, have been used to analyze the probability that a ship is on a collision course while maneuvering along its trajectory [4].
AIS data are of two types: static and dynamic. Static data consist of the ship's unique information, and dynamic data consist of information that changes with the movement of the ship. Dynamic data are often not filled in by the crew because they do not understand how to use and manage the AIS [5]. One dynamic field that is often left unfilled is the ship heading. In some cases, collisions between ships occur because the heading is invalid or empty. The ship's heading can be predicted by extrapolation, but that method disregards the region the ship is in, for example the outer harbor area, the harbor route, the harbor parking area, and the harbor gate. The prediction can also be inaccurate because the heading is sensitive to the new ship position [6].
AIS historical data is a type of big data [7] and needs preprocessing before it can be analyzed with Machine Learning (ML). Two preprocessing steps are used to fit the AIS data to ML: Down Sampling (DS) and Sliding Window (SW). DS reduces duplication in the AIS data, and SW steps through the AIS data according to a window size. ML uses computational intelligence to perform complex calculations and has been widely used for image pattern recognition [8], car pattern classification [9], and signal prediction [10]. The Artificial Neural Network (ANN) is a popular prediction method that is trained on input data and mathematical modeling. ML methods have also been applied to prediction problems in marine technology such as tidal waves [11], wave height [12], and ship heading [13]. However, ANNs have typically been trained with Backpropagation (BP) [14]. BP has a weakness in training: starting from random weight values, the ANN can settle on weights close to a local optimum [15]. This weakness can be addressed by replacing BP with a heuristic method, which is harder to trap in a local optimum. The Genetic Algorithm (GA) is one such heuristic method. GA maintains a population of agents that can explore many solutions in the solution space, which helps it avoid local optima. GA can therefore be used to optimize the ANN's weight values, that is, to carry out the training process. Modified GAs aim to increase both exploitation and exploration of the solution space. One of them is the Wide Genetic Algorithm Artificial Neural Network (WGA-ANN), which has been used to predict soil liquefaction [16]. WGA is chosen to train the ANN here because it is designed to keep the solution from being trapped in a local optimum, so the solution obtained from WGA is expected to be globally optimal.
In this paper, the heading prediction is calculated from historical data of previous positions (latitude and longitude). DS is employed to reduce duplication in the latitude and longitude data, and SW is used to feed the latitude and longitude into the WGA-ANN. During training, WGA is intended to keep the solution from being trapped in a local optimum and to balance the weights of the generated ANN. The output of the WGA-ANN is the new latitude and longitude, from which the predicted future ship heading is obtained.

Automatic Identification System
AIS transmits a message every 6 seconds, which is received by an AIS receiver. AIS data are divided into static and dynamic data (Table I). Static data are unique information that does not change whether or not the ship is moving; they are entered manually into the AIS transponder when it is first installed. Dynamic data change and are updated with the movement of the ship. Dynamic data recorded by the VTS are classified as big data, streaming data, or real data [18]. The ship position is described by parameters such as time, accuracy, latitude, and longitude. The VTS receives the stream of ship location data continuously.

Sliding Window
Accumulated AIS data grow from small data into big data, so preprocessing is needed to reduce or partition the data. Two processes are used: DS [19] and SW [20]. DS reduces duplication in the AIS data, which arises because AIS data stream continuously into VTS storage. SW steps through the AIS data according to a window size so that the data can be fed into the ANN. The choice of window size affects the ANN training process. After SW, the AIS data fall into two sets: actual data and training data. The actual data serve as the ground truth against which the prediction result is compared. The training data are used to construct the ANN model in the training process.
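The two preprocessing steps can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes that DS drops consecutive repeated positions and that SW pairs each window of N positions with the position that follows it. The function names, the window size, and the sample coordinates are hypothetical and are not taken from Tables II and III.

```python
from typing import List, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in degrees


def down_sample(track: List[Position]) -> List[Position]:
    """Assumed DS rule: drop a position if it repeats the previous one."""
    reduced: List[Position] = []
    for pos in track:
        if not reduced or pos != reduced[-1]:
            reduced.append(pos)
    return reduced


def sliding_window(track: List[Position], window_size: int):
    """Assumed SW rule: each input is window_size consecutive positions,
    flattened to [lat, lon, lat, lon, ...]; the target is the next position."""
    inputs, targets = [], []
    for start in range(len(track) - window_size):
        window = track[start:start + window_size]
        inputs.append([value for pos in window for value in pos])
        targets.append(track[start + window_size])
    return inputs, targets


# Made-up sample track with two stationary (duplicated) reports.
track = [(-7.20, 112.72), (-7.20, 112.72), (-7.19, 112.73),
         (-7.18, 112.74), (-7.18, 112.74), (-7.17, 112.75)]
reduced = down_sample(track)
X, y = sliding_window(reduced, window_size=2)
```

With a window size of 2, each row of `X` holds two consecutive positions and the matching entry of `y` is the position the network would be asked to predict.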

Wide Genetic Algorithm Artificial Neural Network
ANN has been a popular method among researchers for solving prediction problems [21]. The ANN architecture consists of layers and neurons. There are three kinds of layer: input, hidden, and output. The neurons combine the latitude and longitude with the weight values. Each layer has neurons, and the neurons are linked to one another. The activation function transforms a neuron's value into its output. However, an ANN trained by BP has a limitation in the training process. This limitation can be addressed with the classical GA, which has been applied to overcome limitations that may appear when using ANN [22]. The classical GA consists of three operators: selection, crossover, and mutation [23]. Steady State Genetic Algorithm Worst Replacement (SSGAWR) is a type of GA with a modified selection operator. SSGA tries to improve the GA population by replacing only the bad genes with good genes [24]. The Wide Genetic Algorithm Artificial Neural Network (WGA-ANN) is inspired by SSGAWR. It modifies all three GA operators (selection, crossover, and mutation) to balance exploration and exploitation in the solution space. For liquefaction prediction, WGA-ANN was reported to give a 1.5-fold improvement over BP and GA, with a Median Absolute Percentage Error (MdAPE) 4% lower than BP and GA [16].
The WGA-ANN concept is shown in Figure 1. The agents hold the ANN weights and are generated when data are fed into the ANN. Each agent is divided into genes, and each gene represents a weight. Each agent has a fitness value derived from the prediction error between the actual and predicted values.
The selection operator is modified from N-Tournament Selection into Wide-N-Tournament Selection, which selects all the good agents and several bad agents to improve the diversity of the population.
BLX-α multi-parent crossover is applied in the WGA-ANN to obtain a better child by combining the two best parents. Aggregate Mate Pool and Direct Mutation Recombination is proposed to increase the mutation opportunities of each gene so as to produce the best gene. The Wide-N-Tournament Selection flow is shown in Figure 2. Each agent has genes, represented by weights, and a fitness value. The list of genes is selected using the Wide-N-Tournament concept. First, the genes are sorted in ascending order. Genes are then selected randomly from the list. The selected group of 4 genes is further narrowed by binary tournaments until two candidates for the parents' pool and a worst agent have been selected. Wide-N-Tournament helps balance the outcome when the gap between fitness values is large. In this concept, the worst agent is used as input to the other operators (crossover and mutation). The results of Wide-N-Tournament Selection are used for BLX-α multi-parent crossover and for aggregate mate pool and direct mutation. Both operator modifications aim to preserve the diversity of the population, so that it does not consist only of the best agents. The diversity of the population lets WGA-ANN balance exploitation and exploration.
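To make the agent representation concrete, the sketch below treats an agent as a flat list of weights for a single-hidden-layer network and scores it by its prediction error. The layer layout, the tanh activation, and the use of mean absolute error as the fitness are assumptions made for illustration; the text above states only that each gene is a weight and that fitness comes from the prediction error.

```python
import math
from typing import List, Sequence


def n_weights(n_in: int, n_hidden: int, n_out: int = 2) -> int:
    """Number of genes: weights plus one bias per hidden and output neuron."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


def forward(agent: Sequence[float], x: Sequence[float],
            n_hidden: int, n_out: int = 2) -> List[float]:
    """Decode the flat agent into layer weights and run one forward pass."""
    n_in = len(x)
    pos = 0
    hidden = []
    for _ in range(n_hidden):
        weights, bias = agent[pos:pos + n_in], agent[pos + n_in]
        # tanh is an assumed activation function, not one named in the text.
        hidden.append(math.tanh(sum(w * v for w, v in zip(weights, x)) + bias))
        pos += n_in + 1
    output = []
    for _ in range(n_out):
        weights, bias = agent[pos:pos + n_hidden], agent[pos + n_hidden]
        output.append(sum(w * h for w, h in zip(weights, hidden)) + bias)
        pos += n_hidden + 1
    return output  # [predicted latitude, predicted longitude]


def fitness(agent, inputs, targets, n_hidden: int) -> float:
    """Assumed fitness: mean absolute error over both outputs (lower is better)."""
    total = 0.0
    for x, target in zip(inputs, targets):
        prediction = forward(agent, x, n_hidden)
        total += sum(abs(p - t) for p, t in zip(prediction, target))
    return total / (len(inputs) * len(targets[0]))


# An all-zero agent predicts (0, 0), so its error equals the mean target magnitude.
zero_agent = [0.0] * n_weights(n_in=4, n_hidden=3)
score = fitness(zero_agent, [[0.1, 0.2, 0.3, 0.4]], [(1.0, 3.0)], n_hidden=3)
```

In practice the inputs would normally be scaled before training; that step is left out here because the text does not describe it.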
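The selection and crossover steps can be sketched as below. This is one possible reading of the description, not the reference WGA implementation: four agents are drawn at random, two binary tournaments pick the two parents, and the agent with the largest error among the four is reported as the worst agent. The crossover is the standard BLX-α blend operator, extended to several parents by taking the per-gene minimum and maximum across them, which is also an assumption. The names `wide_tournament` and `blx_alpha`, the value α = 0.5, and the stand-in fitness are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Agent = List[float]


def wide_tournament(population: Sequence[Agent], fitnesses: Sequence[float],
                    rng: random.Random, sample_size: int = 4) -> Tuple[List[Agent], int]:
    """Draw sample_size agents, run binary tournaments, and return
    (parents, index of the worst sampled agent). Lower fitness = lower error."""
    candidates = rng.sample(range(len(population)), sample_size)
    parents = []
    for a, b in zip(candidates[0::2], candidates[1::2]):
        winner = a if fitnesses[a] <= fitnesses[b] else b
        parents.append(population[winner])
    worst = max(candidates, key=lambda i: fitnesses[i])
    return parents, worst


def blx_alpha(parents: Sequence[Agent], rng: random.Random,
              alpha: float = 0.5) -> Agent:
    """BLX-alpha blend crossover: sample each child gene uniformly from the
    parents' gene range widened by alpha times its width on both sides."""
    child = []
    for genes in zip(*parents):
        lo, hi = min(genes), max(genes)
        spread = hi - lo
        child.append(rng.uniform(lo - alpha * spread, hi + alpha * spread))
    return child


rng = random.Random(0)
population = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(8)]
# Stand-in error values; a real run would use the ANN prediction error.
fitnesses = [sum(abs(g) for g in agent) for agent in population]
parents, worst = wide_tournament(population, fitnesses, rng)
child = blx_alpha(parents, rng)
```

Returning the worst agent alongside the parents mirrors the statement that the worst agent is passed on to the crossover and mutation operators; how those operators use it is not specified here and is left out of the sketch.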

Figure 3: Extrapolation
Aggregate Mate Pool and Direct Mutation creates a new best gene from the best gene while retaining the worst gene that is close to the best gene. This process is repeated until the population converges. The best and worst genes are both kept so that the solution space is searched; together they balance exploitation and exploration. The process stops at convergence, when the new gene is similar to the parent gene. The result of the WGA-ANN training process is the optimal weight values, which minimize the prediction error.

Heading Prediction
The heading prediction concept is divided into two processes: preprocessing and the main process. Preprocessing compiles the AIS historical data, and the main process predicts the heading value while the ANN training process runs until convergence. First, the data are screened to ensure that the data used in ANN training are valid, that is, not null. Preprocessing helps to catch errors before the data are used to train the ANN. Even if the previous data were manipulated and the training result was poor, the data are corrected in the next training process, and WGA-ANN rebuilds the ANN model. In this method, rebuilding the ANN becomes more difficult when the earlier data are better than the later data. WGA is responsible for finding a global optimum when the ANN has been trapped in a local optimum, and it acts as a balancer of the ANN weights.
Preprocessing is divided into two methods, DS and SW. The ship location is illustrated in Figure 4 and comprises latitude, longitude, and a timestamp. The position of the ship is indexed by the timestamp t. As the ship moves, its latitude and longitude are updated, and these are the values to be predicted one or more steps ahead. The VTS stores a list of latitudes and longitudes that may contain repeated values, because the ship position can stay in the same area for a minute or more; such records indicate that the ship is not moving. The ship has moved when the location is updated and the timestamp differs. DS is used to reduce the AIS historical data because the stored list of latitudes and longitudes contains repeated data over at least one day. Tables II and III show the latitude and longitude used to train and test the WGA-ANN. Table II is the training data, to which DS is applied in order to reduce duplicated positions. Table III is the testing data, used to check whether the model trained by WGA-ANN is fit for ship heading prediction. SW steps through the data with window size N from index zero to the last index. SW is illustrated in Figure 5, where the latitude and longitude are grouped into arrays of the window size.
Each latitude and longitude group is split into training data and testing data. The training data are fed into the ANN, and the testing data are used to validate the ANN model. WGA is applied so that the ANN training process can find the optimal weight values, which are the output of the main process. While the input latitude and longitude are processed by the ANN neurons, WGA generates the weight values through wide tournament selection, BLX-α multi-parent recombination, and aggregate mate pool and direct mutation. The fitness value is obtained by calculating the prediction error, that is, the deviation of the predicted latitude and longitude from the ground truth. A new latitude and longitude close to the ground truth confirms that WGA has succeeded in generating the weight values. This process is repeated until the new gene is similar to the parent. The optimal weight values are multiplied by the latitude and longitude inputs, and the ANN outputs the latitude and longitude with a minimal margin of error. The latitude and longitude predicted a few minutes or hours ahead are used to compute the distance from the ship to its next tracked position. The benefit of heading prediction in a real case is illustrated in Figure 6.
Heading prediction computes the length of the ship's route ahead from the latitude and longitude, one or more steps at a time. The first prediction covers only a short route length, while the third prediction can cover a long route length. Each heading prediction can be used to provide an anti-collision early warning to the ship, making the crew aware when another ship comes close.

Figure 6: Anti-Collision
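Once a new position has been predicted, a heading and a route length can be derived from the old and new positions. The sketch below uses the standard initial great-circle bearing and haversine distance formulas on a spherical Earth; the text does not state which formulas the authors used, so this choice, the function names, and the mean Earth radius of 6,371,000 m are assumptions.

```python
import math


def heading_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial bearing from position 1 to position 2, in degrees clockwise
    from true north, in the range [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float,
               radius_m: float = 6371000.0) -> float:
    """Haversine great-circle distance in metres (spherical Earth assumed)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2)
    return 2 * radius_m * math.atan2(math.sqrt(a), math.sqrt(1 - a))


north = heading_deg(0.0, 0.0, 1.0, 0.0)    # moving due north
east = heading_deg(0.0, 0.0, 0.0, 1.0)     # moving due east along the equator
one_degree = distance_m(0.0, 0.0, 0.0, 1.0)
```

Applying `heading_deg` to the last observed position and the predicted position gives a value that could fill an empty heading field, and `distance_m` gives the corresponding route length for that prediction step.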

Conclusion
Heading prediction is based on AIS historical data, namely latitude and longitude. The extrapolation method ignores the area the ship is in, whereas ANN can adapt to a new area. The ANN is trained using WGA. First, the latitude and longitude are preprocessed by down sampling and sliding window to construct the data. Second, WGA generates the optimal weight values during training until convergence is accepted. The ANN model is then used to predict the new latitude and longitude, from which the heading can be computed. The route length is obtained by calculating from the old position to the new position. This paper addresses the empty heading by using WGA-ANN to train the system, which finds the ship's movement pattern from its previous latitude and longitude. The main contribution of this paper is predicting the future heading of the ship as a way to fill in the empty heading that sometimes occurs in AIS data reporting. Furthermore, the list of predicted ship headings can be used to develop new features in the navigation system, such as an alert system to avoid collisions between ships.