Research on UAV Path Planning Based on Deep Learning in Wireless Communication Scenarios

With the development of science and technology, more and more drones are being used in the military and civil fields, and their path planning has attracted the attention of many scholars. Path planning is crucially important when a drone performs its tasks. On this basis, this article studies drone path planning based on deep learning. First, the main technologies of wireless communication and their development trends are introduced; the key technologies include key agreement technology, cooperative interference technology, and confidential coding technology. Then the development of deep learning is reviewed to pave the way for its application, and the current research status of drone path planning is analyzed. Finally, a drone path planning scheme is designed to promote the development of drone technology.

ignored in wireless communication technology with its special technical characteristics. Mobile video broadcasting is a new wireless service, and as the technology improves it is becoming one of the most popular wireless applications.

Key agreement technology
Compared with general communication methods, the most distinctive feature of wireless communication is channel reciprocity. In short, during communication only the two communicating parties can know the specific transfer function of the channel between them, which increases the confidentiality of the communication to a certain extent. In some special cases, the two parties also use this transfer function as the key for transmission, so that confidential communication can be carried out directly without any additional conditions. This is the so-called key agreement technology.
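The idea of deriving a shared key from the reciprocal channel can be illustrated with a minimal sketch. It is not the scheme of any particular system: the median-threshold quantizer, the Gaussian channel gains, the noise level, and all function names are assumptions chosen for illustration. Both parties measure the same channel with small independent noise, while an eavesdropper at another location is assumed to see an independent channel.

```python
import random
import statistics

def quantize(samples):
    """Map each channel-gain sample to one bit: 1 if above the median, else 0."""
    threshold = statistics.median(samples)
    return [1 if s > threshold else 0 for s in samples]

def agreement(bits_a, bits_b):
    """Fraction of positions in which two bit sequences match."""
    return sum(a == b for a, b in zip(bits_a, bits_b)) / len(bits_a)

rng = random.Random(0)
n = 128
noise = 0.01  # assumed measurement noise (standard deviation)

# Reciprocal channel: both legitimate parties observe the same gains.
channel = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Assumption: the eavesdropper observes a statistically independent channel.
eve_channel = [rng.gauss(0.0, 1.0) for _ in range(n)]

alice_bits = quantize([h + rng.gauss(0.0, noise) for h in channel])
bob_bits = quantize([h + rng.gauss(0.0, noise) for h in channel])
eve_bits = quantize(eve_channel)

print("Alice/Bob agreement:", agreement(alice_bits, bob_bits))
print("Alice/Eve agreement:", agreement(alice_bits, eve_bits))
```

In practice the few mismatched bits between the two legitimate parties would still have to be reconciled before the sequence could serve as a key; that step is omitted here.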

Cooperative interference technology
Cooperative interference technology is one of the important technologies for preventing eavesdropping. Generally speaking, eavesdropping is prevented only when the capacity of the main channel is greater than the capacity of the eavesdropping channel. In view of this, in order to improve the confidentiality of wireless communications, technicians approach the problem from the opposite direction: they adopt cooperative interference technology to raise the noise level of the eavesdropping channel, which achieves the same effect as blocking the eavesdropper.
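The capacity condition above can be made concrete with a small numerical sketch. It assumes Gaussian channels, for which the capacity per channel use is log2(1 + SNR), and it assumes that the cooperative jammer adds noise only at the eavesdropper. The function names and the power values are hypothetical.

```python
import math

def capacity(snr_linear):
    """Capacity of a Gaussian channel in bits per channel use."""
    return math.log2(1.0 + snr_linear)

def secrecy_capacity(signal_power, main_noise, eve_noise, jamming_at_eve=0.0):
    """Main-channel capacity minus eavesdropping-channel capacity, floored at zero.

    Assumption: the cooperative jamming power is received only by the
    eavesdropper and simply adds to its noise power.
    """
    c_main = capacity(signal_power / main_noise)
    c_eve = capacity(signal_power / (eve_noise + jamming_at_eve))
    return max(0.0, c_main - c_eve)

# Without jamming the eavesdropper has the better channel, so nothing is secret.
print(secrecy_capacity(1.0, main_noise=0.1, eve_noise=0.05))
# With jamming the eavesdropping channel degrades and the difference is positive.
print(secrecy_capacity(1.0, main_noise=0.1, eve_noise=0.05, jamming_at_eve=1.0))
```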

Confidential coding technology
The purpose of confidential coding technology is the same as that of cooperative interference technology: to prevent eavesdropping. In practical applications, however, the code must reach a certain length before the security effect is achieved, and because the security effect varies with the coding length, the security performance lacks stability [4]. Confidential coding technology is therefore less mature than the other technologies, and technicians should study it further.

Wireless Channel
Channels are also referred to as "paths": one-way or two-way paths for transmitting and receiving signals between two points. They can be divided into two categories, wired and wireless.
The communication quality of wireless channels is much worse than that of wired channels. The typical signal-to-noise ratio of a wired channel is about 46 dB (the signal level is about 40,000 times higher than the noise level), and it usually fluctuates by no more than 2 dB. A wireless channel, in contrast, is subject to fading, and the factors that cause fading are related to the environment.
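The figure of 40,000 follows directly from the definition of the decibel as ten times the base-10 logarithm of a power ratio. A short check, with hypothetical function names:

```python
import math

def db_to_power_ratio(db):
    """Convert a value in decibels to a linear power ratio."""
    return 10.0 ** (db / 10.0)

def power_ratio_to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)

print(round(db_to_power_ratio(46)))        # about 39811, i.e. roughly 40,000
print(round(db_to_power_ratio(2), 2))      # a 2 dB fluctuation is a factor of about 1.58
```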
The basic propagation mechanisms of a wireless channel are as follows: ① Direct propagation: the wireless signal propagates in free space; ② Reflection: reflection occurs when electromagnetic waves encounter objects that are much larger than the wavelength, generally the surface of the earth, buildings, and walls; ③ Diffraction: diffraction occurs when the wireless path between the receiver and the transmitter is blocked by a sharp object edge; ④ Scattering: scattering occurs when the wireless path contains objects smaller than the wavelength and the number of such obstacles per unit volume is large. It is caused by rough surfaces, small objects, and other irregular objects such as leaves and lampposts.
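For the first mechanism, propagation in free space, the attenuation can be estimated with the standard free-space path loss formula, 20 log10(4πdf/c). The sketch below only covers this ideal case and says nothing about reflection, diffraction, or scattering. The function name and the example link (1 km at 2.4 GHz) are illustrative assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def free_space_path_loss_db(distance_m, frequency_hz):
    """Free-space path loss in dB between isotropic antennas."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT)

# Hypothetical link: 1 km at 2.4 GHz gives a loss of roughly 100 dB.
print(round(free_space_path_loss_db(1000.0, 2.4e9), 2))
# Doubling the distance adds about 6 dB.
print(round(free_space_path_loss_db(2000.0, 2.4e9) - free_space_path_loss_db(1000.0, 2.4e9), 2))
```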

Development Trends of Wireless Communication
Wireless communication technology has the following development trends:
(1) Seamless network coverage, that is, users can access the network at any time and in any place.
(2) Broadband is an inevitable trend in the development of future communications. Narrow-band, low-speed networks will gradually be replaced by broadband networks.
(3) Convergence has accelerated significantly, including technology convergence, network convergence, and business convergence.
(4) The data rate is getting higher, the spectrum bandwidth is getting wider, the frequency band is getting higher, and the coverage distance is getting shorter.
(5) Terminals are becoming more and more intelligent, creating the conditions and means for providing various new services.
(6) Development proceeds from two directions: ① Mobile networks add data services: the emergence of technologies such as 1xEV-DO and HSDPA has gradually increased the data rate of mobile networks; they are superimposed on the original mobile network, so coverage can be continuous. In addition, the emergence of WiMAX has accelerated the development of new 3G enhancement technologies. ② Fixed data services gain mobility: the emergence of technologies such as WLAN has increased data rates, the coverage of fixed networks has gradually expanded, and mobility has gradually increased. The success of mobile communications, broadband services, and WiFi has contributed to the birth of a variety of broadband wireless access technologies such as 802.16/WiMAX.

The Development of Deep Learning
To discuss deep learning, it is natural to begin with the concept of machine learning. Machine learning refers to "the behavior of a computer to automatically improve the system's own performance using experience"; that is, by learning the inherent characteristics and hidden information in data, the computer acquires new experience and knowledge that enable it to make decisions or judgments like humans [5]. Machine learning has been developing for more than 70 years and has experienced many peaks and troughs. Today, with the rapid development of computer hardware, computing power has greatly increased, a new boom in artificial intelligence has arrived, and its representative is the deep learning algorithm. Deep learning is by far the computer learning method closest to the cognition of the human brain. Drawing on the multi-layered structure of human brain neurons, their interconnection and interaction, distributed sparse storage and representation, and layer-by-layer analysis and processing of information, it has made great breakthroughs in many application areas such as speech and image recognition. The main feature of deep learning is the construction of deep artificial neural networks.
Warren McCulloch and Walter Pitts proposed the neural network structure in 1943, and machine learning originating from neural networks has developed ever since. In 1950, the great computer scientist Alan Turing proposed the famous "Turing test", which led many scholars in the computer field to gradually focus on computer-based artificial intelligence. In 1957, Cornell University professor Frank Rosenblatt proposed the perceptron, the first computer neural network designed by humans. In 1962, a special neural structure that reduces computational complexity was discovered in the cat's cerebral cortex, and a biological vision model came into being; this discovery laid the foundation for later convolutional neural networks. In 1969, Marvin Minsky and Seymour Papert, two leaders in artificial intelligence, published the book "Perceptrons". The theory proposed in this book on the basic ideas of machine learning, namely the problem-solving ability and computational complexity of algorithms, still has far-reaching influence. In 1986, Rumelhart, Hinton, and Williams published the famous back propagation (BP) algorithm in the journal Nature, which established the algorithmic basis for updating the parameters of a neural network model and greatly reduced the amount of calculation required by the optimization problem. To this day, deep neural networks are trained by updating the model parameters with the BP algorithm. In 1989, Yann LeCun, a scientist at Bell Laboratories in the United States, proposed the LeNet computational model, the forerunner of LeNet-5 and one of the earliest convolutional neural network models. At the same time, an efficient training method based on the BP algorithm was derived, and a handwritten digit recognition network was trained successfully. The network was among the first multi-layer neural networks of its kind to be trained successfully, and it remains one of the most common and widely used models in deep learning.
From the 1990s onward, statistics-based machine learning algorithms such as logistic regression, naive Bayes, and the support vector machine (SVM) emerged one after another. These algorithms are supported by solid mathematical theory and are easy to derive and implement, and their models readily learn internal features from training samples according to the formulated algorithm. They achieved good results in the field of artificial intelligence, and many of them are still in use today [6]. In 2006, Geoffrey Hinton and Ruslan Salakhutdinov published an article that proposed a deep learning model. Its main arguments are that a neural network with multiple hidden layers has better feature learning capability than earlier shallow networks, and that the difficulty of training such a network can be overcome by initializing the weights layer by layer before optimizing the whole network. In 2012, Hinton's team won the ImageNet competition with a deep learning network model, which marked the entry of deep learning into a new era. As many artificial intelligence scholars have taken up the study of deep learning, it has been applied in recent years to many fields, such as image, text, sound, natural language, and behavior recognition. It has achieved remarkable results and is still developing faster than expected. In 2015, three leading figures in the field of deep learning jointly published the article "Deep Learning" in the journal Nature, which further explained the principles, status quo, applications, and future prospects of deep learning. Since then, deep learning has gradually entered the public's vision and has grown rapidly.

Research Status of UAV Path Planning
At present, many countries have established dedicated drone research institutions, conducted in-depth research on drone path planning, and achieved fruitful results. The path planning of drones involves flight mechanics, operations research, automatic control, navigation, image processing, and many other disciplines; it is highly comprehensive and complex, and its models are difficult to build [7]. Most studies therefore simplify the problem model and study it hierarchically. Generally, the path planning of drones can be divided into the following two layers:
1) Offline path planning, or global static path planning, that is, path planning carried out before flight on the basis of known environmental information, considering only static obstacles in the planning space and combining the established goals and constraints. Offline path planning places low real-time demands on the algorithm, and the focus is on obtaining the optimal path.
2) Online path planning, or dynamic path planning, that is, when unexpected conditions are encountered during flight and the originally planned path is no longer applicable, the flight path is re-planned in real time according to the mission requirements and evasion strategies, with the help of joint sensor systems, logical control systems, and so on. Online path planning must respond rapidly to unexpected situations and therefore places high demands on the timeliness of the algorithm. At the same time, the re-planned path should stay as close to the optimum as possible while avoiding obstacles.
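The relationship between the two layers can be illustrated with a small grid example: a path is planned once on the known static map, and when a new obstacle is detected on the next waypoint the path is re-planned from the current position. Breadth-first search is used here only as a simple stand-in planner; the grid, the obstacle positions, and the function name `plan` are all assumptions for illustration.

```python
from collections import deque

def plan(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) / 1 (obstacle) cells.

    Returns the list of cells from start to goal, or None if no path exists.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None

# Offline layer: plan on the known static map before flight.
grid = [[0] * 4 for _ in range(4)]
grid[1][1] = 1  # known static obstacle
goal = (3, 3)
offline_path = plan(grid, (0, 0), goal)

# Online layer: the drone has reached the third cell of the path when a
# new obstacle is detected on the next waypoint, so it re-plans from there.
position = offline_path[2]
blocked = offline_path[3]
grid[blocked[0]][blocked[1]] = 1
online_path = plan(grid, position, goal)

print("offline:", offline_path)
print("online: ", online_path)
```

The offline call can afford an exhaustive search of the map, whereas the online call is the one that must finish within the drone's reaction time, which is why the two layers place different demands on the algorithm.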
Algorithms are the core of path planning research. An excellent path planning algorithm not only ensures that the global path is optimal, but also speeds up path re-planning when sudden conditions are encountered, thereby improving overall efficiency [8]. Commonly used path planning algorithms include genetic algorithms, probabilistic roadmaps, particle swarm optimization, artificial potential field methods, etc.
Among them, genetic algorithms (GA) imitate the natural law of survival of the fittest and reach an approximately optimal solution through repeated iterations of selection, crossover, and mutation. The algorithm has a simple process and many advantages: it can perform fast random searches, does not require detailed knowledge of the problem, is robust, scales well, and can easily be combined with other algorithms. Great progress has been made in UAV path planning based on genetic algorithms, but some shortcomings remain, such as slow convergence and premature convergence to a local optimum. In addition, because genetic algorithms involve the selection of multiple parameters, the convergence behavior varies, and if these parameters are not selected properly the optimal solution will not be obtained. For example, the fitness function must account not only for the flight cost of a path but also for its feasibility; if it is chosen poorly, the algorithm may converge to a local optimum rather than the global optimum. Some studies have proposed a dual genetic algorithm mechanism, which uses different fitness functions to pursue two evolutionary goals and plans paths for static and dynamic threats in the environment, so that the optimal path can be found while threats are avoided. Ma Yunhong applies genetic algorithms to the path planning problem and describes the locations of path points and threats in polar coordinates, which improves the efficiency of path planning.
Kavraki et al. proposed the probabilistic roadmap method, which generates path points and constructs a connected graph in free space, transforming the solution space of the path planning problem into a topological space. The complexity of the problem is not related to the complexity of the environment or the dimension of the planning space, but depends mainly on the complexity of the path search. The disadvantage of this method is that its efficiency becomes low when obstacles are dense or narrow channels exist in the planning space [9]. In addition, because the method samples the landmark nodes randomly when constructing the connected graph, the final path can easily deviate from the optimum.
The literature also describes the particle swarm optimization (PSO) algorithm in detail, which drives the whole population forward by simulating the social behavior of information sharing within biological populations. The initial population of the PSO algorithm is generated randomly, and by adjusting the particles according to their fitness values the algorithm can move toward the optimal solution relatively quickly.
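The particle swarm idea described above can be sketched for a toy path planning problem: one intermediate waypoint is placed between a start point and a goal so that the path is short but stays outside a circular threat zone. This is an illustrative sketch only; the scenario, the penalty weight, the inertia and learning coefficients, and all names are assumptions rather than values taken from the cited literature.

```python
import math
import random

START, GOAL = (0.0, 0.0), (10.0, 0.0)
THREAT, RADIUS = (5.0, 0.0), 2.0  # hypothetical circular threat zone

def segment_points(a, b, n=20):
    """Evenly spaced sample points on the segment from a to b."""
    return [(a[0] + (b[0] - a[0]) * k / n, a[1] + (b[1] - a[1]) * k / n)
            for k in range(n + 1)]

def path_cost(waypoint):
    """Path length plus a penalty for every sample point inside the threat."""
    legs = [(START, waypoint), (waypoint, GOAL)]
    length = sum(math.dist(a, b) for a, b in legs)
    intrusion = sum(max(0.0, RADIUS - math.dist(p, THREAT))
                    for a, b in legs for p in segment_points(a, b))
    return length + 5.0 * intrusion

def pso(cost, dim, bounds, n_particles=20, iterations=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial population with zero initial velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward the shared best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost

best_waypoint, best_cost = pso(path_cost, dim=2, bounds=(-5.0, 15.0))
print(best_waypoint, best_cost)
```

The shared best position `gbest` is the information-sharing mechanism: every particle is pulled toward it as well as toward its own best position. The cost function also shows why the choice of fitness matters, since it has to weigh path length against feasibility.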
Khatib first proposed the artificial potential field (APF) method in 1986, which solves the obstacle avoidance problem of robots by artificially setting an attractive (gravitational) field and a repulsive field. The basic idea is to abstract the motion of the drone in the task environment as motion in an artificial field: a gravitational field is constructed for the target point, so that it exerts "gravity" on the drone, and each threat area has a repulsive field, which exerts a "repulsive force" on the drone. The resultant force guides the drone's flight. This method achieves fast control, so it is widely used for real-time motion control. The path planned by APF is safe and smooth, but when a threat is close to the target area the repulsive force exceeds the gravity and the drone cannot reach the target position; the method may also fall into a local optimum [10].
At present, research on the path planning of a single drone has matured. However, with the development of science and technology, the exploration of various unknown fields continues to deepen and the problems grow rapidly, so a single drone can no longer meet the requirements of increasingly complex tasks, and multiple drones are often required to work together to improve the efficiency of task execution. For collaborative path planning, the literature reduces the problem dimension to solve the multi-drone collaborative path planning problem. Specifically, the whole problem is divided into three levels, which simplifies the planning problem and reduces the amount of calculation. The literature also proposes a path-balanced ant colony planning algorithm, which uses blocking factors to resolve path collision constraints in spatial collaboration and introduces a variable collaborative range to resolve distance differences in temporal collaboration, thereby planning collaborative flight paths for drone formations. Ye Yuanyuan proposed a co-evolution method to achieve coordinated path planning for drone formations; this method maps paths to individuals in evolutionary computing, builds the problem model on co-evolutionary computing, and focuses on issues such as the design of individual fitness. All of the above methods can achieve coordinated path planning of formations under certain conditions, but each has certain defects: some designs are complex and computationally expensive, and sometimes cannot meet the time requirements; some consider only the avoidance of fixed obstacles and cannot meet the planning requirements of a dynamic environment; some consider only multi-drone single-target collaborative path planning and do not consider multi-drone multi-target factors together. Sometimes, in order to achieve a surprise attack or to disperse hostile fire, the choice of attack direction for each drone must also be considered.
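The artificial potential field method discussed earlier in this section can be sketched in two dimensions as follows. The force expressions are the commonly used textbook forms (an attractive force proportional to the distance to the target and a repulsive force that acts only within an influence radius of each threat). The gains, the influence radius, the fixed step length, and the scenario are assumptions for illustration.

```python
import math

def apf_path(start, goal, obstacles, k_att=1.0, k_rep=5.0, influence=2.0,
             step=0.1, tolerance=0.2, max_steps=500):
    """Follow the resultant of attractive and repulsive forces in fixed steps."""
    pos = start
    path = [pos]
    for _ in range(max_steps):
        if math.dist(pos, goal) < tolerance:
            break
        # Attractive force ("gravity") pulls the drone toward the target.
        fx = k_att * (goal[0] - pos[0])
        fy = k_att * (goal[1] - pos[1])
        # Each threat pushes the drone away while it is inside the influence radius.
        for obs in obstacles:
            d = math.dist(pos, obs)
            if 0.0 < d < influence:
                mag = k_rep * (1.0 / d - 1.0 / influence) / d ** 2
                fx += mag * (pos[0] - obs[0]) / d
                fy += mag * (pos[1] - obs[1]) / d
        norm = math.hypot(fx, fy)
        if norm == 0.0:
            break  # forces cancel exactly: the drone is stuck in a local minimum
        pos = (pos[0] + step * fx / norm, pos[1] + step * fy / norm)
        path.append(pos)
    return path

goal = (10.0, 10.0)
threat = (5.0, 6.0)  # placed slightly off the straight line from start to goal
path = apf_path((0.0, 0.0), goal, [threat])
print(len(path), path[-1])
```

If the threat were placed exactly on the straight line between the start and the target, or very close to the target, the two forces could cancel or the repulsion could dominate, which is the failure mode described in the text.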

UAV Path Planning Based on Deep Learning
This article considers the design of an orthogonal frequency division multiple access (OFDMA) cellular drone system. There are n drones in the system performing real-time sensing tasks. Each drone must acquire data at its sensing location, and the collected data is transmitted back to the user's device. Each drone completes data transmission as follows: the drone first transmits the data to the base station, and the base station then transmits the data to the mobile device. Transmission is assumed to take place in units of frames. It is also assumed that there are C orthogonal subchannels in the system, which are allocated by the base station. In this paper, an exponential model is used to evaluate the sensing quality of the drone, that is, the probability of successful sensing decreases exponentially with the distance between the drone and the sensing task.
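The exponential sensing model can be written as a one-line function. The article does not state the decay constant, so the parameter `decay` and its default value, like the function name, are assumptions for illustration.

```python
import math

def sensing_success_probability(drone_pos, task_pos, decay=0.01):
    """Probability of successful sensing, decaying exponentially with distance.

    `decay` (per unit distance) is a hypothetical parameter, not a value
    given in the article.
    """
    distance = math.dist(drone_pos, task_pos)
    return math.exp(-decay * distance)

task = (0.0, 0.0, 0.0)
for height in (0.0, 50.0, 100.0, 200.0):
    print(height, round(sensing_success_probability((0.0, 0.0, height), task), 4))
```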
In order to study the path planning of drones, it is first necessary to coordinate the sensing and transmission processes of multiple drones. To this end, this paper designs a "perception-transmission" protocol and analyzes it using a nested Markov chain.
It is assumed that the sensing and transmission of each drone are performed as a series of cycles. A cycle lasts Tc frames and consists of a sensing part and a transmission part, whose lengths are Ts and Tu frames, respectively. The drone senses in every frame of the sensing part, and sensing in a cycle is judged successful only if all Ts frames are sensed successfully. At the beginning of the transmission part, the base station allocates the C available subchannels to the C drones with the highest transmission probability. In each frame of the transmission part, a drone is in one of four states: no channel assigned, transmission failure, transmission success, or idle. The sensing and transmission performance of the drone under the "perception-transmission" protocol is analyzed with a nested Markov chain. A drone has two sensing states: sensing success Hs and sensing failure Hf. Assuming that the moving direction and speed of the drone are constant within a cycle, the probability of successful sensing in that cycle can be obtained from the start and end positions of the drone in the cycle.
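The two rules of the cycle described above, success only if all Ts sensing frames succeed and subchannels given to the C drones with the highest transmission probability, can be sketched as follows. The sketch makes several assumptions that the article does not state: the frames are treated as independent, the per-frame probability follows an exponential model with a hypothetical constant `decay`, and the drone's position is evaluated at the end of each frame. All names are illustrative.

```python
import math

def frame_positions(start, end, cycle_frames, sensing_frames):
    """Positions at the end of each sensing frame under constant velocity."""
    return [tuple(s + (e - s) * k / cycle_frames for s, e in zip(start, end))
            for k in range(1, sensing_frames + 1)]

def cycle_sensing_success(start, end, task, cycle_frames, sensing_frames, decay=0.01):
    """Probability that all sensing frames of one cycle succeed.

    Assumes independent frames, so the per-frame probabilities multiply.
    """
    prob = 1.0
    for pos in frame_positions(start, end, cycle_frames, sensing_frames):
        prob *= math.exp(-decay * math.dist(pos, task))
    return prob

def allocate_subchannels(transmission_prob, n_subchannels):
    """Drones that receive a subchannel: those with the highest probabilities."""
    ranked = sorted(transmission_prob, key=transmission_prob.get, reverse=True)
    return set(ranked[:n_subchannels])

task = (0.0, 0.0, 0.0)
hover = (0.0, 0.0, 100.0)
# A drone hovering 100 units above the task, with Ts = 2 and Tc = 5.
print(cycle_sensing_success(hover, hover, task, cycle_frames=5, sensing_frames=2))
# Three drones compete for C = 2 subchannels (the probabilities are made up).
print(allocate_subchannels({"uav1": 0.9, "uav2": 0.4, "uav3": 0.7}, 2))
```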
In order to design the trajectory of the drone, the space must be discretized. In this article, the space is discretized into a grid model. A drone located at a given grid position in the current cycle has at most 27 possible positions in the next cycle. Let St denote the position of the drone in the t-th cycle, and let A(St) denote the set of positions to which the drone can move from St. In addition, define the utility of the drone in the t-th cycle as the probability that its transmission is effective during that cycle, denoted by r. This article assumes that each drone chooses the trajectory with the maximum utility, so the trajectory design problem can be written as choosing the next position from A(St) in every cycle so that the utility obtained along the trajectory is maximized. Each drone is regarded as an agent, and everything other than the drone is regarded as the environment. At the beginning of every cycle, each drone observes the positions of all drones at the current moment and then determines its own flight trajectory according to its own strategy. After each drone takes its action, it obtains the utility of this cycle and observes the positions of all drones at the next moment.
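The agent and environment loop described above can be sketched with tabular Q-learning, which is used here only as a simple stand-in for the learning method and is not claimed to be the article's algorithm. The sketch assumes a three-dimensional grid in which the drone may stay in place or move to any adjacent grid point, which gives at most 3 × 3 × 3 = 27 candidates. It also simplifies the state to the drone's own position, whereas the text has each drone observe the positions of all drones. The learning rate, the discount factor, and all names are assumptions.

```python
import itertools
import random

def candidate_positions(pos, grid_size):
    """A(S_t): the current grid point and its in-bounds neighbours (at most 27)."""
    result = []
    for step in itertools.product((-1, 0, 1), repeat=3):
        nxt = tuple(p + s for p, s in zip(pos, step))
        if all(0 <= c < n for c, n in zip(nxt, grid_size)):
            result.append(nxt)
    return result

def choose_position(q_table, pos, grid_size, epsilon, rng):
    """Epsilon-greedy choice of the next position from A(S_t)."""
    options = candidate_positions(pos, grid_size)
    if rng.random() < epsilon:
        return rng.choice(options)
    return max(options, key=lambda nxt: q_table.get((pos, nxt), 0.0))

def q_update(q_table, pos, nxt, reward, grid_size, alpha=0.1, gamma=0.9):
    """One Q-learning update after observing the utility of the cycle."""
    best_next = max(q_table.get((nxt, a), 0.0)
                    for a in candidate_positions(nxt, grid_size))
    old = q_table.get((pos, nxt), 0.0)
    q_table[(pos, nxt)] = old + alpha * (reward + gamma * best_next - old)

grid_size = (3, 3, 3)
q_table = {}
rng = random.Random(0)

print(len(candidate_positions((1, 1, 1), grid_size)))  # interior grid point
print(len(candidate_positions((0, 0, 0), grid_size)))  # corner grid point

# One illustrative cycle: move, receive a made-up utility r = 1.0, and update.
q_update(q_table, (0, 0, 0), (1, 1, 1), reward=1.0, grid_size=grid_size)
print(choose_position(q_table, (0, 0, 0), grid_size, epsilon=0.0, rng=rng))
```

A deep reinforcement learning variant would replace the table `q_table` with a neural network that maps the observed positions of all drones to action values, but the cycle of observing, choosing from A(St), receiving r, and updating is the same.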

Conclusion
Although algorithms such as deep imitation learning and deep reinforcement learning have achieved initial results in the field of UAV path planning, they are still far from meeting the standards for practical application. The challenges include security, robustness, and scalability. For example, ensuring robustness is an urgent problem, and designing algorithms that can adapt to different weather and lighting conditions remains difficult. Methods based on deep reinforcement learning also face scalability issues. Deep reinforcement learning usually relies on simulators to build experimental scenarios and to train models. However, because simulators differ from real environments, how to transfer models trained in a simulator to the actual environment needs further research.