Research on Voiceprint Recognition Based on the BP-GA Algorithm

To improve the performance of voiceprint recognition systems, this paper proposes applying a combined BP neural network and genetic algorithm (BP-GA) to voiceprint recognition. Training the network with a genetic algorithm overcomes the tendency of the traditional multi-layer artificial neural network to fall into local minima. The experimental results show that, compared with traditional recognition algorithms (LPCC, MFCC, etc.), the BP-GA algorithm offers faster recognition, a higher recognition rate, a lower error rate, automatic error correction, and robustness across different speakers.


Introduction
With the rapid development of network technology, information security is becoming more and more important. Conventional password authentication has revealed shortcomings when used over information networks. Biometric technology, by contrast, has become increasingly mature and has shown its superiority in practical applications. Among biometric methods, voiceprint recognition is a recognition technology developed in recent years. Compared with other biometrics, it has many advantages: it is simple, accurate, economical, and requires no physical contact.
This paper proposes the BP-GA algorithm for voiceprint identification. Compared with traditional recognition algorithms, it offers fast recognition, a high recognition rate, a low error rate, automatic error correction, and robustness across different speakers.

Vector processing for voice wavelet transform
Because the BP-GA recognition algorithm operates on digital variables, the voice signal must first be preprocessed into a digital feature vector. Wavelet analysis is a time-frequency analysis method with variable resolution; it captures non-stationary, transient changes accurately and is resistant to noise interference. It reflects the dynamic information of the voice signal well, is simple to implement, and has low computational complexity. It both reflects the auditory characteristics of the human ear and accurately captures the dynamic characteristics of the voice signal, which improves the final voiceprint recognition rate.
For any one-dimensional voice signal f(t), the continuous wavelet transform (CWT) is defined as the inner product of the signal and the wavelet basis function, that is, in standard notation,

$$W_f(a,b) = \langle f, \psi_{a,b} \rangle = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} f(t)\, \overline{\psi\!\left(\frac{t-b}{a}\right)}\, dt,$$

where a is the scale parameter, b is the translation parameter, and $\psi_{a,b}(t) = |a|^{-1/2}\,\psi\big((t-b)/a\big)$ is the wavelet basis function.

As with the short-time Fourier transform, the original signal can be recovered from its wavelet transform; this is called inverse wavelet transform reconstruction. The inversion formula can be expressed as:

$$f(t) = \frac{1}{C_\psi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{1}{a^{2}}\, W_f(a,b)\, \psi_{a,b}(t)\, da\, db,$$

where $C_\psi$ is the admissibility constant of the wavelet.

A one-dimensional voice signal containing noise is assumed to have the following form:

$$s(i) = f(i) + e(i),$$

where f(i) is the true, slowly varying voice signal, e(i) is Gaussian white noise or another noise signal, and s(i) is the observed signal, which contains both the noise and the useful low-frequency signal to be extracted. The purpose of signal extraction is to separate the useful component from the noisy signal, that is, to recover the slowly varying signal f(i) from s(i).

In practice, the voice signal usually appears as a relatively stable, low-frequency signal, while the noise usually appears as a higher-frequency signal. Therefore, we choose a wavelet basis, determine a wavelet decomposition level N, and decompose the signal s into N levels. Considering the actual conditions and the amount of computation, we use N = 3, as shown in Figure 1.
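The three-level decomposition described above can be illustrated with a minimal sketch. The paper does not name the wavelet basis or the rule used to suppress the high-frequency coefficients, so the Haar basis and the simple "zero every detail band" rule used here are assumptions chosen for brevity, as are all function names and the synthetic ramp signal.

```python
import math
import random

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the two recovered samples."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def decompose(signal, levels=3):
    """Decompose into `levels` levels; details[0] is the finest band."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

def reconstruct(approx, details):
    """Rebuild the signal from the coarsest band back to the finest."""
    for d in reversed(details):
        approx = haar_inverse_step(approx, d)
    return approx

def denoise(signal, levels=3):
    """Assumed rule: keep the level-N approximation, zero all detail bands."""
    approx, details = decompose(signal, levels)
    return reconstruct(approx, [[0.0] * len(d) for d in details])

def mse(x, y):
    return sum((u - v) ** 2 for u, v in zip(x, y)) / len(x)

# Synthetic example: s(i) = f(i) + e(i) with a slow ramp and Gaussian noise.
rng = random.Random(0)
clean = [0.01 * i for i in range(64)]
noisy = [v + rng.gauss(0.0, 0.1) for v in clean]
denoised = denoise(noisy, levels=3)
```

With N = 3, each output sample is the average of a block of eight inputs, which removes most of the high-frequency noise while following the slow trend. A practical system would more likely threshold the detail coefficients than discard them outright.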

BP-GA Algorithm
The BP-GA algorithm is a recently developed identification method with high efficiency and strong recognition ability [3]. The genetic algorithm neither depends on gradient information nor requires the objective function to be continuous, and may not even need an explicit expression for the objective function. Combining an artificial neural network with a genetic algorithm both addresses the low efficiency and long training time of the BP neural network and exploits the global search capability of the genetic algorithm, making it an effective and feasible identification method. The specific realization of the process is shown in Figure 2.
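The benefit of pairing a global, gradient-free search with a local, gradient-based one can be seen in a toy sketch. This is not the procedure of Figure 2: a plain random population stands in for the genetic algorithm's global search (no crossover or mutation), gradient descent stands in for BP training, and the one-dimensional objective `f` is a hypothetical multimodal function chosen only for illustration.

```python
import math
import random

def f(x):
    """Hypothetical multimodal objective with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)

def grad(x):
    """Analytic derivative of f."""
    return 2.0 * x + 30.0 * math.cos(3.0 * x)

def gradient_descent(x, lr=0.01, steps=2000):
    """Local, gradient-based refinement (stand-in for BP training)."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def global_then_local(rng, pop_size=50, lo=-5.0, hi=5.0):
    """Pick the best member of a random population, then refine it locally."""
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    best = min(population, key=f)  # needs only function values, no gradient
    return gradient_descent(best)

rng = random.Random(0)
x_local = gradient_descent(4.0)   # gradient descent alone, from a poor start
x_hybrid = global_then_local(rng)  # population search first, then descent
```

Started from x = 4, gradient descent alone settles into a nearby local minimum, whereas the population step first selects a better basin and only then refines the candidate.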

BP neural network construction for voiceprint recognition
A BP multi-layer feed-forward neural network is divided into several layers arranged in sequence. The neurons in layer i receive signals only from the neurons in layer (i-1), and there is no feedback between neurons in any layer. When a vector x is input to a feed-forward network, a vector y is output after it passes through the network, so the network can be regarded as a converter that performs the mapping from x to y. In this application, the paper uses a 3-layer BP network, and the size of the first layer should be adjusted to the input vector in the actual situation. The first layer is the normalization layer, whose input vector is [Q1, Q2, Q3, Q4]. The second layer is the BP network input layer, which has five nodes. The third layer is the output layer, which has one node and uses an S-type (sigmoid) function as its output characteristic function; the output value is the credibility of the voiceprint identification, a continuous number in the interval (0, 1). Here $\omega^{1}_{ij}$ denotes the input-layer weights (i = 1, 2, 3; j = 1, 2, 3).
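The layer structure described above (a normalization layer for [Q1, Q2, Q3, Q4], five nodes in the second layer, and one sigmoid output) can be sketched as a forward pass. The min-max normalization, the tanh activation in the five-node layer, the bias terms, and the random weights are all assumptions made for illustration; the paper's own node equations are not reproduced here.

```python
import math
import random

def sigmoid(z):
    """S-type function; output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def normalize(q):
    """Assumed normalization: min-max scaling of the features to [0, 1]."""
    lo, hi = min(q), max(q)
    if hi == lo:
        return [0.0] * len(q)
    return [(v - lo) / (hi - lo) for v in q]

def forward(q, w1, b1, w2, b2):
    """Map a 4-element feature vector to a credibility score in (0, 1)."""
    x = normalize(q)
    # Five-node layer: weighted sum of the inputs, then tanh (assumed).
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    # Single output node with the S-type function.
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)

# Hypothetical untrained weights, only to exercise the function.
rng = random.Random(1)
w1 = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(5)]
b1 = [rng.uniform(-1.0, 1.0) for _ in range(5)]
w2 = [rng.uniform(-1.0, 1.0) for _ in range(5)]
b2 = rng.uniform(-1.0, 1.0)
credibility = forward([0.2, 1.5, 0.7, 3.1], w1, b1, w2, b2)
```

In the BP-GA setting, the weights `w1`, `b1`, `w2`, and `b2` are what training has to determine; the forward pass itself is the same whichever optimizer supplies them.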

Design for Genetic Algorithms
The genetic algorithm includes five basic elements: coding, initial population, fitness function, genetic operations, and parameter control and termination rules. Coding is the bridge that connects the problem to the algorithm. To facilitate genetic search in a large space and improve the accuracy of the algorithm, a floating-point encoding method is adopted. Each individual in the initial population consists of a 40-bit binary string, each bit of which is generated as follows: a random number in (0, 1) is generated; if this number is greater than 0.5, the bit is 1, otherwise it is 0. The fitness function follows the conventional penalty-function idea from optimization methods. Since the problem in this paper is one of error minimization, the fitness function is formulated in terms of the error. The genetic operations consist of four parts: selection, crossover, mutation, and population updating.
(2) Crossover: A uniform crossover method is used, with the crossover mask generated at random.
(3) Mutation: Mutation bits are selected at random and their values are flipped.
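The bit-string initialization, uniform crossover, and bit-flip mutation described above can be sketched as follows. The function names are hypothetical, and applying the mutation probability independently to each bit is an assumption; the text only states that the mutation bits are chosen at random.

```python
import random

BITS = 40  # length of each individual's binary string

def random_individual(rng, bits=BITS):
    """Each bit is 1 if a uniform random number exceeds 0.5, otherwise 0."""
    return [1 if rng.random() > 0.5 else 0 for _ in range(bits)]

def uniform_crossover(p1, p2, rng):
    """Swap genes between two parents according to a random mask."""
    mask = [rng.random() < 0.5 for _ in p1]
    c1 = [a if m else b for a, b, m in zip(p1, p2, mask)]
    c2 = [b if m else a for a, b, m in zip(p1, p2, mask)]
    return c1, c2

def mutate(individual, rng, rate=0.032):
    """Flip each bit independently with probability `rate` (assumed per bit)."""
    return [1 - g if rng.random() < rate else g for g in individual]

rng = random.Random(42)
parent_a = random_individual(rng)
parent_b = random_individual(rng)
child_a, child_b = uniform_crossover(parent_a, parent_b, rng)
mutant = mutate(child_a, rng)
```

Uniform crossover keeps every gene of both parents in exactly one of the two children, so no bit values are lost or duplicated at any position.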
This paper uses two termination rules. If the difference between the maximum and minimum values of the objective function within the population is less than the given accuracy of 1e-4, the algorithm is considered to have converged and the program terminates. Otherwise, the maximum number of generations is set to 100, and the program terminates when the number of iterations reaches this value.
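The two termination rules can be sketched as a small check inside a generic evolution loop. The names `stop_reason`, `run`, and `toward_best` are hypothetical, and the update step used in the example is a toy stand-in for selection, crossover, and mutation, included only so that the loop has something to iterate.

```python
def stop_reason(objective_values, generation, tol=1e-4, max_generations=100):
    """Return why the search should stop, or None if it should continue."""
    if max(objective_values) - min(objective_values) < tol:
        return "converged"
    if generation >= max_generations:
        return "max_generations"
    return None

def run(population, objective, step, tol=1e-4, max_generations=100):
    """Generic loop: evaluate, test both termination rules, then update."""
    generation = 0
    while True:
        values = [objective(x) for x in population]
        reason = stop_reason(values, generation, tol, max_generations)
        if reason is not None:
            return population, generation, reason
        population = step(population, values)
        generation += 1

def toward_best(population, values):
    """Toy update: move every individual halfway toward the current best."""
    best = population[values.index(min(values))]
    return [(x + best) / 2.0 for x in population]

final_pop, generations, reason = run(
    [1.0, -2.0, 3.0, 0.5], lambda x: x * x, toward_best
)
```

Checking the spread of the objective values first means a converged population stops early, while the generation cap guarantees termination when the spread never falls below the tolerance.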

Experimental Results
In the comparative experiments, the proposed recognition algorithm and a plain BP neural network were each applied to the feature vectors of 100 speakers' voices from the voice database. The target output is 1 for the true speaker and 0 for an impostor, and an identification criterion of 0.5 is used during training. The genetic algorithm program runs in the MATLAB 7.0 environment with an initial population of 100, a crossover probability of 0.51, and a mutation probability of 0.032. In the BP neural network, the hidden layer uses the tansig transfer function, the output layer uses the satlin transfer function, and a single output is used in order to improve training speed. The computer configuration is: Pentium M CPU clocked at 1.4 GHz, 512 MB of memory, Windows XP.
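For readers working outside MATLAB, the two transfer functions and the 0.5 criterion can be written out as a short sketch. `tansig` and `satlin` follow the definitions of the MATLAB functions of the same names; the helper `accept` and the choice to treat a score of exactly 0.5 as an acceptance are assumptions.

```python
import math

def tansig(x):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2x)) - 1, which equals tanh(x)."""
    return math.tanh(x)

def satlin(x):
    """Saturating linear function: 0 below 0, x on [0, 1], 1 above 1."""
    return min(1.0, max(0.0, x))

def accept(score, threshold=0.5):
    """Assumed decision rule: claim the target speaker at or above threshold."""
    return score >= threshold

# Target outputs in training: 1 for the true speaker, 0 for an impostor.
decisions = [accept(satlin(s)) for s in (-0.3, 0.2, 0.7, 1.4)]
```

Because satlin clips the output to [0, 1], the network's score can be compared directly with the targets 0 and 1 and with the 0.5 criterion.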

Figure 3 BP neural network training results
Calculated with the MATLAB neural network tools, the final relative recognition error rate of the trained BP neural network is about 1.6%, as shown in Fig. 3. The traditional LPCC recognition algorithm was selected for comparison, and the detailed test results are shown in Table 1.

Conclusion
The BP neural network has a degree of intelligence: it can learn from its errors and improve its recognition performance. The genetic algorithm can improve the learning speed and convergence rate of the BP neural network and shorten the time spent on training. Compared with traditional recognition algorithms (LPCC, MFCC, etc.), the BP-GA algorithm offers fast recognition, a high recognition rate, a low error rate, automatic error correction, and robustness across different speakers.