An Improved Meanshift Tracking Algorithm Using Adaptive Quantization Step in Color Space

The traditional meanshift-based tracking algorithm uses a constant quantization step to generate features in the color space, so it cannot adjust the quantization step as the target geometry changes in order to improve computational efficiency in large depth-of-field scenarios. Building on the traditional meanshift algorithm, we propose a tracking algorithm with an adaptive quantization step. When the target geometry changes, it automatically adjusts the quantization step of the color histogram and uses the dynamic time warping algorithm to match features of different dimensions, thereby reducing the average frame processing time. Comparative experiments under multiple scenarios demonstrate that the proposed algorithm adaptively adjusts the quantization step of the color histogram in large depth-of-field scenarios and improves operating efficiency.


I. INTRODUCTION
As an integral part of widely used intelligent video systems [1][2][3][4][5], target tracking is the basis for various high-level intelligent analyses and processes in these systems. Tracking a moving target means using effective features of the target and an appropriate matching algorithm to find the regions in the image sequence that are most similar to the original target. In practical applications, a tracking algorithm provides not only the target's trajectory and accurate position, but also useful information for behavior understanding and decision-making through analysis of the target's moving speed and direction. Nam and Han proposed a visual tracking algorithm based on representations from a discriminatively trained convolutional neural network [6]. However, limited by computational capability, few systems can continuously perform more advanced processing, such as real-time target recognition and behavior analysis, after completing multi-channel tracking. Therefore, reducing the time complexity of the tracking algorithm in practice has long been a focus of attention.
The prototypical meanshift algorithm was originally used by Fukunaga et al. [7] in cluster analysis, soon afterwards in image processing by Cheng et al. [8], and then applied to image segmentation and target tracking by Comaniciu and Meer et al. [9][10]. The essence of the meanshift tracking algorithm is to find the local extrema of the probability density function by iterating quickly in the gradient-ascent direction, which generates the mean shift vector, and to determine the position of the target in the current frame by model matching. To reduce the computational complexity in practical applications, the search area is usually restricted so that the current search is performed only in the vicinity of the previous target position.
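The mode-seeking iteration described above can be illustrated with a minimal one-dimensional sketch. It is not the tracking algorithm itself: the Gaussian kernel, the `bandwidth` parameter, the function name `mean_shift_mode` and the sample values are all assumptions chosen for illustration.

```python
import math

def mean_shift_mode(samples, start, bandwidth=1.0, tol=1e-6, max_iter=100):
    """Climb from `start` to a local density maximum of `samples`.

    Each step replaces the current estimate with the kernel-weighted mean
    of the samples; the difference between the two is the mean shift vector.
    """
    x = float(start)
    for _ in range(max_iter):
        # Gaussian kernel weights (an assumed kernel choice).
        weights = [math.exp(-0.5 * ((s - x) / bandwidth) ** 2) for s in samples]
        total = sum(weights)
        if total == 0.0:
            break  # no sample is close enough to attract the estimate
        new_x = sum(w * s for w, s in zip(weights, samples)) / total
        converged = abs(new_x - x) < tol
        x = new_x
        if converged:
            break
    return x

# Two well-separated clusters; the result depends on the starting point,
# which mirrors restricting the search to the previous target position.
samples = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]
mode_a = mean_shift_mode(samples, start=0.0, bandwidth=0.5)
mode_b = mean_shift_mode(samples, start=6.0, bandwidth=0.5)
```

Starting near the first cluster converges to its center and starting near the second converges to the other, which is why a search restricted to the neighbourhood of the previous position finds the nearby local extremum rather than a global one.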
This paper tries to improve the operating efficiency of the meanshift-based tracking algorithm from another perspective. In the proposed algorithm, the quantization step of the color histogram is automatically adjusted when the target size changes drastically in large depth-of-field scenarios. Meanwhile, the dynamic time warping (DTW) algorithm is introduced to match features of different dimensions. The use of dynamic quantization steps enables the improved algorithm to extract target information more efficiently and reduces the computational load. The rest of this paper is organized as follows. In Section 2, an improved meanshift algorithm with adaptive quantization steps is given and DTW is introduced to match feature vectors of different dimensions. The experimental results in different scenarios are presented and discussed in Section 3; they demonstrate the efficiency improvement due to the use of adaptive quantization steps. Section 4 concludes the paper.

II. ADAPTIVE QUANTIZATION STEPS MEANSHIFT ALGORITHM IN COLOR SPACE FOR TARGET TRACKING
The meanshift algorithm is a non-parametric method that describes the distribution of pixel values with a specific kernel function and then iteratively searches for the local extrema of the resulting density estimate. The color histogram is usually employed to describe the target, and the Bhattacharyya coefficient is used to measure the similarity between the target model and the candidate model. For each frame of the video, the color histogram of the target area is calculated using the selected kernel function, and the area in the candidate region of the next frame with the greatest similarity to the target model is taken as the position of the target in that frame.
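A minimal sketch of the histogram model and the Bhattacharyya coefficient is given below. It makes several simplifying assumptions that are not taken from the paper: a single intensity channel instead of a full color space, no spatial kernel weighting, and a parameter `num_bins` that stands for the number of quantization levels. The function names and pixel values are illustrative only.

```python
import math

def quantized_histogram(values, num_bins, max_value=256):
    """Normalized histogram of integers in [0, max_value) with equal-width bins."""
    hist = [0.0] * num_bins
    for v in values:
        hist[v * num_bins // max_value] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two normalized histograms of equal length."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

# Illustrative pixel intensities (not data from the experiments).
target = [10, 12, 200, 210, 205, 15]
candidate_same = [11, 13, 201, 208, 206, 14]
candidate_other = [100, 110, 120, 130, 125, 115]

p = quantized_histogram(target, num_bins=16)
rho_same = bhattacharyya(p, quantized_histogram(candidate_same, num_bins=16))
rho_other = bhattacharyya(p, quantized_histogram(candidate_other, num_bins=16))
```

The coefficient is close to 1 for the candidate whose pixels fall into the same bins as the target and 0 for the candidate with no overlapping bins. Note that `bhattacharyya` requires equal-length histograms, which is exactly the restriction that a changing quantization step breaks.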
The traditional meanshift tracking algorithm uses constant quantization steps of the color histogram to generate both the target and candidate models, which ignores the dynamic change of color information caused by changes in the target's spatial position. Fig. 1 illustrates the positional relationships and the video images in a scenario where a person moves along the depth of field. It can be clearly seen that the moving object's discriminability and the image area it occupies are greatly reduced at position A compared with positions B and C, which leads us to consider whether the target can be expressed in a simpler way.
The key to the proposed algorithm is the use of a dynamic quantization step when generating the models. The initialization process is the same as that of the traditional meanshift tracking algorithm. After initialization, the candidate model is first extracted in the candidate area, a new target position is determined through meanshift iteration, and the tracking area is updated. Because the focus of this paper is the dynamic adjustment of the quantization step, the target-size adaptation method is simply adopted from [11]. The quantization step is adjusted once the size of the target changes drastically, as given by Eq. (1), where C and C' are weights for updating the size of the tracking window and the quantization step, k is the current quantization step, and s is the scale parameter that controls the linear adjustment of the target size. When the target size shrinks or grows to a certain extent, the quantization steps of the color histograms are adjusted. Eq. (1) indicates that if the target scale parameter becomes C times its original value, the current quantization step is adjusted by C', whose specific value depends on the scene depth and the actual geometry of the tracked object. The height and width of the target can be adjusted using the same C or with different weights. If the computational cost is to be reduced as much as possible, the quantization step is reduced in an exponential manner.
The use of a dynamic quantization step raises the problem of matching feature vectors with different dimensions in the previous and current frames. To match such features, the DTW algorithm [12] is introduced. DTW is a commonly used model matching method that maps the to-be-matched feature T nonlinearly to the same scale as the reference feature R through a specific function. Assume that the length of the feature to be matched is N and the length of the reference feature is M. The dimensions of the feature to be matched and of the reference feature are marked on the horizontal and vertical axes of a coordinate system, respectively, forming a grid. Any intersection point (xi, yj) in this grid stands for the pairing of the i-th element of T with the j-th element of R, and the cumulative cost at that point is D[i, j] (i=1…N, j=1…M). Assume that the lattice points the path passes through are (x1, y1),…, (xi, yj),…, (xN, yM). With the local cost function d(xi, yj), and considering the three possible predecessors D(i,j-1), D(i-1,j), D(i-1,j-1) of the current point, the standard DTW recurrence is $D(i,j) = d(x_i, y_j) + \min\{D(i,j-1),\, D(i-1,j),\, D(i-1,j-1)\}$. The path with the smallest cumulative cost can then be found by backtracking from the point (xN, yM) to the point (x1, y1), and this cumulative cost represents the distance between the two sequences.
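The DTW matching of two feature vectors with different lengths can be sketched as follows. This is a textbook dynamic-programming formulation rather than the authors' implementation; the absolute difference used as the local cost d, the function name `dtw_distance` and the example vectors are assumptions made for illustration.

```python
def dtw_distance(t, r):
    """Cumulative DTW cost between feature t (length N) and reference r (length M)."""
    n, m = len(t), len(r)
    inf = float("inf")
    # D[i][j] holds the smallest cumulative cost of aligning t[:i] with r[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(t[i - 1] - r[j - 1])  # assumed local cost d(xi, yj)
            # Three admissible predecessors of the current grid point.
            D[i][j] = cost + min(D[i][j - 1], D[i - 1][j], D[i - 1][j - 1])
    return D[n][m]

# Illustrative vectors: an 8-element reference and two 4-element features.
reference = [0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.25, 0.25]
coarse = [0.0, 0.5, 0.0, 0.25]      # same shape at half the resolution
unrelated = [0.25, 0.0, 0.5, 0.0]   # different shape

d_match = dtw_distance(coarse, reference)
d_other = dtw_distance(unrelated, reference)
```

Here the coarse vector aligns with the reference at zero cost because each of its elements can be warped onto two neighbouring reference elements, whereas the unrelated vector yields a strictly larger distance. The table costs O(NM) time and memory, which stays small when N and M are histogram lengths.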

III. EXPERIMENTAL RESULTS
To verify the effectiveness of the proposed method, tracking experiments were conducted in a VS2010 environment. The computer platform had an Intel quad-core 3.1 GHz CPU, 8 GB of memory and a 1 TB hard disk, and the operating system was Windows 7. The experimental results were compared with those of the traditional meanshift algorithm in different environments. The investigation focused mainly on the adaptive change of the quantization step in the proposed algorithm and on the difference in operating efficiency between the proposed and traditional algorithms. The frame number, target size and quantization step are given below each resulting image.

Fig. 2. Comparison of tracking results in experiment 1
A video clip recorded on a campus road was used in experiment 1, in which a red vehicle moved from near to far. Fig. 2 (A1)-(A4) show the tracking results of the traditional meanshift algorithm, while Fig. 2 (B1)-(B4) show the results of the proposed algorithm. The moving target is marked by the yellow frame. It can be clearly seen that the traditional meanshift algorithm cannot adaptively change the tracking window as the target size changes. As the target moves deeper along the depth of field, the ability of the traditional meanshift algorithm to depict the target with histogram features steadily weakens, so it provides only a very coarse target position. In the results of the proposed algorithm, the tracking window changes adaptively with the target and the quantization step decreases as the target size decreases, leading to a more accurate tracking result.
A sports video clip was used in experiment 2, in which a skier quickly swept past the camera. From Fig. 3 (A1)-(A4), it can be clearly seen that, because of the complex background, the traditional algorithm loses the ability to depict the target as it becomes smaller, resulting in the loss of the target. The proposed algorithm, however, updates the target size and adjusts the quantization step in time to keep the candidate model valid in each frame, thereby achieving more accurate tracking even with rapid movement against a complex background.
In addition to the visual tracking results, the efficiency improvement of the proposed algorithm was also examined. Fig. 4 shows the changes in frame processing time caused by the automatic adjustment of the quantization steps in experiment 1 and experiment 2 (the horizontal axis stands for the frame number, same hereinafter). It can be seen that the frame processing time is reduced as the quantization step is automatically reduced. Fig. 5 compares the average time consumption of the traditional meanshift algorithm and the proposed method in experiment 1 and experiment 2. In terms of operating efficiency, the improved algorithm is basically consistent with the traditional algorithm while the quantization step has not yet been reduced, but once the quantization step is automatically reduced, the time consumption is significantly reduced.
The moving targets in experiments 1 and 2 moved from near to far along the depth of field. To further verify the validity of our algorithm, a video clip with an object moving from far to near was used. The results are shown in Fig. 6. In the tracking results of the traditional meanshift algorithm shown in Fig. 6 (A1)-(A4), the tracker focuses on only a local part of the vehicle, because the target is small in the initial state and the tracking window cannot be adaptively adjusted. In the tracking results of the proposed algorithm shown in Fig. 6 (B1)-(B4), the algorithm adaptively changes the quantization step and the target size, leading to a better tracking result. The comparison of the quantization steps of the histogram and the changes in average time consumption during the algorithm's run is given in Fig. 7. Fig. 7 clearly shows that, because the target is very small in the initial state, the time consumption with a quantization step of 32 in the proposed algorithm is clearly lower than that with the quantization step of 256 in the traditional algorithm. As the target moves from far to near, the quantization step adopted in the proposed algorithm increases with the target size, which increases the time consumption. When the target moves closer, the quantization step reaches 256, the same as in the traditional algorithm, and the time consumption of the proposed algorithm approaches that of the traditional algorithm. The comparison of the average frame processing time of the traditional meanshift algorithm and the proposed algorithm is depicted in Fig. 8, showing that the operating efficiency of the improved algorithm is improved by about 10%.
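The growth of the quantization step from 32 to 256 as the target approaches can be mimicked with a small sketch. The rule below is only a hypothetical reading of the scale-triggered update described in Section II and is not the authors' Eq. (1): the function `adapt_levels`, the weights `c` and `c_prime`, the limits 32 and 256 used as clamps, and the scale sequence are all assumptions.

```python
def adapt_levels(levels, scale, ref_scale, c=0.5, c_prime=0.5,
                 min_levels=32, max_levels=256):
    """One hypothetical adaptation check; returns (levels, ref_scale).

    If the target scale has shrunk to `c` times the reference scale, the
    number of quantization levels is multiplied by `c_prime`; if it has
    grown by the inverse factor, the levels are divided by `c_prime`.
    """
    ratio = scale / ref_scale
    if ratio <= c:
        levels = max(min_levels, int(levels * c_prime))
        ref_scale = scale  # restart the comparison from the new size
    elif ratio >= 1.0 / c:
        levels = min(max_levels, int(levels / c_prime))
        ref_scale = scale
    return levels, ref_scale

# A target approaching the camera: its scale grows frame by frame.
levels, ref = 32, 10.0
history = []
for scale in [10.0, 15.0, 21.0, 45.0, 90.0, 200.0]:
    levels, ref = adapt_levels(levels, scale, ref)
    history.append(levels)
```

With these assumed weights the level count doubles each time the scale doubles and saturates at the upper limit, so the cost of the adaptive variant converges to that of the fixed-step variant once the target is large.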

Fig. 1. Demonstration of the spatial position and video image of the target moving along the depth of field.

Fig. 4. Comparison of changes of quantization steps and average time-consumptions in experiment 1 and experiment 2. (A) Result of experiment 1; (B) Result of experiment 2.

Fig. 7. Comparisons of quantization steps and relative time consumptions in experiment 3.

IV. CONCLUSION
The target tracking algorithm based on traditional meanshift cannot dynamically adjust the quantization steps as the target size changes. An improved algorithm based on the traditional meanshift has been proposed in this work. Once the target size changes, the proposed algorithm adaptively adjusts the quantization step and uses DTW to carry out model matching, which increases the operating efficiency and computational flexibility of the tracking algorithm. The idea of the proposed algorithm is easy to understand and the modification is not complicated, which makes it feasible to apply in practice. The experimental results