In simple terms, visual tracking is the process of visually following a given object. This basic task is a fundamental problem in many computer vision applications, such as movement pattern analysis, animal surveillance, and robot navigation. With the increasing popularity of cameras, large amounts of video data are generated every day, yet the algorithms used to handle this visual information are far from sufficient. In addition, technical progress and the growing demand for unmanned aerial vehicles (UAVs) with autopilot capability increase the need for visual tracking in many practical applications. Visual tracking therefore remains one of the most interesting topics in computer vision.
Although various methods have been proposed for visual tracking, it remains an open problem because of its complexity. Tracking means following the target's motion, and the primary challenge is the inconsistency of the target's appearance. The target's appearance may change with illumination changes, non-rigid motion, and occlusion. In addition, a similar background may cause drift, such as switching from the targeted player to an untargeted one in a game. In real-life scenes, further problems such as scale change, fast motion, low resolution, and out-of-plane rotation make tracking even more challenging. Therefore, after several decades of research, visual tracking is still an active research topic with many unsolved problems.
In this dissertation, three tracking methods are proposed to deal with tracking problems for different targets and various scenarios. Beyond tracking in 2D images, this work also introduces a 3D space tracking model for an augmented reality application.
For simple tracking scenarios, an efficient tracking method based on distinctive colors and silhouette is proposed. The method represents and tracks the target with the colors that occur mostly on the target. This color representation is dynamic: it is updated as the background changes. The appearance model substantially reduces distractors in the background, and because color is invariant to shape change, it significantly alleviates the non-rigid deformation problem.
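The idea of keeping only the colors that occur mostly on the target can be illustrated with a small sketch. It is not the implementation used in this work: the bin count, the ratio-based weighting, and the function names `quantize`, `distinctive_color_weights`, and `likelihood_map` are assumptions chosen for illustration. Each quantized color receives a weight close to 1 when it appears mainly on the target and close to 0 when it appears mainly in the background.

```python
import numpy as np

def quantize(pixels, bins=8):
    """Map RGB pixels of shape (N, 3) to one color-bin index in [0, bins**3)."""
    q = (pixels.astype(np.int64) * bins) // 256
    return q[:, 0] * bins * bins + q[:, 1] * bins + q[:, 2]

def distinctive_color_weights(target_pixels, background_pixels, bins=8):
    """Weight each color bin by how exclusively it belongs to the target.

    Assumed weighting: t / (t + b), where t and b are the normalized
    target and background histograms. Bins seen in neither get weight 0.
    """
    n = bins ** 3
    t = np.bincount(quantize(target_pixels, bins), minlength=n).astype(float)
    b = np.bincount(quantize(background_pixels, bins), minlength=n).astype(float)
    t /= max(t.sum(), 1.0)
    b /= max(b.sum(), 1.0)
    denom = t + b
    return np.divide(t, denom, out=np.zeros(n), where=denom > 0)

def likelihood_map(image, weights, bins=8):
    """Look up the per-pixel target likelihood for an (H, W, 3) image."""
    h, w, _ = image.shape
    idx = quantize(image.reshape(-1, 3), bins)
    return weights[idx].reshape(h, w)
```

Recomputing the weights from the current target region and its surroundings in every frame would give the dynamic behavior described above, since a color that starts to appear in the background loses weight.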
Building on this idea, a unique feature vote tracking algorithm is further developed. This work divides the feature space into many small cells that store feature descriptions. If most of the descriptions in a cell come from the target, the features in that cell are treated as unique features. Besides counting how likely a feature is to come from the target, the method records each feature's location with respect to the target center so that the center can be reprojected in incoming frames. This voting mechanism keeps the tracker focused on the target despite occlusion and cluttered backgrounds.
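The cell-and-vote scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm as implemented in this work: the class name `VoteTracker`, the uniform grid with `cell_size`, the `min_ratio` threshold, and the median used to combine votes are all hypothetical choices.

```python
from collections import defaultdict
import numpy as np

class VoteTracker:
    """Toy feature-vote model: descriptor cells store counts and center offsets."""

    def __init__(self, cell_size=0.25, min_ratio=0.8):
        self.cell_size = cell_size      # assumed width of one feature-space cell
        self.min_ratio = min_ratio      # assumed share of target features for "unique"
        self.counts = defaultdict(lambda: [0, 0])  # cell -> [target, background]
        self.offsets = defaultdict(list)           # cell -> offsets to target center

    def _cell(self, descriptor):
        scaled = np.asarray(descriptor, dtype=float) / self.cell_size
        return tuple(np.floor(scaled).astype(int))

    def learn(self, descriptors, positions, on_target, center):
        """Record which cells the target and background features fall into."""
        center = np.asarray(center, dtype=float)
        for d, p, is_target in zip(descriptors, positions, on_target):
            cell = self._cell(d)
            if is_target:
                self.counts[cell][0] += 1
                self.offsets[cell].append(center - np.asarray(p, dtype=float))
            else:
                self.counts[cell][1] += 1

    def vote(self, descriptors, positions):
        """Let features in unique cells vote for the center; return the median vote."""
        votes = []
        for d, p in zip(descriptors, positions):
            cell = self._cell(d)
            if cell not in self.counts:
                continue
            t, b = self.counts[cell]
            if t / (t + b) < self.min_ratio:
                continue  # cell is dominated by background, so it does not vote
            for offset in self.offsets[cell]:
                votes.append(np.asarray(p, dtype=float) + offset)
        if not votes:
            return None
        return np.median(np.array(votes), axis=0)
```

Because every unique feature votes independently, the estimated center can survive partial occlusion as long as some unique features remain visible.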
Recently, deep learning and neural networks have shown powerful capabilities in computer vision applications. Neural networks, especially convolutional neural networks, have been successfully used in object recognition, detection, and tracking. This work uses a pre-trained network that has learned high-level semantic features to represent the target as a concept model. The concept is a combination of high-level features learned from myriad objects. With the concept, the network can generate a heat map of each incoming frame that shows the likely distribution of the target. Finally, a Siamese network is used to locate the target. The high-level semantic features are robust to general appearance changes and can retrieve the target in many complicated scenarios.
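One simple way to picture a concept as a combination of high-level features is a weighted sum of feature channels. The sketch below is only an illustration and makes several assumptions: the feature maps are taken as given arrays of shape (C, H, W) rather than computed by a real pre-trained network, the weighting rule (mean activation inside the target box minus mean activation outside it) is hypothetical, and the Siamese localization step is omitted.

```python
import numpy as np

def concept_weights(features, box):
    """Weight channels by how much more they respond on the target.

    features: array of shape (C, H, W), assumed to come from a pre-trained network.
    box: (top, left, bottom, right) in feature-map cells; it must not cover
         the whole map, so that an outside region exists.
    """
    top, left, bottom, right = box
    mask = np.zeros(features.shape[1:], dtype=bool)
    mask[top:bottom, left:right] = True
    inside = features[:, mask].mean(axis=1)
    outside = features[:, ~mask].mean(axis=1)
    w = np.clip(inside - outside, 0.0, None)  # keep target-specific channels only
    total = w.sum()
    return w / total if total > 0 else w

def heat_map(features, weights):
    """Combine the channels of a new frame into an (H, W) target heat map."""
    return np.tensordot(weights, features, axes=1)
```

Channels that respond everywhere receive zero weight, so the heat map of a new frame highlights only the regions that activate the target-specific channels.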
Beyond 2D tracking, 3D space tracking is more useful in many applications. To demonstrate this, this work uses a stereo camera to build a space tracking system for a surgical scalpel. Three LED lights are attached to the top of the scalpel to help track its tip. After registration among the cameras, the operating table, and the augmented reality device, the scalpel's motion can be displayed in space by the augmented reality device. This holographic display is advantageous in surgical operations for education and navigation. A localization experiment also demonstrates the accuracy of the 3D space tracking.
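Recovering the 3D position of a marker such as an LED from a calibrated stereo pair is commonly done by triangulation. The sketch below shows standard linear (DLT) triangulation as an illustration; whether this work uses this exact formulation is not stated, and the projection matrices `P1` and `P2` are assumed to be known from calibration. The helper `project` exists only to generate test observations.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 projection matrix P to pixel coordinates."""
    x = P @ np.append(np.asarray(X, dtype=float), 1.0)
    return x[:2] / x[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen at x1 and x2 in two cameras.

    Each observation contributes two linear equations in the homogeneous
    3D point; the solution is the right singular vector belonging to the
    smallest singular value.
    """
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

Triangulating each of the three LEDs gives three 3D points on the scalpel; locating the tip from them additionally requires the fixed geometry between the LEDs and the tip, which is not shown here.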
In summary, these research efforts offer different methods for visual tracking in various scenarios. They also demonstrate the usefulness of visual tracking in practical applications in both 2D and 3D space.