Person re-identification in camera networks consists of matching observations of individuals across the disjoint views of a network of surveillance cameras. The task is sometimes also referred to as multi-camera single-person tracking.
The appearance of individuals varies greatly across scenes because of differences in acquisition devices, ambient illumination, viewpoint, shadows, occlusions, and the pose or orientation of the person being searched for. The presence of other, similar-looking individuals in the scenes adds further ambiguity.
Re-identification methods can be roughly divided into single-shot and multiple-shot approaches. The former rely on a single observation of the individual being searched for, while the latter integrate information over time, using multiple views of the subject tracked in the video stream once an operator or an intelligent module has flagged the subject as a suspect. The features that describe the suspect (e.g., appearance) are used to build a "signature" of the person. The frames of the video streams captured by the other surveillance cameras are then analyzed, possibly only in restricted regions, to generate local signatures as well. These signatures are compared with the descriptor of the suspect and, where they are similar, likely locations are suggested. In effect, re-identification is an object detection task given one or a few examples. The main challenges are which features to consider and how to define the similarity between them.
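The signature-and-compare loop described above can be sketched as follows. This is only an illustration under stated assumptions: the per-channel color histogram used as signature, the Bhattacharyya coefficient used as similarity, the 0.9 threshold, and the function names `signature`, `similarity` and `rank_candidates` are all hypothetical choices, not the features or measure used in the project.

```python
import numpy as np

def signature(image, bins=8):
    """Build an appearance signature from an H x W x 3 uint8 image.

    Assumed feature: concatenated per-channel color histograms,
    normalized so that the whole signature sums to 1.
    """
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    sig = np.concatenate(hists).astype(float)
    return sig / sig.sum()

def similarity(sig_a, sig_b):
    """Bhattacharyya coefficient of two normalized signatures (1.0 = identical)."""
    return float(np.sum(np.sqrt(sig_a * sig_b)))

def rank_candidates(query_sig, candidate_sigs, threshold=0.9):
    """Return (candidate index, score) pairs at or above threshold, best first."""
    scores = [(i, similarity(query_sig, s)) for i, s in enumerate(candidate_sigs)]
    hits = [(i, s) for i, s in scores if s >= threshold]
    return sorted(hits, key=lambda t: t[1], reverse=True)

# Synthetic stand-ins for image crops (no real surveillance data).
rng = np.random.default_rng(0)
suspect = rng.integers(100, 160, size=(64, 32, 3), dtype=np.uint8)
# Same person seen by another camera: same colors plus small noise.
noise = rng.integers(-5, 6, size=suspect.shape)
same_person = np.clip(suspect.astype(int) + noise, 0, 255).astype(np.uint8)
# A different person wearing much darker colors.
other_person = rng.integers(0, 60, size=(64, 32, 3), dtype=np.uint8)

candidates = [signature(other_person), signature(same_person)]
matches = rank_candidates(signature(suspect), candidates)
print(matches)  # only the same-person candidate (index 1) should pass the threshold
```

In a real system the candidate signatures would come from detections in the other cameras' frames, and both the features and the similarity measure would need to be far more robust to the viewpoint and illumination changes listed above.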
In this project TeV developed a new single-shot re-identification module, to be integrated into the video surveillance system, that proposes re-identification hypotheses. It is a supervised method that learns a scoring function which, when applied to a pair of images, returns a score expressing the likelihood that they depict the same individual. The method is characterized by:
(i) the usage of a set of local image descriptors based on Fisher Vectors,
(ii) the training of a pool of scoring functions based on the local descriptors, and
(iii) the construction of a strong scoring function by means of an adaptive boosting procedure.
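Steps (ii) and (iii) can be illustrated with a small sketch of discrete AdaBoost over a pool of weak scoring functions, one per local region. Everything specific here is an assumption made for illustration: random vectors stand in for the Fisher Vector descriptors, each weak scorer is a simple threshold on the Euclidean distance between the two descriptors of a region, and the names `fit_stump`, `adaboost` and `strong_score` are hypothetical. It is not the project's actual training procedure.

```python
import numpy as np

def region_distances(desc_a, desc_b):
    """Per-region Euclidean distance; inputs have shape (n_pairs, n_regions, dim)."""
    return np.linalg.norm(desc_a - desc_b, axis=2)

def fit_stump(dist, y, w):
    """Find the threshold on one region's distance with minimal weighted error.

    The weak scorer outputs +1 (same person) if dist < threshold, else -1.
    """
    best_err, best_thr = np.inf, None
    for thr in np.unique(dist):
        pred = np.where(dist < thr, 1, -1)
        err = np.sum(w[pred != y])
        if err < best_err:
            best_err, best_thr = err, thr
    return best_err, best_thr

def adaboost(dists, y, n_rounds=5):
    """Discrete AdaBoost; returns a list of (region, threshold, alpha)."""
    n_pairs, n_regions = dists.shape
    w = np.full(n_pairs, 1.0 / n_pairs)  # uniform initial pair weights
    model = []
    for _ in range(n_rounds):
        fits = [fit_stump(dists[:, r], y, w) for r in range(n_regions)]
        r = int(np.argmin([err for err, _ in fits]))
        err, thr = fits[r]
        err = np.clip(err, 1e-10, 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(dists[:, r] < thr, 1, -1)
        w *= np.exp(-alpha * y * pred)  # up-weight misclassified pairs
        w /= w.sum()
        model.append((r, thr, alpha))
    return model

def strong_score(model, dists):
    """Weighted vote of the selected weak scorers; higher means 'same person'."""
    return sum(a * np.where(dists[:, r] < thr, 1, -1) for r, thr, a in model)

# Synthetic image pairs: 4 regions, 8-dimensional stand-in descriptors.
rng = np.random.default_rng(0)
n, R, D = 200, 4, 8
y = np.where(np.arange(n) % 2 == 0, 1, -1)  # +1 same person, -1 different
a = rng.normal(size=(n, R, D))
b = rng.normal(size=(n, R, D))  # unrelated descriptors by default
same = y == 1
# Only regions 0 and 1 are informative: same-person pairs share a noisy copy.
b[same, :2] = a[same, :2] + 0.7 * rng.normal(size=(int(same.sum()), 2, D))

dists = region_distances(a, b)
model = adaboost(dists, y)
scores = strong_score(model, dists)
accuracy = float(np.mean(np.where(scores > 0, 1, -1) == y))
print(model[0][0], accuracy)
```

The score returned by `strong_score` plays the role of the pairwise scoring function described above: it can be thresholded for a yes/no decision or used directly to rank candidate matches. The accuracy printed here is measured on the training pairs only, so it says nothing about generalization.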
This technology can be useful in various scenarios, for example in care-home monitoring systems that follow impaired or elderly patients as they move from one room to another, or in smart shopping applications that track customers.