Multi-camera people tracking datasets

We provide several datasets for people tracking in multi-camera environments.

  • SALSA dataset contains uninterrupted multimodal recordings of an indoor social event involving 18 subjects over 60 minutes. The camera network used to acquire the video data comprised four synchronized static RGB cameras (1024×768 resolution) operating at 15 frames per second (fps).
    Details and download: SALSA dataset
  • HALLWAY dataset contains recordings taken with four cameras in a large open space equipped with chairs, desks and poster walls. The cameras are about 15 meters apart, and people’s bodies may appear in some views at resolutions as low as 15×60 pixels. The sequence is about 5 minutes long, with people entering the area, moving, standing still, forming groups and interacting with each other, sitting in chairs and leaving the area. The maximum number of people in the scene at the same time is 9.
    The cameras are frame-synchronized, the frame rate of the sequence is 15 fps and the image resolution is 800×600 (JPEG format). Ground-truth files (ground positions of people, obtained by manual labelling) and camera calibration, together with C++ code for using it, are also provided.
    Details and download: HALLWAY dataset
  • VIPT dataset contains two recordings taken with four cameras installed in the corners of a lab environment of dimensions 5×6 meters. Up to four subjects move around in an unevenly illuminated environment. The sequence sharp-illumination-edge (77 seconds) is characterized by a sharp illumination discontinuity produced by a directional light source (Balcar Fluxlite illuminator). In the sequence changing-illumination (193 seconds) these illumination conditions change over time.
    The cameras are frame-synchronized, the frame rate of the sequences is 15 fps and the image resolution is 1024×768 (JPEG format). Camera calibration, together with C++ code for using it, is provided. For sharp-illumination-edge, ground-truth files (ground positions of people, obtained by manual labelling) are also provided.
    Details and download: VIPT dataset
  • LAB dataset contains recordings taken with four cameras installed in the corners of a lab environment of dimensions 5×6 meters. The sequence is about 3.5 minutes long, with people entering the lab, walking around, sitting and leaving the area at random. The maximum number of people in the scene at the same time is 7.
    The cameras are frame-synchronized, the frame rate of the sequence is 15 fps and the image resolution is 640×480 (JPEG format). Ground-truth files (ground positions of people, obtained by manual labelling) and camera calibration, together with C++ code for using it, are also provided.
    Details and download: LAB dataset
  • MVPDT dataset contains two recordings taken with four cameras installed in the corners of a lab environment of dimensions 5×6 meters. Seq1 is about 3.5 minutes long, with only one person moving in the room. Seq2 is about 7 minutes long, with up to three people moving in the room.
    The cameras are frame-synchronized, the frame rate of the sequences is 15 fps and the image resolution is 1024×768 (JPEG format). Ground-truth files (ground positions of people, obtained by manual labelling) and camera calibration, together with C++ code for using it, are also provided.
    Details and download: MVPDT dataset
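
As a rough guide to download size and processing time, the frame counts can be estimated from the durations and the 15 fps frame rate listed above. The Python sketch below is illustrative only and is not part of the datasets or of the provided C++ code. The function name and the dictionary are hypothetical, and the durations given as "about N minutes" are rounded, so the results are estimates rather than exact frame counts.

```python
# Rough frame-count estimate from the figures quoted in the dataset descriptions.
# Assumption: every sequence runs at 15 fps on four synchronized cameras.
FPS = 15
NUM_CAMERAS = 4

# Durations in seconds. Entries described as "about N minutes" are approximate.
DURATIONS_S = {
    "SALSA": 60 * 60,
    "HALLWAY": 5 * 60,
    "VIPT sharp-illumination-edge": 77,
    "VIPT changing-illumination": 193,
    "LAB": 210,
    "MVPDT Seq1": 210,
    "MVPDT Seq2": 7 * 60,
}


def estimate_frames(duration_s, fps=FPS, num_cameras=NUM_CAMERAS):
    """Return (frames per camera, frames over all cameras) for one sequence."""
    per_camera = int(round(duration_s * fps))
    return per_camera, per_camera * num_cameras


if __name__ == "__main__":
    for name, duration in DURATIONS_S.items():
        per_cam, total = estimate_frames(duration)
        print(f"{name}: ~{per_cam} frames per camera, ~{total} in total")
```

For example, the 77-second sharp-illumination-edge sequence corresponds to 77 × 15 = 1155 frames per camera.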