SALSA Dataset

SALSA (Synergetic sociAL Scene Analysis) is a dataset of uninterrupted recordings of an indoor social event involving 18 subjects over 60 minutes. It serves as a rich and extensive repository for the behavioral analysis and social signal processing communities. In addition to the raw multimodal data, SALSA provides position, pose and F-formation annotations over the entire event duration for evaluation purposes, as well as information on the participants' personality traits.

  • Scenario and roles. SALSA was recorded in a regular indoor space. The captured social event involved 18 participants and consisted of two parts of roughly equal duration. The first part was a poster presentation session in which four research studies were presented by graduate students, with a fifth person chairing the session. In the second part, all participants were free to interact over food and beverages during a cocktail party.
  • Sensors. The data were captured by a camera network and by wearable badges worn by the targets. The camera network comprised four synchronized static RGB cameras (1024×768 resolution) operating at 15 frames per second (fps). During the recordings, each participant wore a sociometric badge: a 9×6×0.5 cm box equipped with four sensors, namely a microphone, an infrared (IR) beam and detector, a Bluetooth detector and an accelerometer.
  • Annotations. Using a dedicated multi-view scene annotation tool, the position, head orientation and body orientation of each target were annotated every 45 frames (3 seconds). The annotated positions and head/body orientations were used to deduce F-formations. Prior to data collection, all participants filled in the Big Five personality questionnaire. The Big Five questionnaire owes its name to the five traits it assumes to be constitutive of personality: Extraversion; Agreeableness; Conscientiousness; Emotional Stability; Creativity.
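
The annotation interval follows directly from the stated frame rate: 45 frames at 15 fps is 3 seconds. As a minimal sketch of that arithmetic, the snippet below converts frame indices to timestamps and lists the frames that would carry annotations. The function names are hypothetical, and it assumes 0-based frame indices with the first annotation on frame 0, which the description above does not specify.

```cpp
#include <cassert>
#include <vector>

// Values stated in the dataset description.
constexpr int kFps = 15;               // camera frame rate (frames per second)
constexpr int kAnnotationStride = 45;  // frames between consecutive annotations

// Time in seconds of a video frame (0-based indexing is an assumption).
constexpr double frame_to_seconds(int frame) {
    return static_cast<double>(frame) / kFps;
}

// Frame indices that would carry annotations within the first `duration_s`
// seconds, assuming the first annotation falls on frame 0 (an assumption).
std::vector<int> annotated_frames(double duration_s) {
    assert(duration_s >= 0.0);
    std::vector<int> frames;
    const int last = static_cast<int>(duration_s * kFps);
    for (int f = 0; f <= last; f += kAnnotationStride) {
        frames.push_back(f);
    }
    return frames;
}
```

If the released annotation files use a different starting frame or index base, only the loop's start value needs to change.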

SALSA Dataset is published with the paper

Copyright: SALSA Dataset is published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. This means that you must attribute the work in the manner specified by the authors (i.e. citing our paper), you may not use this work for commercial purposes and if you alter, transform, or build upon this work, you may distribute the resulting work only under the same license.

Download Part 1, PosterSession

Download Part 2, CocktailParty

Camera Calibration

  • Calibration files for CAM1, CAM2, CAM3 and CAM4, generated using a checkerboard pattern and OpenCV
  • C++ code to compute the image projection of a 3D point from the calibration files
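
For readers who want to see what such a projection involves before downloading the code, here is a minimal sketch of the standard pinhole model, u = fx·Xc/Zc + cx and v = fy·Yc/Zc + cy with Xc = R·X + t. It is not the code distributed with the dataset. The `Camera` struct, its field names and the `project` function are hypothetical, the layout of the calibration files is not assumed, and lens distortion is ignored.

```cpp
// Hypothetical container for one camera's parameters; filling it from the
// dataset's calibration files is left out because their layout is not assumed.
struct Camera {
    double fx, fy, cx, cy;  // intrinsics in pixels
    double R[3][3];         // world-to-camera rotation
    double t[3];            // world-to-camera translation
};

struct Pixel {
    double u, v;
};

// Projects the 3D world point X into the image with the pinhole model.
// Lens distortion is ignored (a simplifying assumption).
// Returns false when the point does not lie in front of the camera.
bool project(const Camera& cam, const double X[3], Pixel& out) {
    double Xc[3];
    for (int i = 0; i < 3; ++i) {
        Xc[i] = cam.t[i];
        for (int j = 0; j < 3; ++j) {
            Xc[i] += cam.R[i][j] * X[j];
        }
    }
    if (Xc[2] <= 0.0) {
        return false;  // behind the camera or on its principal plane
    }
    out.u = cam.fx * Xc[0] / Xc[2] + cam.cx;
    out.v = cam.fy * Xc[1] / Xc[2] + cam.cy;
    return true;
}
```

With an identity rotation, zero translation and a principal point of (512, 384), a point on the optical axis projects to pixel (512, 384), the center of a 1024×768 image. If the calibration files include distortion coefficients, OpenCV's `cv::projectPoints` handles them and is the more complete choice.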

README

Publications

X. Alameda-Pineda, Y. Yan, E. Ricci, O. Lanz, N. Sebe: Analyzing Free-standing Conversational Groups: A Multimodal Approach. ACM MultiMedia, 2015. (Best Paper Award)

E. Ricci, J. Varadarajan, R. Subramanian, S. Rota Bulò, N. Ahuja, O. Lanz: Uncovering Interactions and Interactors: Joint Estimation of Head, Body Orientation and F-formations from Surveillance Video. International Conference on Computer Vision (ICCV), 2015. (Oral)

X. Alameda-Pineda, J. Staiano, R. Subramanian, L. Batrinca, E. Ricci, B. Lepri, O. Lanz, N. Sebe: SALSA: A Novel Dataset for Multimodal Group Behavior Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015. (arXiv version)