The paper addresses the problem of aligning visual and auditory data using a sensor composed of a camera pair and a microphone pair. The original contribution of the paper is a method for audio-visual data alignment through estimation of the 3D positions of the microphones in the visual-centred coordinate frame defined by the stereo camera pair.
The paper can be downloaded here: Alignment of Binocular-Binaural Data Using a Moving Audio-Visual Target.
In order to collect and aggregate the relevant cues and to build and maintain hypotheses about the group state and composition, Bielefeld University developed a new component in the HUMAVIPS project called the GroupManager. This component receives results from several perception components (such as face detection/tracking, visual focus estimation, or face classification) as input cues, aggregates them (e.g., by generating sliding windows of historical information or combining several cues), and calculates several derived measures from this aggregated data. This serves as a stabilization and abstraction layer necessary for higher-level components that decide how the robot should adapt its behavior.
This video shows some highlights from an interactive demonstration incorporating the results from the GroupManager component.
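As a rough illustration of the aggregation idea described above, a sliding-window cue aggregator might look like the following sketch. The class, field names, and derived measures here are hypothetical stand-ins, not the actual HUMAVIPS GroupManager interfaces:

```python
from collections import deque
from dataclasses import dataclass
from statistics import mean

# Hypothetical per-frame cue observation, e.g. one face-detection result.
@dataclass
class CueObservation:
    timestamp: float   # seconds
    n_faces: int       # faces currently detected
    attending: int     # faces estimated to focus on the robot

class GroupManager:
    """Aggregates perception cues over a sliding time window and derives
    stabilized measures for higher-level behavior components."""

    def __init__(self, window_seconds: float = 5.0):
        self.window_seconds = window_seconds
        self.history: deque = deque()

    def add_observation(self, obs: CueObservation) -> None:
        self.history.append(obs)
        # Drop observations that have fallen out of the sliding window.
        while self.history and obs.timestamp - self.history[0].timestamp > self.window_seconds:
            self.history.popleft()

    def derived_measures(self) -> dict:
        if not self.history:
            return {"group_size": 0.0, "attention_ratio": 0.0}
        seen = sum(o.n_faces for o in self.history)
        attended = sum(o.attending for o in self.history)
        return {
            # Smoothed group size over the window, robust to per-frame detector noise.
            "group_size": mean(o.n_faces for o in self.history),
            # Fraction of detected faces estimated to attend to the robot.
            "attention_ratio": attended / seen if seen else 0.0,
        }

# Usage: feed per-frame detections, then query the smoothed group state.
gm = GroupManager(window_seconds=2.0)
gm.add_observation(CueObservation(timestamp=0.0, n_faces=2, attending=1))
gm.add_observation(CueObservation(timestamp=1.0, n_faces=3, attending=2))
print(gm.derived_measures())  # {'group_size': 2.5, 'attention_ratio': 0.6}
```

The sliding window trades responsiveness for stability: a shorter window reacts faster to people joining or leaving the group, while a longer one suppresses detector flicker.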
This paper addresses the task of mining typical behavioral patterns from small group face-to-face interactions and linking them to social-psychological group variables. The paper can be downloaded here: Linking Speaking and Looking Behavior Patterns with Group Composition, Perception, and Performance
Journal on Multimodal User Interfaces, Special Issue on Multimodal Corpora, published online Aug. 2012
Emergent leaders through looking and speaking: from audio-visual data to multimodal recognition
Proc. of the CogSys conference 2012
Robot-to-Group Interaction in a Vernissage: Architecture and Dataset for Multi-Party Dialog
IROS workshop on Human Behavior Understanding, Vilamoura 2012
Recognizing the Visual Focus of Attention for Human Robot Interaction
Outstanding Paper Award, Proceedings of the 14th ACM International Conference on Multimodal Interaction, Santa Monica, USA
Linking speaking and looking behavior patterns with group composition, perception, and performance
A Track Creation and Deletion Framework for Long-Term Online Multi-Face Tracking
IEEE Transactions on Image Processing, March 2013
Gaze estimation from multimodal Kinect data
CVPR Workshop on Face and Gesture and Kinect demonstration competition, Providence, USA, 2012
Given that, Should I Respond? Contextual Addressee Estimation in Multi-Party Human-Robot Interactions
Human-Robot Interaction (HRI) Conference, Tokyo.