Graduation Year

2006

Document Type

Thesis

Degree

M.S.Cp.E.

Degree Granting Department

Computer Science and Engineering

Major Professor

S. Sarkar, Ph.D.

Co-Major Professor

Rangachar Kasturi, Ph.D.

Committee Member

Ravi Sankar, Ph.D.

Keywords

Video Surveillance, Sound Association, Auditory Scene Analysis, Auditory Object, Motion Detection

Abstract

Technological developments and innovations of the first forty years of the digital era have primarily addressed either the audio or the visual senses. Consequently, designers have primarily focused on the audio or the visual aspects of design. In the perspective of video surveillance, the data under consideration has always been visual. However, in light of the new behavioral and physiological studies which established a proof of cross modality in human perception i.e. humans do not process audio and visual stimulus separately, but percieve a scene based on all stimulus available, similar cues are being used to develop a surveillance system which uses both audio and visual data available. Human beings can easily associate a particular sound to an object in the surrounding. Drawing from such studies, we demonstrate a technique by which we can isolate concurrent audio and video events and associate them based on perceptual grouping principles. Associating sound to an object can form apart of larger surveillance system by producing a better description of objects.

We represent audio in the pitch-time domain and use image processing algorithms such as line detection to isolate significant events. These events and are then grouped based on gestalt principles of proximity and similarity which operates in audio. Once auditory events are isolated we can extract their periodicity. In video, we can extract objects by using simple background subtraction. We extract motion and shape periodicities of all the objects by tracking their position or the number of pixels in each frame. By comparing all the periodicities in audio and video using a simple index we can easily associate audio to video. We show results on five scenariosin outdoor settings with different kinds of human activity such as running, walking and other moving objects such as balls and cars.

Share

COinS