Semantic Multimodal Analysis of Digital Media


Recent technological developments have made huge amounts of digital multimedia content available. These archives contain different types of data, such as image, video, audio and textual information. Extracting semantic information from this multimodal data is required for effective and efficient access and retrieval. The COST 292 project aims to develop useful multimedia systems that employ annotation and search technologies based on semantic concepts that are natural to the user. In line with these goals, we propose a system that extracts coherent spatio-temporal regions and their low-level features, including shape, texture, color and motion; performs feature-level fusion and associates these multimodal descriptors for better representation of semantic information; segments the video into semantically meaningful portions, classifies them into semantic categories and mines the data to find interesting patterns; and integrates automatic and semi-automatic learning techniques to improve classification and retrieval performance and satisfy user needs. The proposed system will be integrated with other projects, and the use of the developed technology will be evaluated in real-world scenarios.



October 2004 - October 2008


TUBITAK - Scientific and Technical Research Council of Turkey (Grant no: 104E077)
European Commission COST 292 Action


102,060 YTL (~US$64,000)