Multimedia Data Mining: Integration of Multi-modal Data for a Better Retrieval and Analysis (CAREER Award)

Summary:

Large amounts of multi-modal data are already stored in many collections. This information cannot be accessed or exploited unless it is organized efficiently so that its semantics can be extracted. The proposed research will show how semantics can be learned from loosely related multi-modal data. Learning from such data is important because it is available in large quantities, whereas tightly related data is hard to obtain, since it can only be produced by manual labeling.

We present a new approach for multi-modal data sets, focusing on image and video collections with associated textual information. Learning the relationships between visual and textual information is an interesting example of multimedia data mining, particularly because data mining techniques are hard to apply to collections of images and videos. The approach is mainly based on modeling the joint distribution of visual and textual features.

First, we will focus on collections where visual features are attached to text: image collections annotated with keywords, news videos with speech recognition transcripts, and handwritten Ottoman documents with transcripts. We will propose novel methods for object recognition, face recognition, and handwriting recognition. The main contribution of the approach will be the ability to learn large numbers of concepts, which is usually limited in traditional computer vision approaches. The proposed system will enable efficient retrieval and browsing, as well as applications such as auto-annotation, auto-illustration, and auto-documentary.

Another focus of the project will be modeling information extracted from different sensors, including optical and infrared cameras and microphones. These data will be used to analyze the movements of patients and elderly people in hospitals and nursing homes. Abrupt changes in movement will be noticed immediately, and a daily summary report will be provided to the doctor.
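To make the idea of modeling the joint distribution of visual and textual features concrete, the following is a minimal illustrative sketch, not the project's actual model: it estimates a smoothed co-occurrence table P(word | region cluster) from keyword-annotated training images, then annotates a new image by summing these probabilities over its region clusters. All function names, the toy vocabulary, and the data are hypothetical.

```python
import numpy as np

def train_cooccurrence(images, n_blobs, vocab):
    """Estimate P(word | blob) from (blob_indices, keywords) training pairs.

    `images` is a list of (blobs, keywords) tuples, where `blobs` are
    cluster indices of an image's regions (hypothetical representation).
    """
    counts = np.zeros((n_blobs, len(vocab)))
    word_idx = {w: i for i, w in enumerate(vocab)}
    for blobs, keywords in images:
        for b in blobs:
            for w in keywords:
                counts[b, word_idx[w]] += 1
    # Add-one smoothing, then normalize rows to conditional probabilities.
    smoothed = counts + 1.0
    return smoothed / smoothed.sum(axis=1, keepdims=True)

def annotate(blobs, probs, vocab, top_k=2):
    """Score each word by summing P(word | blob) over the image's blobs."""
    scores = probs[list(blobs)].sum(axis=0)
    order = np.argsort(scores)[::-1][:top_k]
    return [vocab[i] for i in order]

# Toy data: 4 region clusters, 4 keywords.
vocab = ["sky", "grass", "tiger", "water"]
train = [
    ([0, 1], ["sky", "water"]),
    ([0, 2], ["sky", "grass"]),
    ([3, 2], ["tiger", "grass"]),
]
P = train_cooccurrence(train, n_blobs=4, vocab=vocab)
print(annotate([2], P, vocab, top_k=1))  # → ['grass']
```

A real system of this kind would replace the hand-built toy table with visual features quantized from segmented image regions and a much larger vocabulary, but the retrieval-time step stays the same: rank words by their probability given the observed visual evidence.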

People:

Duration:

April 2005 - April 2010

Sponsor:

TUBITAK - Scientific and Technical Research Council of Turkey (Grant no: 104E065)

Budget:

157,000 YTL (~US$90,000)

Publications: