Semantic Multimodal Analysis of Digital Media

Summary:

Recent developments in technology have made huge amounts of digital multimedia content available. These archives contain different types of data such as images, video, audio and text. Extracting semantic information from this multi-modal data is required for effective and efficient access and retrieval. The COST 292 project aims to develop useful multimedia systems that employ annotation and search technologies based on semantic concepts that are natural to the user. In line with these goals, we propose a system that extracts coherent spatio-temporal regions and their low-level features, including shape, texture, color and motion; performs feature-level fusion and associates these multi-modal descriptors for better representation of semantic information; segments the video into semantically meaningful portions, classifies them into semantic categories and mines the data to find interesting patterns; and integrates automatic and semi-automatic learning techniques to improve classification and retrieval performance and satisfy user needs. The proposed system will be integrated with other projects, and the developed technology will be evaluated in real-world scenarios.

People:

Faculty
- Pınar Duygulu (Co-principal investigator)
- Selim Aksoy (Co-principal investigator)
Graduate Students
- Özge Çavuş (M.S. student)
- Derya Özkan (M.S. student)

Duration:

October 2004 - October 2008

Sponsors:

- TUBITAK - Scientific and Technical Research Council of Turkey (Grant no: 104E077)
- European Commission COST 292 Action

Budget:

102,060 YTL (~US$64,000)

Publications:

Esra Ataer, Pinar Duygulu, "Matching Ottoman Words," in ACM International Conference on Image and Video Retrieval, Amsterdam, The Netherlands, July 9-11, 2007.
Demir Gokalp, Selim Aksoy, "Scene Classification Using Bag-of-Regions Representations," in IEEE International Conference on Computer Vision and Pattern Recognition, Beyond Patches Workshop, Minneapolis, Minnesota, June 23, 2007.
Tolga Can, Pinar Duygulu, "Reklamlarin Belirlenmesi ve Takibi (in Turkish)," in 15. IEEE Sinyal Isleme ve Iletisim Uygulamalari Kurultayi, Eskisehir, Turkey, June 11-13, 2007.
Esra Ataer, Pinar Duygulu, "Osmanlica Kelimeleri Esleme (in Turkish)," in 15. IEEE Sinyal Isleme ve Iletisim Uygulamalari Kurultayi, Eskisehir, Turkey, June 11-13, 2007.
Can Acar, Arda Atlas, Koray Cevik, Isa Olmez, Mustafa Unlu, Derya Ozkan, Pinar Duygulu, "Yuz Bulma Yontemlerinin Haber Videolari icin Sistematik Karsilastirmasi (in Turkish)," in 15. IEEE Sinyal Isleme ve Iletisim Uygulamalari Kurultayi, Eskisehir, Turkey, June 11-13, 2007.
Pinar Duygulu, Muhammet Bastan, Derya Ozkan, "Integrating image and text for semantic labeling of images and videos," in M. Cord, ed., Machine Learning for Multimedia, Springer-Verlag, 2007.
Jia-yu Pan, Hyungjeong Yang, Christos Faloutsos, Pinar Duygulu, "Cross-modal Correlation Mining using Graph Algorithms," in X. Zhu, I. Davidson, eds., Knowledge Discovery and Data Mining: Challenges and Realities with Real World Data, Idea Group Reference, 2007.
Derya Ozkan, Pinar Duygulu, "A Graph Based Approach for Naming Faces in News Photos," in IEEE Conference on Computer Vision and Pattern Recognition, New York, NY, June 17-22, 2006.
Ozge Cavus, Selim Aksoy, "Haber Videolarinda Ilgililik Geribeslemesiyle Icerik Tabanli Erisim (in Turkish)," in 14. IEEE Sinyal Isleme ve Iletisim Uygulamalari Kurultayi, Antalya, Turkey, April 17-19, 2006.
Derya Ozkan, Pinar Duygulu, "Yuz ve Isim Iliskisi Kullanarak Haberlerdeki Kisilerin Bulunmasi (in Turkish)," in 14. IEEE Sinyal Isleme ve Iletisim Uygulamalari Kurultayi, Antalya, Turkey, April 17-19, 2006.
Muhammet Bastan, Pinar Duygulu, "Haber Videolarinda Nesne Tanima ve Otomatik Etiketleme (in Turkish)," in 14. IEEE Sinyal Isleme ve Iletisim Uygulamalari Kurultayi, Antalya, Turkey, April 17-19, 2006.
Pinar Duygulu, Muhammet Bastan, "Translating images to words for recognizing objects in large image and video collections," in J. Ponce, M. Hebert, C. Schmid, A. Zisserman, eds., Towards Category-Level Object Recognition, Springer Lecture Notes in Computer Science Series, 2006.
Kobus Barnard, Pinar Duygulu, David Forsyth, "Exploiting Text and Image Feature Co-occurrence Statistics in Large Datasets," in R.C. Veltkamp, H.-P. Kriegel, L. Shapiro, eds., Trends and Advances in Content-Based Image and Video Retrieval, Springer Lecture Notes in Computer Science Series, 2005.
Selim Aksoy, Ozge Cavus, "A Relevance Feedback Technique for Multimodal Retrieval of News Videos," in EUROCON, Belgrade, Serbia & Montenegro, November 21-24, 2005.
Selim Aksoy, Akin Avci, Erman Balcik, Ozge Cavus, Pinar Duygulu, Zeynep Karaman, Pinar Kavak, Cihan Kaynak, Emre Kucukayvaz, Cagdas Ocalan, Pinar Yildiz, "Bilkent University at TRECVID 2005," in TREC Video Retrieval Evaluation (TRECVID), Gaithersburg, MD, November 14-15, 2005.
Giridharan Iyengar, Pinar Duygulu, Shaolei Feng, Pavel Ircing, Sanjeev Khudanpur, Dietrich Klakow, Matthew Krause, R. Manmatha, Harriet J. Nock, Desislava Petkova, Brock Pytlik, Paola Virga, "Joint Visual-Text Modeling for Automatic Retrieval of Multimedia Documents," in 13th ACM Multimedia Conference, Singapore, November 6-12, 2005.
Nazli Ikizler, Pinar Duygulu, "Person Search Made Easy," in 4th International Conference on Image and Video Retrieval, Singapore, July 20-22, 2005, also published in Lecture Notes in Computer Science, vol. 3568/2005.
Paola Virga, Pinar Duygulu, "Systematic Evaluation of Machine Translation Methods for Image and Video Annotation," in 4th International Conference on Image and Video Retrieval, Singapore, July 20-22, 2005, also published in Lecture Notes in Computer Science, vol. 3568/2005.
Demir Gokalp, Selim Aksoy, "Finding Faces in News Videos," in 4th International Workshop on Content-Based Multimedia Indexing, Riga, Latvia, June 21-23, 2005.
Nazli Ikizler, Pinar Duygulu, "Haber Videolari icin Yuz Bulma Yontemlerinin Iyilestirilmesi (in Turkish)," in 13. IEEE Sinyal Isleme ve Iletisim Uygulamalari Kurultayi, Kayseri, Turkey, May 16-18, 2005.
Selim Aksoy, Korhan Bircan, Selim Ciraci, Pinar Duygulu, Evren Karaca, Serdar Kasirga, Tarkan Sevilmis, Mustafa Sener, "Bilkent University at TRECVID 2004," in TREC Video Retrieval Evaluation (TRECVID), Gaithersburg, MD, November 15-16, 2004.