Title: "Unsupervised Discovery of Objects and Object Hierarchy in Video: Content Extraction Made Easy"
Abstract: Based on the bag-of-words representation, statistical models have recently become a popular approach to object discovery, i.e., extracting the "object of interest" from a set of images in a completely unsupervised manner. In this talk, we will outline this approach and extend it from still images to motion videos. We will propose a novel spatial-temporal framework that applies statistical models to both appearance modeling and motion modeling. The spatial and temporal models are integrated so that motion ambiguities can be resolved by appearance, and appearance ambiguities can be resolved by motion. In addition, with statistical modeling we can extract hierarchical relationships among objects, completely driven by data without any manual labeling. This framework finds application in video retrieval (e.g., for YouTube or Google Video) and video surveillance.
Bio: Tsuhan Chen has been with the Department of Electrical and Computer
Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, since
October 1997, where he is currently Professor and Associate Department Head.
From August 1993 to October 1997, he worked at AT&T Bell Laboratories,
Holmdel, New Jersey. He received the M.S. and Ph.D. degrees in electrical
engineering from the California Institute of Technology, Pasadena,
California, in 1990 and 1993, respectively. He received the B.S. degree in
electrical engineering from the National Taiwan University in 1987.
Title: "Instant Casting Movie System for Entertainment Revolution"
Abstract: Our research project, Dive into Movie (DIM), aims to build a new genre of interactive entertainment that enables anyone to easily participate in a movie by assuming a role and enjoying an embodied, first-hand theater experience. This is accomplished by replacing the original roles of a pre-created traditional movie with user-created, high-realism 3-D CG characters. A DIM movie is in some sense a hybrid entertainment form, somewhere between a game and storytelling. We hope that DIM movies might enhance interaction and offer more dramatic presence, engagement, and fun for the audience. Our work on DIM is ongoing, but its initial version, the Future Cast System (FCS), is up and running. In the initial version, we focus on creating high-realism 3-D CG characters of audience members with personal facial characteristics, replacing the original characters' faces in the original traditional (background) movie. The FCS has two key features. First, it can fully automatically create a CG character in a few minutes, from capturing the facial features of a user and generating her/his corresponding CG face to inserting the CG face into the movie in real time, without causing any discomfort to the participant. Second, the FCS makes it possible for multiple participants to take part in a movie at the same time in different roles, such as a family or a circle of friends. The FCS is not limited to academic research; 1.6 million people enjoyed an FCS entertainment experience at the Mitsui-Toshiba pavilion at the 2005 World Exposition in Aichi, Japan. I will introduce the ongoing DIM project and review the FCS experience at Expo 2005.
Bio: Dr. Shigeo Morishima is a professor in the Faculty
of Science and Engineering, Waseda University. He received the B.S., M.S.
and Ph.D. degrees, all in Electrical Engineering, from the University of
Tokyo, Tokyo, Japan, in 1982, 1984, and 1987, respectively.
Abstract: Research on media content analysis has made great progress over the years, from relying on a single medium to multimodal analysis, and from focusing only on intrinsic content to leveraging external information sources. This talk presents the M3 model, which consolidates progress to date into a unified model. M3 stands for the use of a multimodal, multi-source and multi-resolution approach to media content analysis. Past research reveals that intrinsic content by itself is insufficient to capture most media semantics; hence it is essential to supplement the analysis with external information sources such as the Web, ontologies and thesauri. Moreover, the types of medium and features relevant to information units, and the semantics that can be derived from these units, depend greatly on the units' granularity. In the context of news video, the information granularity can be at the shot, speech discourse and story level, and it is known that text and audio-visual features play different roles at different levels. We apply the M3 model to the domains of news video and arts. In both domains, the M3 model is used to perform automatic concept annotation and subsequent indexing and retrieval by leveraging semantics derived from different information modalities, sources and granularities. The talk discusses the design, implementation and deployment of the M3 model in these two domains.
Bio: Dr Tat-Seng Chua is a Professor at the School of Computing, National University of Singapore (NUS). He was the Founding Dean of the School of Computing from 1998 to 2000. He spent three years as a research staff member at the Institute of Systems Science (now I2R) in the late 1980s. Dr Chua's main research interest is in multimedia information processing, in particular the extraction, retrieval and question answering (QA) of video and text information. He focuses on the use of relations between entities and of external information and knowledge sources to enhance information processing. His current projects include precise information retrieval, intelligent local media search, question answering (QA), and information extraction on the web. His group participates regularly in the TREC-QA and TRECVID news video retrieval evaluations and has achieved top results. He obtained his PhD from the University of Leeds, UK. Dr Chua is active in the international research community. He has organized and served as a program committee member of numerous international conferences, and as an editorial board member of several journals, including The Visual Computer, Multimedia Tools and Applications, and ACM Transactions on Information Systems. He is the Conference Co-Chair of CIVR'2005, ACM Multimedia 2005, and ACM SIGIR 2008.