Video Indexing and Understanding
We have been developing Name-It, a system that associates faces and names
in news videos. The system is given news videos, which include image sequences
and transcripts obtained from audio tracks or closed caption texts. The
system can then either infer the name of a given face and output the name
candidates, or locate a face in news videos by name. To accomplish this
task, the system takes a multi-modal video analysis approach:
face sequence extraction/identification from videos,
name extraction from transcripts, and
video caption recognition.
Each method includes several advanced image and natural language processing
techniques: face tracking, face identification, intelligent name extraction
using a dictionary, thesaurus, and parser, text region detection, image enhancement,
character recognition, and the integration of these techniques. The success
of our experiments demonstrates the benefits of the multi-modal approach
for video analysis.
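The idea of associating faces with names can be illustrated with a minimal sketch. It is not the Name-It implementation: it assumes that face sequences and name mentions have already been extracted with timestamps, and it scores each face/name pair only by how long they overlap in time. The function names, the `window` parameter, and the sample data are all hypothetical.

```python
from collections import defaultdict

def overlap(a_start, a_end, b_start, b_end):
    """Length (in seconds) of the overlap between two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def cooccurrence_scores(face_tracks, name_mentions, window=2.0):
    """Score each (face_id, name) pair by temporal co-occurrence.

    face_tracks:   list of (face_id, start, end) for each face sequence
    name_mentions: list of (name, time) for each name found in the transcript
    window:        assumed half-width (seconds) around each mention
    """
    scores = defaultdict(float)
    for face_id, f_start, f_end in face_tracks:
        for name, t in name_mentions:
            scores[(face_id, name)] += overlap(f_start, f_end, t - window, t + window)
    return scores

def names_for_face(scores, face_id, top_k=3):
    """Face-to-name retrieval: name candidates ranked by score."""
    hits = [(s, n) for (f, n), s in scores.items() if f == face_id and s > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [n for _, n in hits[:top_k]]

def faces_for_name(scores, name, top_k=3):
    """Name-to-face retrieval: face sequences ranked by score."""
    hits = [(s, f) for (f, n), s in scores.items() if n == name and s > 0]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [f for _, f in hits[:top_k]]

# Hypothetical sample data, for illustration only.
face_tracks = [("face_A", 0.0, 10.0), ("face_B", 20.0, 30.0)]
name_mentions = [("CLINTON", 4.0), ("CHRISTOPHER", 24.0), ("CLINTON", 21.0)]

scores = cooccurrence_scores(face_tracks, name_mentions)
print(names_for_face(scores, "face_B"))
print(faces_for_name(scores, "CLINTON"))
```

The same score table serves both retrieval directions, mirroring the two uses described above. A real system would also need to weight the scores by face similarity and by the reliability of each name candidate and caption, which this sketch leaves out.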
Given: 5 hours of CNN Headline News
Figure 1. Name-to-Face Retrieval (given "CLINTON")
Figure 2. Face-to-Name Retrieval (given the face of Warren Christopher)
Name-It: Naming and Detecting Faces in News Videos,
Shin'ichi Satoh, Yuichi Nakamura, and Takeo Kanade,
IEEE MultiMedia, Vol. 6, No. 1, January-March, pp. 22-35, 1999.
Name-It: Association of Face and Name in Video
Shin'ichi Satoh and Takeo Kanade,
Proc. of CVPR'97, pp. 368-373, 1997.
(longer version: School of Computer Science, Carnegie Mellon University,
Name-It: Naming and Detecting Faces in Video
by the Integration of Image and Natural Language Processing,
Shin'ichi Satoh, Yuichi Nakamura, and Takeo Kanade,
Proc. of IJCAI-97, pp.
Last modified: Wed Mar 17 1999.