Automatic caption localization in compressed video

A.K Jain; Zhang Hongjiang; Zhong Yu

April 2000

Published by Institute of Electrical and Electronics Engineers, Inc.

We present a method to automatically localize captions in JPEG compressed images and the I-frames of MPEG compressed videos. Caption text regions are segmented from background images using their distinguishing texture characteristics. Unlike previously published methods, which fully decompress the video sequence before extracting the text regions, this method locates candidate caption text regions directly in the DCT compressed domain using the intensity variation information encoded in the DCT domain. Therefore, only a very small amount of decoding is required. The proposed algorithm takes about 0.006 second to process a 240 × 350 image and achieves a recall rate of 99.17 percent while falsely accepting about 1.87 percent nontext DCT blocks on a variety of MPEG compressed videos containing more than 2,300 I-frames.
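
The idea of reading intensity variation directly from DCT coefficients can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which coefficients or thresholds are used, so the choice of the first-row AC coefficients, the `max_freq` cutoff, and the threshold value are all assumptions made for illustration. In a real decoder the 8 × 8 coefficient blocks would come from the compressed stream; the hypothetical `dct2_block` helper exists only to build toy inputs.

```python
import math

N = 8  # block size used by JPEG and MPEG


def dct2_block(block):
    """Orthonormal 2-D DCT-II of an N x N block (list of lists).

    Only needed to build toy inputs; a decoder already has these coefficients.
    """
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def horizontal_energy(coeffs, max_freq=4):
    """Sum of |AC| coefficients in the first row (assumed texture measure).

    coeffs[0][v] for v >= 1 responds to intensity changes across columns,
    i.e. horizontal variation such as the strokes of a text line.
    """
    return sum(abs(coeffs[0][v]) for v in range(1, max_freq + 1))


def candidate_blocks(dct_blocks, threshold):
    """Mark each block whose assumed texture energy exceeds the threshold."""
    return [[horizontal_energy(c) > threshold for c in row]
            for row in dct_blocks]


# Toy input: one flat block and one block with alternating dark/bright columns.
flat = dct2_block([[128] * N for _ in range(N)])
stripes = dct2_block([[255 * (y % 2) for y in range(N)] for _ in range(N)])
mask = candidate_blocks([[flat, stripes]], threshold=50.0)  # threshold is arbitrary
```

Here the flat block has essentially no AC energy and is rejected, while the striped block is flagged as a candidate. A practical system would still need to group flagged blocks into regions and filter false positives, which this sketch does not attempt.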

© 2000 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.