Binary Image Compression using Conditional Entropy-based Dictionary Design and Indexing
- Yandong Guo
- Dejan Depalov
- Peter Bauer
- Brent Bradburn
- Jan P. Allebach
- Charles A. Bouman
Proc. SPIE 8652, Color Imaging XVIII: Displaying, Processing, Hardcopy, and Applications, 865208
The JBIG2 standard is widely used for binary document image compression, primarily because it achieves much higher compression ratios than conventional facsimile encoding standards such as T.4, T.6, and T.82 (JBIG1). A typical JBIG2 encoder works in three steps. First, it separates the document into connected components, or symbols. Next, it creates a dictionary by encoding a subset of the symbols from the image. Finally, it encodes all the remaining symbols using the dictionary entries as references.
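The three encoder steps described above can be illustrated with a minimal Python sketch. It is not a JBIG2 encoder and produces no bitstream: it only extracts 8-connected components and assigns each one to a dictionary entry using a simple XOR (Hamming) dissimilarity, which stands in for a conventional similarity measure. The function names, the greedy one-pass strategy, and the `threshold` parameter are assumptions made for this illustration.

```python
from collections import deque


def extract_symbols(image):
    """Return cropped bitmaps of the 8-connected foreground components (1 = black)."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    symbols = []
    for r in range(h):
        for c in range(w):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Crop the component to its bounding box.
            top = min(p[0] for p in pixels)
            left = min(p[1] for p in pixels)
            bottom = max(p[0] for p in pixels)
            right = max(p[1] for p in pixels)
            bitmap = [[0] * (right - left + 1) for _ in range(bottom - top + 1)]
            for y, x in pixels:
                bitmap[y - top][x - left] = 1
            symbols.append(bitmap)
    return symbols


def xor_distance(a, b):
    """Fraction of differing pixels; infinite when the bitmap sizes differ."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return float("inf")
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / (len(a) * len(a[0]))


def build_dictionary(symbols, threshold=0.1):
    """Greedy one-pass dictionary (hypothetical strategy, not the paper's design).

    Returns the dictionary and, for every symbol, the index of the entry
    it would be encoded against.
    """
    dictionary, references = [], []
    for s in symbols:
        best_index, best_dist = None, float("inf")
        for i, entry in enumerate(dictionary):
            d = xor_distance(entry, s)
            if d < best_dist:
                best_index, best_dist = i, d
        if best_index is not None and best_dist <= threshold:
            references.append(best_index)      # reuse an existing entry
        else:
            dictionary.append(s)               # symbol becomes a new entry
            references.append(len(dictionary) - 1)
    return dictionary, references
```

In this toy setup, two identical glyphs on a page share one dictionary entry, while a glyph of a different shape starts a new one. A real encoder would additionally entropy-code the dictionary bitmaps, the symbol positions, and the residual between each symbol and its reference.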
In this paper, we propose a novel method for measuring the distance between symbols based on conditional entropy estimation (CEE). The CEE distance measure is used both to construct the dictionary and to index its entries. The advantage of the CEE distance measure over conventional measures of symbol similarity is that it provides a much more accurate estimate of the number of bits required to encode a symbol. In experiments on a variety of documents, we demonstrate that incorporating the CEE distance measure reduces the overall bitrate of the JBIG2 encoded bitstream by approximately 14% compared with the best conventional dissimilarity measures.
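The idea of scoring a candidate dictionary entry by the estimated number of bits needed to encode a symbol given that entry can be sketched as follows. The abstract does not specify how the CEE estimator is built, so everything below is an assumption made for illustration: the three-pixel context (left and upper neighbors in the symbol, co-located pixel in the reference), the adaptive counts with a Laplace prior, and the function names are all hypothetical and are not the authors' method.

```python
import math


def pixel(bitmap, y, x):
    """Pixel value at (y, x), treating everything outside the bitmap as white (0)."""
    if 0 <= y < len(bitmap) and 0 <= x < len(bitmap[0]):
        return bitmap[y][x]
    return 0


def estimated_bits(symbol, reference):
    """Estimate the code length of `symbol` given `reference` in bits.

    Assumed model: each pixel is predicted from a context made of its left
    and upper neighbors in the symbol and the co-located reference pixel.
    Probabilities come from adaptive counts that start at 1 (Laplace prior).
    """
    counts = {}  # context -> [times a 0 followed, times a 1 followed]
    bits = 0.0
    for y in range(len(symbol)):
        for x in range(len(symbol[0])):
            context = (
                pixel(symbol, y, x - 1),
                pixel(symbol, y - 1, x),
                pixel(reference, y, x),
            )
            c = counts.setdefault(context, [1, 1])
            value = symbol[y][x]
            bits += -math.log2(c[value] / (c[0] + c[1]))
            c[value] += 1  # update the model after coding the pixel
    return bits


def best_entry(symbol, dictionary):
    """Index of the dictionary entry with the lowest estimated code length."""
    return min(range(len(dictionary)),
               key=lambda i: estimated_bits(symbol, dictionary[i]))
```

Under this assumed model, a reference that matches the symbol makes every context deterministic, so the estimated cost is lower than with an uninformative (all-white) reference. Unlike a pixel-difference count, the score is expressed directly in bits, which is the property the abstract attributes to the CEE distance measure.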