Squeezing more life out of bitonal files
- a study of black and white - Parts 1 and 2, and now 3
Posted May 1, 2003
Updated June 17, 2003
Don't miss this excellent new study in RLG DigiNews!
Includes coverage of DjVu and even mentions PlanetDjVu!
Here are some of the comments about DjVu extracted from the new Part 3, published June 15, 2003:
DjVu: a file format supporting several compression schemes for bitonal, gray level, and color images. It supports both lossy and lossless bitonal compression. The bitonal compression algorithm is called JB2 and is similar to JBIG2.
We fully tested Any2DjVu, a Web service that allows files of many different formats (including TIFF G4) to be uploaded and converted to the DjVu format.
We also tested cjb2, a bitonal DjVu encoder that is part of the DjVuLibre package, an open source implementation of DjVu. Cjb2 only converts single pages. Although DjVuLibre comes with a utility (called djvm) that combines single DjVu pages into multipage DjVu files, it does not support font learning across pages. Thus we tested cjb2 only for the encoding of single pages.
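For readers who want to try the single-page workflow described above, here is a minimal Python sketch that encodes each scan with cjb2 and then bundles the results with djvm. It assumes both DjVuLibre tools are on the PATH and that the scans are in a bitonal format your cjb2 build accepts (PBM, or TIFF where supported). The helper names (`cjb2_command`, `djvm_command`, `plan`, `run`) and the 300 dpi default are illustrative choices, not part of DjVuLibre.

```python
import subprocess
from pathlib import Path


def cjb2_command(src, dst, dpi=300, lossless=True):
    """Build the cjb2 command line for one bitonal page."""
    cmd = ["cjb2", "-dpi", str(dpi)]
    if not lossless:
        # cjb2 is lossless by default; -lossy turns on shape substitution.
        cmd.append("-lossy")
    cmd += [str(src), str(dst)]
    return cmd


def djvm_command(pages, bundle):
    """Build the djvm command that creates a bundled multipage document."""
    return ["djvm", "-c", str(bundle), *[str(p) for p in pages]]


def plan(scans, out_dir, bundle, dpi=300, lossless=True):
    """Return every command needed: one cjb2 call per scan, then one djvm call."""
    out_dir = Path(out_dir)
    pages = [out_dir / (Path(s).stem + ".djvu") for s in scans]
    cmds = [cjb2_command(s, p, dpi, lossless) for s, p in zip(scans, pages)]
    cmds.append(djvm_command(pages, bundle))
    return cmds


def run(cmds):
    """Execute the planned commands in order, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

Calling `run(plan(["p1.pbm", "p2.pbm"], "out", "book.djvu"))` would encode two pages and bundle them, provided the `out` directory already exists. Because each page is encoded in its own cjb2 call, no shapes are shared between pages, which is the limitation the study points out.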
There is also a commercial DjVu encoder, made by the format's owner, LizardTech, Inc., which we did not test. Currently available as part of LizardTech's Document Express 4.0, it is available in a trial version from LizardTech's Web site. The trial became available fairly late in our testing cycle and requires a special page cartridge (allowing the encoding of 250 pages) which we requested, but still had not received ten days later.
Note from PlanetDjVu: LizardTech's lack of response is typical. It appears that the company does not respond to anyone anymore about DjVu, a problem that apparently started in April of this year, following the firing of LizardTech CEO Bill Patterson. See our previous article.
DjVu: Any2DjVu is a great service for experimenting with the DjVu format. It's free, available from any Web browser and extremely flexible in the range of inputs it handles. Unfortunately, the results from it are merely suggestive of what DjVu is capable of, since the product behind the service isn't available for purchase.
Cjb2 is available for free, but has obvious limitations. Its best use would be to produce lossless DjVu versions of single page, text-only scans. Beyond that, its lossy compression is unexceptional and its cleaning routine damages halftone images.
We would like to have done at least some testing of LizardTech's Document Express 4.0, but, as mentioned earlier, we were unable to obtain the evaluation cartridge necessary to unlock the encoder. Nevertheless, we would encourage those interested in DjVu to download their own copy and try it out.
Note from PlanetDjVu: To try out DjVu, consider downloading DjVu Solo 3.1 from LizardTech instead of Document Express 4.0. You will not be stuck without an evaluation cartridge, and the DjVu files are actually better than those produced with Document Express 4.0. The files produced are slightly smaller than those from Document Express 4.0, and they are written as DjVu Version 21 files, so no upgrade notices will pop up for users who have versions of the DjVu viewer prior to 4.0. Additionally, the Version 24 DjVu files written by Document Express 4.0 are not fully compatible with DjVuLibre, which is another reason to use DjVu Solo 3.1. The interface for creating DjVu files is the same in Document Express 4.0 Editor and DjVu Solo 3.1.
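To check which format version a given file was written with, the version byte can be read from the file's INFO chunk. The sketch below is a rough illustration in Python, not a reference parser. It assumes the publicly documented DjVu layout (an `AT&T` magic, an IFF `FORM:DJVU` container, and an INFO chunk holding width, height, minor version, major version, and dpi), and it assumes that the "Version 21" and "Version 24" mentioned above correspond to the minor version byte. It handles single-page files only, and the function name `djvu_info` is made up for this example.

```python
import struct


def djvu_info(data: bytes) -> dict:
    """Read basic fields from the INFO chunk of a single-page DjVu file."""
    if data[:4] != b"AT&T" or data[4:8] != b"FORM":
        raise ValueError("not a DjVu (IFF) file")
    if data[12:16] != b"DJVU":
        raise ValueError("only single-page FORM:DJVU files are handled here")

    pos = 16  # first chunk starts right after the 'DJVU' form type
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack(">I", data[pos + 4:pos + 8])
        body = data[pos + 8:pos + 8 + size]
        if chunk_id == b"INFO":
            if len(body) < 8:
                raise ValueError("INFO chunk too short for this sketch")
            width, height = struct.unpack(">HH", body[:4])  # big-endian
            (dpi,) = struct.unpack("<H", body[6:8])         # little-endian
            return {
                "width": width,
                "height": height,
                "minor_version": body[4],  # assumed to be the 21 / 24 above
                "major_version": body[5],
                "dpi": dpi,
            }
        # IFF chunks are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    raise ValueError("no INFO chunk found")
```

For a file on disk, something like `djvu_info(open("page.djvu", "rb").read())` would return the fields; a multipage (`FORM:DJVM`) document would need its component pages unpacked first.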
Here is the DjVu section extracted from Part 2:
Overview. DjVu (pronounced like déjà vu) was developed by AT&T Labs in 1996 with the first publicly released products coming in 1998. DjVu is designed to be a comprehensive, all-in-one document solution, suitable for bitonal text as well as gray scale and color content. DjVu defines a document format and encompasses several different compression schemes. A layering scheme allows documents that combine text and continuous tone content to treat each separately for optimal compression and display. AT&T Labs sold the rights to DjVu to LizardTech, Inc. in 2000. The independent PlanetDjVu Web site is an excellent source of information on all things DjVu.
Advantages. Lossy (claimed visually lossless) and true lossless compression of bitonal images, both claiming considerably better compression than G4. Also handles gray scale and color. Viewer is available for all major platforms. Handles single- and multi-page documents. In December 2001 LizardTech released partial open source of the v3.0 DjVu Reference Library, and others further enhanced that library.
Disadvantages. DjVu is proprietary, though LizardTech makes available SDKs (software developer kits) to facilitate development of software for encoding and decoding. LizardTech will license the DjVu Reference Library only for noncommercial use. At this time LizardTech is the sole source of commercial DjVu products that adhere to the current standard. Though DjVu clearly has some very enthusiastic supporters, its adoption has been spotty. The DjVu Zone Web site (which has not been updated in over a year) includes an outdated list of current users. Two of the largest users cited, Heritage Microfilm's Historical Newspaper Archive and UMI's Early English Books Online, have abandoned display of DjVu images in favor of PDF and GIF, respectively. DjVu also offers limited metadata capacity. The format is not Web native and requires the installation of a browser plug-in for display purposes.