Can Newgen revolutionise document compression?
In a market already dominated by established compression schemes
like CCITT G4, JPEG, JBIG2 and LZW, the viability of a new product is open to question. But
with superior technology coupled with an ingenious business strategy,
document management and imaging solutions provider Newgen Software is geared up to give
tough competition to the existing image compression algorithms, says Shipra Arora
Given the enormous growth in data volumes, any advancement in
document compression technology is a milestone. Compressed images have become an important
business need because documents are constantly being archived, communicated and
manipulated in digital format, and there is a growing demand for instant access to
high-quality documents.
Newgen's recent innovation in colour document compression,
called Newgen Image File-Format (NIF), is the result of two-and-a-half years of research and
development by the Advanced Image Processing Group and is believed to
compress files by up to 300 times. Hareish Gur, group head & deputy general
manager, Advanced Imaging Group, Newgen Software Technologies, says that NIF will be both a
compression scheme and a file format (.NIF), just like JPEG. What will drive competition,
however, is the company's strategy of piggybacking on a competitor by integrating the NIF
compression scheme into Adobe's PDF format. These factors can help in establishing the
NIF brand name. The company is even considering patenting the technology, though it has
not yet taken a decision on this. With the beta version being released this month, the
technology will be commercially available by early next quarter.
What's new in NIF?
According to Gur, NIF is an open standard format that allows
ultra-high compression ratios for scanned colour and grayscale office documents without
losing text legibility and OCRability of the scanned document.
But the real benefit will depend on image quality, because
it has generally been observed that the higher the compression, the greater the distortion and
blockiness in the images. "NIF will not only offer the advantage of higher
compression but also the ability to do so without losing text and colour resulting in high
resolution," explains Gur.
Business benefits to users include cost savings in storage and
time savings in transferring colour documents over the Internet. This is
even more critical considering the bandwidth scenario, as a majority of users still
dial up for Internet connectivity. A smaller image size means that images can be
easily and quickly transmitted and viewed via standard Web browsers, making it more
efficient to scan, store, download and email mission-critical documents via
corporate intranets or the Internet. These are the benefits that the company will
have to explain to users to get them on board, says an industry expert.
How does it work?
The technology works on multi-layer compression. The scanned document
is separated into multiple layers: a layer containing high-resolution text (or hard
edges), a layer of low-resolution background, and another layer containing colours and soft
edges. Each layer is then compressed separately with the algorithm that yields the
best results for image size and clarity. The choice of algorithm rests on the analytical
strengths of the technology. The technology uses JPEG and JPEG 2000 for lossy compression
and CCITT G4 and JBIG2 for lossless compression.
(Graphics compression techniques are of two types: lossless and
lossy. Lossless techniques remove only redundant information, so the quality of the image
is unaffected, whereas lossy techniques reduce file size at some cost to image
quality.)
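The difference between the two types can be illustrated with a minimal Python sketch. It is not NIF's method: `zlib` (DEFLATE) stands in for a lossless scheme and a simple quantisation step stands in for a lossy one, and the function names and the `step` parameter are hypothetical, chosen only for illustration.

```python
import zlib

def lossless_roundtrip(data: bytes) -> bytes:
    """Compress and decompress with DEFLATE; the output equals the input."""
    return zlib.decompress(zlib.compress(data))

def lossy_quantise(data: bytes, step: int = 16) -> bytes:
    """Discard fine detail by snapping each sample down to a multiple of `step`.

    This is an illustrative stand-in for a lossy scheme, not JPEG.
    """
    return bytes((value // step) * step for value in data)

# A smooth ramp of 256 grey levels (0..255) as sample data.
samples = bytes(range(256))

restored = lossless_roundtrip(samples)   # identical to the original
approx = lossy_quantise(samples)         # only 16 distinct levels survive

# The quantised data is more repetitive, so it also compresses smaller.
size_original = len(zlib.compress(samples))
size_quantised = len(zlib.compress(approx))
```

The lossless round trip returns every byte unchanged, while the quantised copy can never be restored to the original: the fine gradations are gone, which is the trade-off the paragraph above describes.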
Based on end-user requirements, these schemes can be mixed and
matched without hindering interoperability. According to Gur, since all these
standards are open, no proprietary format is imposed, which boosts user
confidence. While the encoder for bitonal areas can encode losslessly, the
background layer is compressed lossily. The text layer is subjected neither to resolution
reduction nor to any lossy operation, resulting in a clear digital document which retains
the quality of the original scanned document at high compression ratios. The most critical
stage in creating a NIF/PDF file is separating the foreground,
background, colour and other parts via advanced image processing techniques known as
segmentation.
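The layered approach described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a fixed darkness threshold stands in for the segmentation step, whose actual algorithm the company has not detailed, and only two of the layers (text and background) are modelled. The function name `split_layers` and the `text_threshold` parameter are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows; 0 = black, 255 = white

def split_layers(page: Image, text_threshold: int = 64) -> Tuple[Image, Image]:
    """Split a page into a full-resolution bitonal text mask and a
    half-resolution background. A fixed threshold is an assumed stand-in
    for real segmentation."""
    height, width = len(page), len(page[0])
    # Text layer: 1 where a pixel is dark enough to be treated as ink.
    mask = [[1 if px < text_threshold else 0 for px in row] for row in page]
    background: Image = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            # Collect the non-ink pixels of each 2x2 block.
            block = [page[j][i]
                     for j in range(y, min(y + 2, height))
                     for i in range(x, min(x + 2, width))
                     if mask[j][i] == 0]
            # Average them; fall back to white if the block is all ink.
            row.append(sum(block) // len(block) if block else 255)
        background.append(row)
    return mask, background

# A tiny 4x4 "page": a dark vertical stroke on a light background.
page = [
    [200, 200, 10, 200],
    [200, 200, 10, 200],
    [180, 180, 180, 180],
    [180, 180, 180, 180],
]
mask, background = split_layers(page)
```

The mask keeps the stroke at full resolution, so it could be handed to a lossless bitonal coder such as CCITT G4 or JBIG2, while the background shrinks to a quarter of the pixels before any lossy coder is applied to it.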
Competitive scenario
There are three major players in the compression market, namely CCITT
G4, JPEG and LZW. While CCITT G4 is a black & white (B&W) compression standard,
the latter two are colour compression standards. As per industry estimates, almost 90
percent of the compression market is still dominated by the B&W standard because of
the higher costs and prohibitive file sizes involved in colour compression.
While on the one hand NIF will face competition from the B&W
document compression market, on the colour front it will have the JPEG and LZW compression
schemes to contend with. The 90 percent B&W market is also a potential market which
NIF can target and try to move towards colour. What will work to NIF's advantage is the
increasing adoption of colour document imaging technologies by the business world, with
the choice of storing colour documents at the same cost as the commonly used B&W standard (CCITT
G4-compressed TIFF). According to Gur, a NIF-compressed office document will
be almost the same size as a CCITT G4-compressed document.
On the colour front, JPEG has the disadvantage of being a lossy
compression scheme, which means that content/information is lost during
the compression process. The file is smaller but the quality suffers. As a result, the JPEG scheme
is not very readily used in document compression and medical imaging. Apart from this, the
JPEG compression scheme is only supported by file formats like .JPG, .JPEG and .JFIF.
Similarly, there are only three file formats, namely .GIF, .TIF and .PDF, which use LZW
compression. On the other hand, the first release of NIF itself will support file
formats like .NIF (its own file format), .PDF, .TIFF, .BMP, .PNG and .GIF, which means that
all these formats can be opened with the same viewer. This is done by converting the
various formats into .NIF or NIF-compressed .PDF formats.
Vis-a-vis the LZW compression scheme, NIF is an open standards-based
scheme. This means it will be available for all to read and implement and will create a
fair, competitive market for implementations of the standard, without locking the
customer into a particular vendor or group and thus maximising end-user choice. Being an open
standard, NIF will be free for all to implement, with no royalty or fee. Newgen will be
making a restricted version of NIF available as freeware on the Internet for individual
users. However, the SDK and advanced-level viewer will be priced. For
LZW, on the other hand, the joint patent owners CompuServe and Unisys have an agreement under which they
encourage GIF developers who use CompuServe as a distributor to pay a royalty
fee to Unisys. For each registered copy of a program that uses the LZW compression
technology, the developer pays 1.5% of the sale price of the program to CompuServe, or
$0.15, whichever is greater.
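The royalty rule quoted above amounts to a simple "percentage with a floor" calculation. The Python sketch below works it through using only the two figures from the text (1.5% and $0.15); the function name `lzw_royalty` and the sample prices are hypothetical.

```python
from decimal import Decimal

RATE = Decimal("0.015")     # 1.5% of the sale price
MINIMUM = Decimal("0.15")   # floor of $0.15 per registered copy

def lzw_royalty(sale_price: Decimal) -> Decimal:
    """Royalty per registered copy: 1.5% of the price or $0.15, whichever is greater."""
    return max(sale_price * RATE, MINIMUM)

cheap_tool = lzw_royalty(Decimal("5.00"))     # the $0.15 floor applies
pricey_tool = lzw_royalty(Decimal("40.00"))   # the 1.5% rate applies
break_even = lzw_royalty(Decimal("10.00"))    # both rules give the same amount
```

Under this rule the floor applies to any program priced below $10, and the percentage takes over above that price.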
However, what could make matters a little difficult for NIF is the
fact that the patent for LZW expires in June 2003, making it freely available.
This will mean that people will be able to use .GIF file formats, etc. without paying. But
this does not seem to deter Newgen. The company points out that the price tag won't
drive the competition. "Competition will largely depend on who offers better
compression schemes, both in terms of compression size and quality," adds Gur. LZW
cannot offer more than 8-bit colour per pixel for .GIF and .PDF. This is because 24-bit
colour compression tends to increase the size of the image. NIF, however, will be able to
offer 24-bit colour at an optimised size through the segmentation process. It will
also offer the 8-bit colour per pixel choice.
Business strategy
Breaking into the technology domain of CCITT G4, JPEG and LZW will,
however, not be easy. Despite its technological strength, NIF will find it tough to
establish itself among already established technologies. The failure of an almost similar
document compression offering from US-based company LizardTech (which had acquired the DjVu
colour document compression technology from AT&T Labs) to garner the expected market share
says enough about the tough market scenario and the competition Newgen will have to
face.
It suggests that, more than the technology, the company has to get its
business strategy right. Learning a lesson from LizardTech's case, Newgen's think tank has
tried to build more flexibility and risk capacity into its NIF strategy. NIF being an
open standard format, Newgen has decided to use it to leverage Adobe's PDF market
share as well, thereby increasing the scope of the addressable market. This means that not
only will Newgen be able to address the untapped market, but it will also cater to Adobe's
market. Gur explains that NIF technology complements Adobe's technology by packaging
multiple NIF layers into a PDF file, in compliance with Adobe's specs for PDF creation.
The resulting NIF-compressed PDF can be opened in Acrobat Reader. This will give
Newgen access to millions of desktops worldwide that have Acrobat Reader installed on them,
which otherwise wasn't possible. The advantage of a ready-made market
will make the company's business strategy less risky.
According to an expert, what ails LizardTech is its proprietary format,
due to which it has not been able to take on PDF's market share head on. Besides, the
pricing of DjVu was also prohibitively high ($20,000 for the SDK and about four cents per
document as a conversion fee). Learning from this, Newgen has kept its options open. It means
that the user has the choice of saving a file (BMP, TIFF, etc.) either in Newgen's
.NIF format or in Adobe's .PDF format (NIF-compressed). Though NIF-compressed PDF files
are slightly larger than the corresponding .NIF files, the conversion from .NIF to .PDF and
vice versa is a fast and non-lossy process. Typically, a 25 MB uncompressed BMP file (an A4,
300 DPI colour document), when converted into a NIF-compressed PDF file, will occupy
only about 20 KB more than the corresponding .NIF file (the size of the .NIF file being around
100 KB). This is where the company feels it scores on the strategy front and will
hit the competition the most.
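The figures quoted above can be checked with a little arithmetic. The Python sketch below assumes 24-bit colour (3 bytes per pixel), the standard A4 size of 210 x 297 mm, and 1 KB = 1,024 bytes; none of these assumptions is stated in the article, and the function name `raw_page_bytes` is hypothetical.

```python
MM_PER_INCH = 25.4
KB = 1024

def raw_page_bytes(width_mm: float, height_mm: float, dpi: int,
                   bytes_per_pixel: int) -> int:
    """Uncompressed size of a scanned page, ignoring file headers."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel

# A4 page at 300 DPI in 24-bit colour (assumed bit depth).
raw = raw_page_bytes(210, 297, 300, 3)
raw_mib = raw / (KB * KB)      # uncompressed size in megabytes

nif = 100 * KB                 # ".NIF file being around 100 KB"
pdf = nif + 20 * KB            # "about 20 KB more" for NIF-compressed PDF

nif_ratio = raw / nif          # compression ratio for .NIF
pdf_ratio = raw / pdf          # compression ratio for NIF-compressed PDF
```

Under these assumptions the uncompressed page comes to just under 25 MB, matching the article's figure, and the implied compression ratios are roughly 255:1 for .NIF and 212:1 for NIF-compressed PDF, both within the "up to 300 times" claim.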
The business strategy is also designed to generate multiple revenue
streams. The company's revenue model comprises the following:
- Bundling the technology with scanners
- Releasing viewers
- Releasing a development suite providing a wide range of tools and
APIs for integration into any third-party application.
The company is also looking at integrating the technology into its
mainline solution OmniDocs, a document management system, to begin with. The restricted
freeware version on the Internet will enable the company to get the possible users to
experiment with the technology, ultimately leading to adoption of the complete version.
The software package for viewing, generating and distributing
lightweight colour document images, which will be available as NIFView, is an image
viewer that supports opening and saving files in formats such as NIF, PDF, TIFF, BMP, PNG and GIF.
The company's other offering, NIF SDK, will allow integration of NIF
technology into all applications. It will include a collection of ActiveX controls,
automation servers, applets and platform-independent APIs for viewing, loading, saving,
extracting text layers, annotating, etc. External OCR engines can be fed the binary text
layer directly, enabling faster and more accurate results than with a thresholded or
binary-scanned image. (OCR engines normally work on binary images only.)
Bundling with scanners and MFDs will be another important revenue
generation model for the company. It is already in talks with scanner and MFD vendors
globally for bundling NIF technology with their product lines. The company has received
positive responses from various potential partners. "All big companies, whom we have
met and demonstrated the software, are gung-ho about it and want to bundle our technology
with their scanners and MFDs," says Gur. According to him, if the vendors pay a
royalty, the technology will be theirs; otherwise it will remain Newgen's brand.
Application Areas
Some of the application areas that the company will be targeting are
advertising, publication, distribution and scan-to-web applications and back file
conversion of colour publications and documents, workflow applications etc. It also caters
to a whole range of B2C and B2B applications, from financial record storage and
distribution to online publishing, online retailing, web publishing, e-book publishing.
The files in NIF format can be easily put onto the website or embedded in HTML documents.
Newgen will be targeting segments like BPO, telecom and insurance sectors, which are
likely to be early beneficiaries of NIF document compression technology. It will also be
targeting libraries, SME and SOHO segments.
Final Word
According to an IDC estimate, by 2004 there will be 19 million flatbed
scanners. This is the size of one of the potential markets for NIF technology. In addition,
the increasing thrust towards colour document imaging by enterprises will pave the way
for NIF technology. However, it's easier said than done. A lot will depend on how well
the company is able to market the technology and establish the NIF brand among
CCITT G4, JPEG and LZW. Still to come are the counter-strategies of the
competitors.