Data compression in Java
In this tutorial, we'll be looking at how to compress data in Java using the
built-in compression library. The standard JDK includes the Deflater
class for general-purpose compression. This is an implementation of the DEFLATE
algorithm (actually a wrapper around the commonly-used zlib library), which
can reduce many types of data to between 20% and 50% of its original size.
We'll look at what types of data it favours in a moment.
If you need data compression and you have little development time, then
passing your data through Deflater will give some compression for many
types of data. On the next page, we'll delve straight in and look at
how to use Deflater to compress data in
Java "out of the box".
Reading common archive files
Java provides out-of-the-box support for reading GZIP files
and ZIP files, which are both based on the DEFLATE
algorithm. Unfortunately, for reading tar files in Java, reading encrypted
ZIP files, and indeed other types of archive, Java doesn't give out-of-the-box
support for these. For these latter types, you can use the
Java archive reader Arcmexer, which you can download free
of charge from this web site. This allows you to read encrypted ZIPs in Java,
along with tar and gzipped tar archives (tarballs).
Advanced uses: understand the compression algorithm
For more advanced cases, it can help to know a little about how the
compression algorithm works. We'll take an how
the Deflater works in broad terms.
Then we'll also see how to apply this knowledge:
if you have little bit more development time available,
it may be possible to transform your input data to improve the
compression ratio offered by the Deflater class.
We look at the case of imroving text compression by
using the FILTERED strategy
on pre-transformed data.
If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.