LZCL: An embedded compression library

Published 2016-08-29

Here’s an embedded compression format I designed to meet certain requirements. It’s small, fast, and efficient.

LZCL: An embedded compression library

Background

See my Compression Notes article for compression background on useful for this article. The README at the GitHub page has usage details .

For a recent project, I needed to design a low resource compression scheme. Over the years I’ve designed many, and this time I decided to unify some code, clean it up, and make some tools to ease development of such algorithms (mostly for my own use). The result this particular time is several very low memory decoders, necessarily slow, including Huffman, Arithmetic, LZ77, and my own new invention, LZCL, which outperforms the others quite well. All are designed for minimal RAM usage.

Implementation

LZ77 style compression creates a list of decisions (literal or token), a list of literals, and a list of tokens (each a distance and length pair). The lengths are skewed to short, and the distances slightly less so, so are amenable to compression. Often Huffman is used for one or both.

My method, LZCL, tries many types of compression on each stream (decisions,literal,distance,length) as well as on packed tokens (token = (length-minlength)*(maxdistance+1)+distance) and on decision runs (sort of a run-length-encoding for decisions), and selects the shortest for each. Compressors include Huffman, arithmetic, Golomb, and Binary Adaptice Sequential Coding (BASC). All header values are universal coded to avoid worrying about varying symbol lengths or other sizes.

All the choices are made for low memory decoding situations.

Decompression

The library is a single C file, Decompressor.c, with a single header file Decompressor.h, that implements multiple compressors in around 750 lines. <inttypes.h> is used to make integer sizes fixed by using, e.g., uint32_t wherever possible. Four main compressors, Huffman, Arithmetic, LZ77 (actually a LZSS variant), and my LZCL are implements. Each can be #ifdef’d out in the header.

All decompressors use variants of my Lomont1 universal coder to store varying length items such as data sizes, bit lengths, etc., since testing this against other universal coders showed mine worked better for common data uses.

Each decompressor includes a one pass to buffer form, and an incremental form using as low memory as possible.

Compression is generally not very fast, since the algorithms are designed to be easily extended, munged, and used to create more formats if needed. Decompression code is designed to be small, not fast, for the use case I wrote this code for. Decompressor.c fits all four decompression routines into a self contained 750ish line C file (~100 for Huffman, ~200 for arithmetic, ~75 for LZ77, ~250 LZCL, rest support).

Common code

The library implements a few utility routines, such as a function to compute bits required to store a non-negative integer $N$, $\beta(N)=\lfloor\log_2 N\rfloor+1$. There is a few functions to read bits from a bitstream, and a Lomont1 universal codec decoder.

Huffman

The Huffman coder is non-adaptive, in order to make low memory usage possible. Instead the symbol table is stored (in canonical format, see my compression format), and is parsed to decode each symbol. This, of course, makes the decompression slower than using a memory table, but this was the choice I needed.

The header is stored in fields, each adaptively stored to handle nay size

bits per symbol (usually 8, not always)
bits per codelength (usually 4-6)
Min length of a codeword (usually 1-4)
delta codeword length (usually 9-16)

Then the symbol table, which stores the number $N_j$ of codewords of length $L_j$, then $N_j$ symbols for these codewords. This is enough to decode.

Arithmetic

Similar to the Huffman version, this is designed to use as little RAM as possible, so the symbol table (which keeps symbols and frequencies needed for arithmetic decoding) in a special table format. This table stores the min and max symbols present, then for each in range, stores a Binary Adaptive Sequential Coded string of the values. The first value is Lomont1 encoded. When a value is needed to be looked up for decoding, the BASC sequence is walked till it is found.

Another problem with arithmetic decoders is knowing when to stop, and a common solution is to add a new End Of File token to the probabilities, which increases the overhead a little. I chose instead to store the length of the output, and the decoder outputs symbols until enough are decoded.

LZ77

The LZ77 coder is a LZSS variant, using a bit to decide if the next item is a literal or a (distance,length) pair.

After the header, a bit is read, and if 0 a symbol is decoded (of fixed minimal length), else if the bit was 1, a token is decoded, which is an integer, also of minimal length over the entire stream. The distance and length are decoded via

1
2


    uint32_t length = token / (actualMaxDistance + 1) + actualMinLength;
    uint32_t distance = token % (actualMaxDistance + 1);

Then a run of length bytes are output starting distance back in the buffer.

For slow memory, a user-supplied buffer of sufficient length is used cyclically.

LZCL

LZCL is a LZ77 variant. As such, it has to deal with decisions, literals, and (distance,length) tokens.

Decisions: 0/1 to tell next token type
Literals: single bytes encoded
Tokens: each is a distance, length pair, encoded in one of two methods

The compressor separates each into different streams, tries to compress each variant with each available compressor, and stores the minimal version.

The compression algorithms are currently

Fixed length
Huffman
Arithmetic
Golomb (see my Compression Notes article) with truncated coding

Decisions are considered either as 1 bit per decision, or as a list of runs of consecutive same values (basic run length encoding). Each of these is tested across each compressor.

Tokens, which are (distance,length) pairs, are considered as two separate streams or as one with the pair encoded as length*(maxDistance+1)+distance. All these are tested against each compressor.

Literals are similarly tested across each compressor.

The result is the minimal stream obtainable from the current methods.

All formats contribute to the LZCL results on the corpus tests below.

Results

On the Calgary and Cantebury corpus (selected since I am using smaller files for the embedded work - for larger corpii/corpuses/corp[.]+ use larger datasets), here are the results

Filename	File size	Arithmetic ratio	Huffman ratio	Lz77 ratio	LZCL ratio
Calgary\bib	111261	0.651	0.655	0.574	0.435
Calgary\book1	768771	0.566	0.570	0.661	0.486
Calgary\book2	610856	0.599	0.603	0.576	0.420
Calgary\geo	102400	0.709	0.711	0.898	0.645
Calgary\news	377109	0.649	0.654	0.648	0.454
Calgary\obj1	21504	0.755	0.759	0.648	0.527
Calgary\obj2	246814	0.784	0.787	0.484	0.361
Calgary\paper1	53161	0.626	0.629	0.535	0.426
Calgary\paper2	82199	0.577	0.580	0.571	0.448
Calgary\paper3	46526	0.586	0.588	0.568	0.474
Calgary\paper4	13286	0.596	0.598	0.543	0.477
Calgary\paper5	11954	0.627	0.629	0.564	0.478
Calgary\paper6	38105	0.630	0.633	0.561	0.431
Calgary\pic	513216	0.152	0.208	0.153	0.103
Calgary\progc	39611	0.653	0.657	0.502	0.417
Calgary\progl	71646	0.598	0.601	0.362	0.277
Calgary\progp	49379	0.611	0.614	0.351	0.275
Calgary\trans	93695	0.693	0.697	0.412	0.312
Cantebury\alice29.txt	152089	0.572	0.577	0.595	0.439
Cantebury\asyoulik.txt	125179	0.602	0.606	0.652	0.471
Cantebury\cp.html	24603	0.660	0.662	0.501	0.423
Cantebury\fields.c	11150	0.637	0.638	0.389	0.322
Cantebury\grammar.lsp	3721	0.603	0.604	0.435	0.394
Cantebury\lcet10.txt	426754	0.584	0.587	0.576	0.419
Cantebury\plrabn12.txt	481861	0.567	0.572	0.655	0.481
Cantebury\ptt5	513216	0.152	0.208	0.153	0.103
Cantebury\sum	38240	0.673	0.678	0.562	0.412
Cantebury\xargs.1	4227	0.634	0.633	0.517	0.474
Summary	5032533	0.518	0.533	0.510	0.374

Note that the LZ77 and LZCL output sizes can be further reduced by selecting an optimal max distance, max run length pair, but doing so optimally takes a bit of time, which I did not do for these tables. Additional savings can be on order of 10% or so. The command line tools allow exploring this.

For these 28 files , LZCL made the following choices

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22


// choices and codec counts 
codec used: decision runs ArithmeticCodec  =>        5
codec used: decision runs HuffmanCodec     =>        3
codec used: decisions ArithmeticCodec      =>       20
codec used: distances ArithmeticCodec      =>       20
codec used: distances FixedSizeCodec       =>        8
codec used: lengths ArithmeticCodec        =>       18
codec used: lengths GolombCodec            =>        7
codec used: lengths HuffmanCodec           =>        3
codec used: literals ArithmeticCodec       =>       19
codec used: literals FixedSizeCodec        =>        5
codec used: literals HuffmanCodec          =>        4

// bits saved compared to a fixed size coded (high) and against second performing codec (low)
codec win ArithmeticCodec saved high       =>  8269290
codec win ArithmeticCodec saved low        =>   895235
codec win FixedSizeCodec saved high        => 32582974
codec win FixedSizeCodec saved low         =>   991038
codec win GolombCodec saved high           =>    90887
codec win GolombCodec saved low            =>      747
codec win HuffmanCodec saved high          => 18069716
codec win HuffmanCodec saved low           =>   865932

I also tested this on a PIC32, the desired platform, and obtained the following decompression performance:

The PIC32 sample code, running on a PIC32MX150F128B at 40MHZ, compiled with x32-gcc v1.31, -O3 optimized code, obtains the following decompression rates

Codec	Rate KB/s	Ratio
Huffman	74	67%
Arithmetic	3	66%
LZ77	327	53%
LZCL	16	42%
Incremental Huffman	74	67%
Incremental Arithmetic	3	66%
Incremental LZ77	285	53%
Incremental LZCL	16	42%

This was tested on a compressed version of the 26217 byte file Decompressor.c, with both LZ variants set to allow decoding incrementally into a 257 byte buffer.

Conclusion

The result was a success - I was able to fit all the magic into our upcoming product for my company Hypnocube, LLC! Please drop me a line is this code is useful to you.

Happy hacking! Chris Lomont

Categories: Software Electronics

Tags: Software Compression Embedded

Comments

Comment is disabled to avoid unwanted discussions from 'localhost:1313' on your Disqus account...