In 1994, Burrows and Wheeler introduced the Burrows-Wheeler Transform (BWT). Using the BWT they gave two new lossless text compression algorithms. A well known implementation of these algorithms is bzip2, which is installed in most unix environments. BWT-based text compressors are particularly easy to implement, and give quite good compression ratios. For example, bzip2 typically compresses text files to about 80% of what gzip does.
In the first part of the talk I will give a detailed presentation of the Burrows-Wheeler Transform and of the compression algorithms based on it, along with an intuitive explanation of why they work well.
In the second part of the talk I will present upper bounds and lower bounds on the compression ratio of BWT-based compressors, as a function of the (k -th order) entropy of the input stream.
Lastly, I'll present a research direction which I find particularly interesting.
The talk is intended for a general computer-science audience. No background on compression algorithms is assumed.
This is based on joint work with Haim Kaplan and Shir Landau.
Elad Verbin has done his Ph.D. studies in Tel Aviv University, and is now a postdoc at ITCS, Tsinghua University. He is interested in combinatorial algorithms and in randomized algorithms.