Login [Center] Logout Join Us Guidelines  I  中文  I  CQI

Text-Compression using the Burrows-Wheeler Transform

Speaker: Elad Verbin ITCS, Tsinghua University
Time: 2007-09-14 14:40-2007-09-14 16:40
Venue: FIT Building 4-603, Tsinghua University
Download: Click!

Abstract:

In 1994, Burrows and Wheeler introduced the Burrows-Wheeler Transform (BWT). Using the BWT they gave two new lossless text compression algorithms. A well known implementation of these algorithms is bzip2, which is installed in most unix environments. BWT-based text compressors are particularly easy to implement, and give quite good compression ratios. For example, bzip2 typically compresses text files to about 80% of what gzip does.
        In the first part of the talk I will give a detailed presentation of the Burrows-Wheeler Transform and of the compression algorithms based on it, along with an intuitive explanation of why they work well.
        In the second part of the talk I will present upper bounds and lower bounds on the compression ratio of BWT-based compressors, as a function of the (k -th order) entropy of the input stream.
        Lastly, I'll present a research direction which I find particularly interesting.
        The talk is intended for a general computer-science audience. No background on compression algorithms is assumed.
        This is based on joint work with Haim Kaplan and Shir Landau.

Short Bio:

Elad Verbin has done his Ph.D. studies in Tel Aviv University, and is now a postdoc at ITCS, Tsinghua University. He is interested in combinatorial algorithms and in randomized algorithms.