Giovanni Manzini, University of Piemonte Orientale, Italy

Design and Engineering of String Algorithms
In these lectures we will review some recent results that have shed new light on the field of string algorithms. These results have shown that some classical string problems can be solved efficiently working on a compressed representation of the input string.

In the first lecture we will describe the Burrows-Wheeler Transform showing that it naturally leads to the design of simple data compression algorithms whose compression ratio is close to the information theoretical minimum. Then, we show how this transformation can be used to design compressed self -indices that take space close to that of the compressed text, replace it, and provide fast search over it.

In the second lecture we will discuss the algorithmic problem of efficiently building self-indices for large collections. The challenge here is to devise algorithms which are fast in practice, have guaranteed worst case performance, and use as little working memory as possible.

In the third lecture we will consider the problem of working with strings that have a non-linear structure such as two-dimensional tables and XML-documents. We show that recently developed algorithms can take advantage of these additional structures to further improve the performance of compression and searching algorithms.

References
G. Navarro, V. Makinen Compressed full-text indexes, ACM Computing Surveys Volume 39 , Issue 1 (2007) http://doi.acm.org/10.1145/1216370.1216372

S. Puglisi, W. Smyth, A. Turpin, A taxonomy of suffix array construction algorithms, ACM Computing Surveys Volume 39 , Issue 2 (2007) http://doi.acm.org/10.1145/1242471.1242472

A. Buchsbaum, G. Fowler, R. Giancarlo Improving table compression with combinatorial optimization, J. of ACM Vol. 50 (2003) http://doi.acm.org/10.1145/950620.950622

P. Ferragina, R. Giancarlo, G. Manzini and M. Sciortino, Boosting Textual Compression in Optimal Linear Time, J. of ACM, Vol. 52 (2005), pp. 688-713.

P. Ferragina, R. Giancarlo and G. Manzini, The Myriad Virtues of Wavelet Trees, Proc. ICALP 06 , Lecture Notes in Computer Science, Vol. 4051 (2007), 560-571.

P. Ferragina, F. Luccio, G. Manzini, S. Muthukrishnan. Structuring labeled trees for optimal succinctness, and beyond. Proc. 46th IEEE Symposium on Foundations of Computer Science (FOCS '05). http://dx.doi.org/10.1109/SFCS.2005.69

P. Ferragina, G. Manzini, S. Muthukrishnan (Eds) Theoretical Computer Science Special Issue devoted to the Burrows-Wheeler Transform and its Applications In press.