Strings, Trees, and RNA Folding
By their nature, biological sequences are often abstracted to combinatorial
structures: strings over finite alphabets and their representation as
trees and other graphs. The interaction between discrete mathematics and
molecular biology has been especially fruitful in the case of RNA secondary
structures, since the formation of Watson-Crick base pairs is a strong,
discrete thermodynamic interaction. Yet, understanding the folding of
an RNA sequence into a set of nested base pairs remains a fundamental
open problem in RNA molecular biology. This tutorial will address some
of the many combinatorial results both motivated by and with applications to
questions about the base pairing of RNA sequences. In one direction,
combinatorics has been successfully applied to fundamental problems
in RNA secondary structures, for instance in work such as:
Y. Bakhtin and C.E. Heitsch.
Large deviations for random trees and the branching of {RNA}
secondary structures.
To appear in Bulletin of Mathematical Biology.
H.H. Gan, S. Pasquali, and T. Schlick.
Exploring the repertoire of {RNA} secondary motifs using graph
theory; implications for {RNA} design.
Nucleic Acids Res, 31(11):2926--43, June 2003.
S.Y. Le, R. Nussinov, and J.V. Maizel.
Tree graphs of {RNA} secondary structures and their comparisons.
Comput Biomed Res, 22(5):461--473, October 1989.
In the other direction, questions about RNA secondary structures have
motivated new combinatorial theorems, for example in papers such as:
C.E. Heitsch.
Counting orbits under {K}reweras complementation.
Submitted.
I.L. Hofacker, P. Schuster, and P.F. Stadler.
Combinatorics of {RNA} secondary structures.
Discrete Appl. Math., 88(1-3):207--237, 1998.
W.R. Schmitt and M.S. Waterman.
Linear trees and {RNA} secondary structure.
Discrete Appl. Math., 51(3):317--323, 1994.