##
Combinatorial Problems on Strings with Applications to Protein Folding

**
Alantha Newman and Matthias Ruhl
**
*
Latin American Theoretical INformatics (LATIN '04)*

Buenos Aires, April 2004

[Postscript, 357KB]

[PDF, 195KB]

### Abstract

We consider the problem of protein folding in the HP model on the 3D
square lattice. This problem is combinatorially equivalent to
folding a string of 0's and 1's so that the string forms a
self-avoiding walk on the lattice and the number of adjacent pairs
of 1's is maximized. The previously best-known approximation
algorithm for this problem has a guarantee of 3/8 = .375. In this
paper, we first present a new 3/8-approximation algorithm for the 3D
folding problem that improves on the absolute approximation
guarantee of the previous algorithm. We then show a connection
between the 3D folding problem and a basic combinatorial problem on
binary strings, which may be of independent interest. Given a binary
string in {*a*,*b*}^{*}, we want to find a long
subsequence of the string in which every sequence of consecutive
*a*'s is followed by at least as many consecutive
*b*'s. We show a non-trivial lower-bound on the existence of
such subsequences. Using this result, we obtain an algorithm with a
slightly improved approximation ratio of at least .37501 for the 3D
folding problem. All of our algorithms run in linear time.

Back to publications