I work in computational structural biology. I develop novel algorithms for predicting and studying the structure of proteins and RNAs. Both aspects of my research are described below.

Proteins

I develop a framework for modeling and predicting the structure of transmembrane proteins and implement these algorithms in a software package named partiFold. The method aims to quickly (within a couple of minutes) compute accurate three-dimensional structure predictions of long polypeptides (several hundred residues) without using any known structure.
Using tree representations of the structures, partiFold applies classical parsing algorithms to compute the best trees that can be built upon an input sequence, allowing a complete exploration of the structure space in polynomial time and space. The trees are then converted into three-dimensional structures. The following figure illustrates the approach.

partiFold is also designed to provide a realistic picture of the folding landscape by computing macroscopic behaviors of the ensemble of folds from microscopic properties of residues. Instead of focusing the analysis on a single minimum folding energy structure, it uses statistical mechanics techniques to compute properties of the ensemble of structures found at equilibrium.
The broad range of predictions provided by partiFold includes, but is not limited to, (a) matrices of inter-residue contact probabilities, (b) sequence profiles of β-strand propensity and (c) lists of candidate three-dimensional structures generated by a sampling algorithm.

a) Stochastic Contact Map
b) Contact Entropy Profile

c) 3D prediction
(Click on image to download the pdb file)
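
The ensemble view described above can be illustrated with a minimal sketch. It assumes a tiny, hand-written ensemble of structures (each given as an energy and a set of residue contacts) and weights them with Boltzmann factors to obtain contact probabilities. The function name, the toy energies and the value of kT are assumptions made for illustration; partiFold itself computes these quantities with dynamic programming over all structures rather than by listing them.

```python
import math
from collections import defaultdict

KT = 0.6  # assumed thermal energy in kcal/mol (roughly room temperature)

def contact_probabilities(ensemble, kT=KT):
    """Return (probabilities, Z) for a list of (energy, contacts) structures.

    Each structure is weighted by exp(-E / kT); the probability of a contact
    is the total weight of the structures containing it, divided by Z.
    """
    weights = [math.exp(-energy / kT) for energy, _ in ensemble]
    z = sum(weights)  # partition function of the toy ensemble
    probs = defaultdict(float)
    for weight, (_, contacts) in zip(weights, ensemble):
        for pair in contacts:
            probs[pair] += weight / z
    return dict(probs), z

# Hypothetical three-structure ensemble over residues 0..5.
toy_ensemble = [
    (-2.0, {(0, 5), (1, 4)}),  # most stable: two contacts
    (-1.0, {(0, 5)}),          # one contact
    (0.0, set()),              # unfolded reference state
]

probs, z = contact_probabilities(toy_ensemble)
for pair in sorted(probs):
    print(pair, round(probs[pair], 3))
```

A contact shared by several low-energy structures, such as (0, 5) here, ends up with a higher probability than one found in a single structure, which is the kind of information a stochastic contact map conveys.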

The method is now being extended to other protein architectures by Charles W. O'Donnell. More information can be found at: partiFold.csail.mit.edu.

RNAs

My current work aims to elucidate the relationship between RNA sequences and structures (a.k.a. sequence-structure maps). I develop a program, RNAmutants, that generalizes previous approaches by allowing a simultaneous exploration of the complete structure and mutation landscape in polynomial time and space. The concept is described in the figure below. We map each sequence to all of its possible secondary structures. The input sequence is at the center, and concentric rings represent the neighborhoods of sequences with one, then two mutations.
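
To give a feel for the size of these concentric rings, here is a minimal sketch that enumerates every sequence at a given number of mutations from an input. It is a brute-force illustration only: the function names and the example sequence are assumptions, and RNAmutants does not proceed this way, since explicit enumeration grows as C(n, k) · 3^k while the program works in polynomial time and space.

```python
from itertools import combinations, product
from math import comb

ALPHABET = "ACGU"

def k_mutants(seq, k):
    """Yield every sequence at Hamming distance exactly k from seq."""
    for positions in combinations(range(len(seq)), k):
        # At each chosen position, try the three other nucleotides.
        choices = [[b for b in ALPHABET if b != seq[i]] for i in positions]
        for substitutions in product(*choices):
            mutant = list(seq)
            for i, base in zip(positions, substitutions):
                mutant[i] = base
            yield "".join(mutant)

def ring_size(n, k):
    """Number of sequences of length n with exactly k mutations."""
    return comb(n, k) * 3 ** k

seed = "GGGAAACCC"  # hypothetical input sequence
for k in range(3):
    ring = list(k_mutants(seed, k))
    print(k, len(ring), ring_size(len(seed), k))
```

Even for this nine-nucleotide example the two-mutation ring is already an order of magnitude larger than the one-mutation ring, which is why pairing every mutant with all of its secondary structures calls for a more economical algorithm than enumeration.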



For instance, RNAmutants allows us to investigate the stability of evolutionarily conserved secondary structure elements. Here we show the distribution of thermodynamically favorable mutations on the 3' UTR of GB virus C, as computed by my program. The x-axis represents the nucleotide index, and the y-axis the mutation probability. Shaded regions are evolutionarily conserved stem-loop (SL) regions.


In addition, RNAmutants can be used to analyze the mutational robustness of a structure upon a sequence. In the Hepatitis C virus cis-acting replication element below, blue labels stand for positions that can be mutated without disrupting the structure, while red ones are very sensitive. Green, yellow and orange are intermediate cases. The stability of each base pair is indicated by the intensity of the bond.
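
As a rough illustration of what a per-position robustness profile looks like, the sketch below scores each position by the fraction of its three possible point mutations that still allow the target base pair to form (A-U, G-C or G-U). This is a deliberately crude compatibility check, not the thermodynamic analysis performed by RNAmutants; the function names, the dot-bracket input format and the example hairpin are assumptions chosen for illustration.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}

def pair_table(structure):
    """Map each paired position to its partner in a dot-bracket string."""
    stack, partner = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner

def tolerated_fraction(seq, structure):
    """Fraction of point mutations per position that keep its pair canonical.

    Unpaired positions score 1.0 because no pair can be broken there
    (a simplification: real loop energies also depend on sequence).
    """
    partner = pair_table(structure)
    profile = []
    for i, base in enumerate(seq):
        if i not in partner:
            profile.append(1.0)
            continue
        other = seq[partner[i]]
        alternatives = [b for b in "ACGU" if b != base]
        kept = sum((b, other) in CANONICAL for b in alternatives)
        profile.append(kept / 3)
    return profile

# Hypothetical hairpin: three G-C pairs closing a three-nucleotide loop.
print(tolerated_fraction("GGGAAACCC", "(((...)))"))
```

In this toy hairpin the G side of each pair tolerates no substitution, while the C side tolerates one (C to U, giving a G-U wobble pair), so a color scale over such a profile would separate sensitive from tolerant positions in the spirit of the figure.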


More information can be found at: RNAmutants.csail.mit.edu. The slides of my talk on RNAmutants at the 13th Annual Meeting of the RNA Society are available HERE.