Shaun Mahony


Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
32-G534, 32 Vassar Street, Cambridge, MA 02139
email: mahony(AT)mit.edu
phone: +1-412-818-1860


Programs

STAMP is a web server for aligning transcription factor DNA-binding motifs. Input motifs may be aligned against each other using a wide choice of comparison metrics and alignment strategies. A multiple alignment, familial binding profile, and similarity tree are also produced from the set of input motifs. STAMP also matches each of the input motifs against a choice of databases of known TF binding motifs. STAMP accepts many different motif formats, including the complete output files of a number of supported motif-finders. In this way, STAMP helps researchers interpret their motif-finding results: they can submit those results to STAMP to see whether any of their newly discovered motifs resemble known binding preferences.
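
As an illustration of what a column comparison metric and an alignment strategy can look like, the sketch below scores two motifs by summing the Pearson correlation of their aligned columns and scans all ungapped offsets. It is a minimal example and not STAMP's own code; the function names, the `min_overlap` parameter, and the representation of a motif as a list of [A, C, G, T] frequency columns are assumptions made for this example.

```python
from math import sqrt


def pearson(col_a, col_b):
    """Pearson correlation between two motif columns of A, C, G, T frequencies."""
    mean_a = sum(col_a) / 4.0
    mean_b = sum(col_b) / 4.0
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
    var_a = sum((a - mean_a) ** 2 for a in col_a)
    var_b = sum((b - mean_b) ** 2 for b in col_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a uniform column carries no preference to correlate
    return cov / sqrt(var_a * var_b)


def best_ungapped_alignment(motif_a, motif_b, min_overlap=2):
    """Slide motif_b along motif_a and return (best_score, offset).

    offset is the index in motif_a that column 0 of motif_b aligns to.
    The score is the sum of column correlations over the overlap
    (hypothetical scoring scheme chosen for this sketch).
    """
    best = (float("-inf"), 0)
    first = -(len(motif_b) - min_overlap)
    last = len(motif_a) - min_overlap
    for offset in range(first, last + 1):
        score = 0.0
        overlap = 0
        for j, col_b in enumerate(motif_b):
            i = offset + j
            if 0 <= i < len(motif_a):
                score += pearson(motif_a[i], col_b)
                overlap += 1
        if overlap >= min_overlap and score > best[0]:
            best = (score, offset)
    return best


# Toy motifs: "ACGT" and "CG" as idealized frequency matrices.
acgt = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
cg = [[0, 1, 0, 0], [0, 0, 1, 0]]
score, offset = best_ungapped_alignment(acgt, cg)
```

Because the score is an unnormalized sum, longer overlaps are favored; a real comparison tool would also need to consider reverse complements and the statistical significance of a match.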

SOMBRERO is a motif-finder that is based on the Self-Organizing Map neural network algorithm. In contrast to other probabilistic motif discovery tools, SOMBRERO poses motif-finding as a clustering problem. SOMBRERO therefore simultaneously estimates all motif signals in the input sequences (regulatory signals are separated from others during post-processing), as opposed to estimating each "significant" signal one-by-one. This clustering approach to motif-finding is undoubtedly more computationally costly than more traditional approaches. However, the great advantage of the approach is that multiple instances of prior knowledge may be used to initialize the motif search. Prior knowledge of the implanted motif has often been shown to significantly improve the accuracy of motif-finders. Of course, in typical de novo motif searches, we do not know what type of signal we are looking for. Traditional motif-finders may only incorporate one prior at a time, so the application of priors to motif-finding has been limited to those rare cases where certain motif signals are expected. SOMBRERO is the first motif-finder that can incorporate knowledge of all known motifs at the start of the motif search.
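
To make the idea of motif-finding as clustering concrete, the sketch below trains a small one-dimensional Self-Organizing Map on every k-mer in a set of sequences and then assigns each k-mer to its best-matching node. It is a deliberately simplified illustration and not SOMBRERO's implementation: it assumes one-hot encoded k-mers, Euclidean distance, and random initialization, and all names and parameter values are hypothetical. Initializing the nodes from known motifs instead of random weights is where prior knowledge would enter in such a scheme.

```python
import math
import random

BASES = "ACGT"


def one_hot(kmer):
    """Encode a k-mer as a flat vector with four entries per position."""
    vec = []
    for base in kmer:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec


def extract_kmers(sequences, k):
    """Collect every overlapping k-mer from every sequence."""
    return [seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)]


def best_node(nodes, vec):
    """Index of the node whose weights are closest (squared Euclidean) to vec."""
    return min(
        range(len(nodes)),
        key=lambda n: sum((w - x) ** 2 for w, x in zip(nodes[n], vec)),
    )


def train_som(kmers, n_nodes=4, epochs=20, seed=0):
    """Train a 1-D SOM; learning rate and neighborhood radius decay linearly."""
    rng = random.Random(seed)
    dim = 4 * len(kmers[0])
    nodes = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    data = [one_hot(kmer) for kmer in kmers]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)
        radius = max(0.5, (n_nodes / 2.0) * (1.0 - frac))
        for vec in data:
            winner = best_node(nodes, vec)
            for idx, node in enumerate(nodes):
                # Nodes near the winner on the map are pulled toward the input too.
                influence = math.exp(-((idx - winner) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    node[d] += rate * influence * (vec[d] - node[d])
    return nodes


def cluster_kmers(sequences, k=4, n_nodes=4):
    """Assign every k-mer to its best-matching SOM node."""
    kmers = extract_kmers(sequences, k)
    nodes = train_som(kmers, n_nodes=n_nodes)
    clusters = {n: [] for n in range(n_nodes)}
    for kmer in kmers:
        clusters[best_node(nodes, one_hot(kmer))].append(kmer)
    return clusters


clusters = cluster_kmers(["ACGTACGT", "TTTTACGT"], k=4, n_nodes=3)
```

Every node ends up describing some group of k-mers, which mirrors the point above that all signals are estimated simultaneously and that regulatory signals must be separated from the rest in a later step.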

RescueNet is an attempt to use the Self-Organizing Map neural network algorithm for codon usage analysis and gene prediction. In its gene prediction functionality, RescueNet can estimate multiple models of gene codon usage properties during training. This is expected to offer advantageous gene-finding performance in cases where a diverse range of codon usage patterns is displayed. Examples include metagenomic datasets and prokaryotic genomes where mutational pressure, translational efficiency and horizontal gene transfer have diversified the displayed codon usage patterns.
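
As a small illustration of the kind of input a codon usage model works from, the sketch below turns an open reading frame into a 64-dimensional vector of codon frequencies. This is an assumed, simplified feature representation chosen for the example, not necessarily the one RescueNet uses; the function name and the choice of plain relative frequencies are hypothetical.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every gene maps to a comparable vector.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_usage(orf):
    """Return the relative frequency of each codon in an in-frame sequence.

    Trailing bases that do not form a full codon are ignored, as are
    codons containing characters other than A, C, G, T.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    counts = Counter(orf[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


usage = codon_usage("ATGGCCGCCTAA")
gcc_frequency = usage[CODONS.index("GCC")]
```

Vectors like this, computed per gene, could be clustered so that genes with distinct codon usage patterns fall into separate groups, each of which can then serve as its own model.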