"Stochastic Context-Free Grammars for tRNA Modeling"
Y. Sakakibara, M. Brown, R. Hughey, S. Mian, K. Sjölander,
R. Underwood, and D. Haussler.
Abstract:
Stochastic context-free grammars (SCFGs) are applied to the problems
of folding, aligning and modeling families of tRNA sequences. SCFGs
capture the sequences' common primary and secondary structure and
generalize the hidden Markov models (HMMs) used in related work on
protein and DNA. Results show that, after being trained on as
few as 20 tRNA sequences from only two tRNA subfamilies (mitochondrial
and cytoplasmic), the model can distinguish tRNA in general from
similar-length RNA sequences of other kinds, can find the secondary
structure of new tRNA sequences, and can produce multiple alignments of
large sets of tRNA
sequences. Our results suggest potential improvements in the alignments
of the D- and T-domains in some mitochondrial tRNAs that cannot be fit
into the canonical secondary structure.