References

1
P. Agarwal and V. Bafna.
The ribosome scanning model for translation initiation: implications for gene prediction and full-length cDNA detection.
In Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology, pages 2-7, 1998.

2
A. Bairoch.
Prosite: a dictionary of sites and patterns in proteins.
Nucleic Acids Research, 20:2013-2018, 1992.

3
O. Berg and P. von Hippel.
Selection of DNA binding sites by regulatory proteins.
J. Mol. Biol., 193:723-750, 1987.

4
A. P. Bird.
CpG islands as gene markers in the vertebrate nucleus.
Trends Genet, 3:342-347, 1987.

5
E. Birney and R. Durbin.
Dynamite: a flexible code generating language for dynamic programming methods used in sequence comparison.
In ISMB-97, pages 56-64, 1997.

6
E. Birney, J. Thompson, and T. Gibson.
PairWise and SearchWise: finding the optimal alignment in a simultaneous comparison of a protein profile against all DNA translation frames.
NAR, 24:2730-2739, 1996.

7
M. Borodovsky and J. McIninch.
Genmark: Parallel gene recognition for both DNA strands.
Computers and Chemistry, 17(2):123-133, 1993.

8
S. Brunak, J. Engelbrecht, and S. Knudsen.
Prediction of human mRNA donor and acceptor sites from the DNA sequence.
JMB, 220:49-65, 1991.

9
C. Burge and S. Karlin.
Predictions of complete gene structures in human genomic DNA.
JMB, 268:78-94, 1997.

10
M. Burset and R. Guigo.
Evaluation of gene structure prediction programs.
Genomics, 34(3):353-367, 1996.
Data set and evaluation results can be found at http://www.imim.es/GeneIdentification/Evaluation/Index.html.

11
J.-M. Claverie.
Sequence "signals": Artifact or reality?
Computers and Chemistry, 16(2):89-91, 1992.

12
J.-M. Claverie.
Some useful statistical properties of position-weight matrices.
Computers and Chemistry, 18(3):287-294, 1994.

13
J.-M. Claverie.
Computational methods for the identification of genes in vertebrate genomic sequences.
Human Molecular Genetics, 6(10):1735-1744, 1997.

14
S. Cole et al.
Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence.
Nature, 393(6685):537-544, 1998.

15
M. Craven and J. Shavlik.
Learning to predict reading frames in E. coli DNA sequences.
In Proceedings of the Hawaii International Conference on System Sciences, pages 773-782, Los Alamitos, CA, 1993. IEEE Computer Society Press.

16
S. Dong and D. B. Searls.
Gene structure prediction by linguistic methods.
Genomics, 162:705-708, 1994.

17
R. Durbin, S. Eddy, A. Krogh, and G. Mitchison.
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids.
Cambridge University Press, 1998.

18
L. Duret, D. Mouchiroud, and C. Gautier.
Statistical analysis of vertebrate sequences reveals that long genes are scarce in GC-rich isochores.
Journal of Molecular Evolution, 40:308-317, 1995.

19
R. Farber, A. Lapedes, and K. Sirotkin.
Determination of eukaryotic protein coding regions using neural networks and information theory.
JMB, 226:471-479, 1992.

20
J. Fickett.
Finding genes by computer - the state of the art.
Trends in Genetics, 12(8):316-320, 1996.

21
J. Fickett.
The gene identification problem -- an overview for developers.
Computers and Chemistry, 20(1):103-118, 1996.

22
J. Fickett and A. G. Hatzigeorgiou.
Eukaryotic promoter recognition.
Genome Research, 7(9):861-878, 1997.

23
J. W. Fickett and C.-S. Tung.
Assessment of protein coding measures.
Nucl. Acids Res., 20:6441-6450, 1992.

24
C. Fields and C. Soderlund.
A practical tool for automating DNA sequence analysis.
Comp. Appl. Biosci., 6:263-270, 1990.

25
M. Gardiner-Garden and M. Frommer.
CpG islands in vertebrate genomes.
JMB, 196:261-282, 1987.

26
M. S. Gelfand.
Computer prediction of exon-intron structure of mammalian pre-mRNAs.
NAR, 18:5865-5869, 1990.

27
M. S. Gelfand.
Prediction of function in DNA sequence analysis.
Jour. Comp. Biol., 2(1):87-115, 1995.

28
M. S. Gelfand, A. A. Mironov, and P. A. Pevzner.
Gene recognition via spliced sequence alignment.
PNAS, 93(17):9061-9066, 1996.

29
M. S. Gelfand, L. I. Podolsky, T. V. Astakhova, and M. A. Roytberg.
Recognition of genes in human DNA sequences.
Jour. Comp. Biol., 3(2):223-234, 1996.

30
M. S. Gelfand and M. A. Roytberg.
Prediction of the exon-intron structure by a dynamic programming approach.
BioSystems, 30:173-182, 1993.

31
R. Guigo.
Computational gene identification: an open problem.
Computers and Chemistry, 21(4):215-222, 1997.

32
A. Hatzigeorgiou and M. Reczko.
Recognition of coding regions and reading frames in DNA.
In Gene-Finding and Gene Structure Prediction Workshop, 1995.

33
J. Henderson, S. Salzberg, and K. Fasman.
Finding genes in human DNA with a hidden Markov model.
Journal of Computational Biology, 4(2):119-126, 1997.

34
X. Huang, M. Adams, H. Zhou, and A. Kerlavage.
A tool for analyzing and annotating genomic sequences.
Genomics, 46:37-45, 1997.

35
J. Jurka, P. Klonowski, V. Dagman, and P. Pelton.
Censor - a program for identification and elimination of repetitive elements from DNA sequences.
Computers and Chemistry, 20(1):119-122, 1996.

36
J. Jurka, J. Walichiewicz, and A. J. Milosavljevic.
Prototypic sequences for human repetitive DNA.
J. Mol. Evol., 35:286-291, 1992.

37
T. Klinger and D. Brutlag.
Detection of correlations in tRNA sequences with structural implications.
In L. Hunter, D. Searls, and J. Shavlik, editors, ISMB-93, Menlo Park, 1993. AAAI Press.

38
N. Kolchanov et al.
GeneExpress: a computer system for description, analysis, and recognition of regulatory sequences in eukaryotic genome.
In Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology, pages 95-104, 1998.

39
M. Kozak.
Interpreting cDNA sequences: some insights from studies on translation.
Mammalian Genome, 7:563-574, 1996.

40
A. Krogh.
Two methods for improving performance of a HMM and their application for gene finding.
In Proceedings, 5th International Conference on Intelligent Systems for Molecular Biology, pages 179-186, 1997.

41
A. Krogh.
Gene finding: putting the parts together.
In M. J. Bishop, editor, Guide to Human Genome Computing, chapter 11, pages 261-274. Academic Press, 2nd edition, 1998.

42
A. Krogh, I. S. Mian, and D. Haussler.
A Hidden Markov Model that finds genes in E. coli DNA.
NAR, 22:4768-4778, 1994.

43
D. Kulp and D. Haussler.
Embedding HMMs: A Method for Recognizing Protein Homologs in DNA, 1997.
http://www.ornl.gov/hgmis/publicat/97santa/infortoc.html.

44
D. Kulp, D. Haussler, M. Reese, and F. Eeckman.
A generalized hidden Markov model for the recognition of human genes in DNA.
In ISMB-96, pages 134-142, St. Louis, June 1996. AAAI Press.
http://www.cse.ucsc.edu/~dkulp/cgi-bin/genie.

45
D. Kulp, D. Haussler, M. G. Reese, and F. H. Eeckman.
Integrating database homology in a probabilistic gene structure model.
In R. B. Altman, A. K. Dunker, L. Hunter, and T. E. Klein, editors, Proceedings of the Pacific Symposium on Biocomputing, pages 232-244. World Scientific, New York, 1997.

46
A. Lapedes, C. Barnes, C. Burks, R. Farber, and K. Sirotkin.
Application of neural networks and other machine learning algorithms to DNA sequence analysis.
In G. Bell and T. Marr, editors, Computers and DNA, SFI Studies in the Sciences of Complexity, volume VII, pages 157-182. Addison-Wesley, 1989.

47
F. Larsen, R. Gundersen, R. Lopez, and H. Prydz.
CpG islands as gene markers in the human genome.
Genomics, 13:1095-1107, 1992.

48
A. V. Lukashin and M. Borodovsky.
GeneMark.hmm: new solutions for gene finding.
Nucleic Acids Research, 26(4):1107-1115, 1998.

49
S. Matis, Y. Xu, M. B. Shah, D. Buley, X. Guan, J. R. Einstein, R. J. Mural, and E. C. Uberbacher.
Detection of RNA Polymerase II Promoters and Polyadenylation Sites in Human DNA Sequence.
Computers and Chemistry, 20:135-140, 1996.

50
L. Milanesi and I. Rogozin.
Prediction of human gene structure.
In M. J. Bishop, editor, Guide to Human Genome Computing. Academic Press, 2nd edition, 1998.

51
A. Milosavljevic and J. Jurka.
Discovering simple DNA sequences by the algorithmic similarity method.
CABIOS, 9(4):407-411, 1993.

52
R. Nagel, A. Lancaster, and A. Zahler.
Specific binding of an exonic splicing enhancer by the pre-mRNA splicing factor SRp55.
RNA, 4:11-23, 1998.

53
C. M. O'Neill.
Training back-propagation neural networks to define and detect DNA-binding sites.
Nucl. Acids Res., 19:313-318, 1991.

54
C. M. O'Neill.
Escherichia coli promoters: neural networks develop distinct descriptions in learning to search for promoters of different spacing classes.
Nucl. Acids Res., 20:3471-3477, 1992.

55
R. Guigo and J. Fickett.
Distinctive sequence features in protein coding, genic non-coding, and intergenic human DNA.
JMB, 253(1):51-60, October 13, 1995.

56
M. Reese and F. Eeckman.
Novel neural network prediction systems for human promoters and splice sites.
In Gene-Finding and Gene Structure Prediction Workshop, 1995.

57
M. G. Reese, F. H. Eeckman, D. Kulp, and D. Haussler.
Improved splice site detection in Genie.
Jour. Comp. Biol., 4:311-323, 1997.

58
S. L. Salzberg.
Locating protein coding regions in human DNA using a decision tree algorithm.
Jour. Comp. Biol., 2:473-485, 1995.

59
S. L. Salzberg, A. L. Delcher, S. Kasif, and O. White.
Microbial gene identification using interpolated Markov models.
Nucleic Acids Research, 26(2):544-548, 1998.

60
D. B. Searls.
The computational linguistics of biological sequences.
In L. Hunter, editor, Artificial Intelligence and Molecular Biology, chapter 2, pages 47-120. AAAI Press, 1993.

61
T. F. Smith and M. S. Waterman.
Comparison of bio-sequences.
Adv. Appl. Math, 2:482-489, 1981.

62
E. Snyder and G. Stormo.
Identification of protein coding regions in genomic DNA.
JMB, 248:1-18, 1995.

63
V. Solovyev, A. Salamov, and C. Lawrence.
Predicting internal exons by oligonucleotide composition and discriminant analysis of splicable open reading frames.
Nucl. Acids Res., 22:5156-5163, 1994.

64
R. Staden.
Computer methods to locate signals in nucleic acid sequences.
NAR, 12:505-519, 1984.

65
R. Staden.
Finding protein coding regions in genomic sequences.
Methods in Enzymology, 183:163-180, 1990.

66
G. Stormo.
Consensus patterns in DNA.
Methods in Enzymology, 183:211-220, 1990.

67
G. D. Stormo.
Computer methods for analyzing sequence recognition of nucleic acids.
Annu. Rev. Biophys. Biophys. Chem., 17:241-263, 1988.

68
G. D. Stormo and D. Haussler.
Optimally parsing a sequence into different classes based on multiple types of information.
In ISMB-94, Menlo Park, CA, Aug. 1994. AAAI/MIT Press.

69
J. Sulston et al.
The C. elegans genome sequencing project: A beginning.
Nature, 356:37-41, 1992.

70
A. Thomas and M. Skolnick.
A probabilistic model for detecting coding regions in DNA sequences.
IMA Journal of Mathematics Applied in Medicine and Biology, 11:149-160, 1994.

71
E. C. Uberbacher and R. J. Mural.
Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach.
PNAS, 88:11261-11265, 1991.

72
T. Wu.
A segment-based dynamic programming algorithm for predicting genes.
Jour. Comp. Biol., 3:375-394, 1996.

73
Y. Xu, J. R. Einstein, M. Shah, and E. C. Uberbacher.
An improved system for exon recognition and gene modeling in human DNA sequences.
In ISMB-94, pages 376-383, Menlo Park, CA, 1994. AAAI/MIT Press.

74
Y. Xu, R. Mural, and E. Uberbacher.
Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags.
In Proceedings, 5th International Conference on Intelligent Systems for Molecular Biology, pages 344-353, 1997.

75
Y. Xu and E. C. Uberbacher.
Automated gene identification in large-scale genomic sequences.
Journal of Computational Biology, 4(3):325-338, 1997.

76
M. Zhang and T. Marr.
A weighted array method for splicing and signal analysis.
CABIOS, 9:499-509, 1993.

77
M. Q. Zhang.
Identification of protein coding regions in the human genome based on quadratic discriminant analysis.
PNAS, 94:559-564, 1997.

78
M. Q. Zhang.
Statistical features of human exons and their flanking regions.
Human Molecular Genetics, 7(5):919-932, 1998.


David Haussler
10/14/1998