Patent
1998-01-14
2000-10-03
Teska, Kevin J.
Data processing: structural design, modeling, simulation, and em
Modeling by mathematical expression
703/11, 702/20, G06F 17/10
active
061285870
ABSTRACT:
A system and method agglomeratively estimate a phylogenetic tree from multiple sequence alignment (MSA) input data. The data model represented by each tree node is built by first estimating the number of independent observations in the data. A distance measure, preferably relative entropy, computed between nodes of different subtrees determines which nodes in the model to merge at each agglomeration step. The phylogenetic tree is cut at the points in the agglomeration where the encoding cost is minimized, preferably using Dirichlet mixture densities to assign probabilities to the amino acids observed within each subfamily at each position. From the subtree data, a statistical model for each subfamily, e.g., a profile or hidden Markov model, may be constructed in a position-dependent manner, which permits remote homologs to be identified in a database search. The invention further provides an alignment analysis that identifies key functional or structural residues. Finally, the invention may be carried out in automated fashion on a computer system in which a processor unit executes a stored routine embodying the preferred methodology.
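The agglomeration step described in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: a uniform pseudocount stands in for the Dirichlet mixture densities, the estimate of independent observations and the encoding-cost cut are omitted, and all function names (`column_distributions`, `profile_distance`, `agglomerate`) are hypothetical. It shows only the general idea of merging, at each step, the two subtrees whose position-specific amino acid distributions are closest in symmetrized relative entropy.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def column_distributions(seqs, pseudocount=1.0):
    """Per-column amino acid distributions for a set of aligned sequences.

    Assumption: a uniform pseudocount replaces the Dirichlet mixture prior
    named in the abstract. Gap characters are ignored.
    """
    profile = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs if s[i] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        profile.append({a: (counts[a] + pseudocount) / total for a in ALPHABET})
    return profile


def relative_entropy(p, q):
    """Relative entropy D(p || q) in nats between two distributions."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in ALPHABET)


def profile_distance(prof_a, prof_b):
    """Symmetrized relative entropy summed over alignment columns."""
    return sum(
        relative_entropy(p, q) + relative_entropy(q, p)
        for p, q in zip(prof_a, prof_b)
    )


def agglomerate(seqs):
    """Greedy bottom-up merging of aligned sequences.

    Returns the merges in order as (members_a, members_b, distance),
    where members are tuples of sequence indices.
    """
    clusters = [(i,) for i in range(len(seqs))]
    merges = []
    while len(clusters) > 1:
        profiles = {
            c: column_distributions([seqs[i] for i in c]) for c in clusters
        }
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = profile_distance(profiles[clusters[x]], profiles[clusters[y]])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        a, b = clusters[x], clusters[y]
        merges.append((a, b, d))
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(tuple(sorted(a + b)))
    return merges


if __name__ == "__main__":
    toy_alignment = ["ACDE", "ACDE", "WWYY", "WWYF"]
    for members_a, members_b, dist in agglomerate(toy_alignment):
        print(members_a, members_b, round(dist, 3))
```

In the toy alignment the two identical sequences merge first, then the two near-identical ones. A fuller implementation would also score each candidate cut of the resulting tree by its encoding cost to choose the subfamilies.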
REFERENCES:
patent: 5325445 (1994-06-01), Herbert
patent: 5655058 (1997-08-01), Balasubramanian et al.
patent: 5701256 (1997-12-01), Marr et al.
patent: 5832182 (1998-11-01), Zhang et al.
patent: 5864810 (1999-01-01), Digalakis et al.
patent: 5867402 (1999-02-01), Schneider et al.
patent: 5912989 (1999-06-01), Watanabe
Wu et al., "Gene Family Identification Network Design", Proceedings of the IEEE International Joint Symposia on Intelligence and Systems, pp. 103-110, May 1995.
Sankoff et al., "Probability Models for Genome Rearrangement and Linear Invariants for Phylogenetic Inference", Proceedings of the Third Annual International Conference on Computational Molecular Biology, pp. 302-309, Jan. 1999.
Craven et al., "Machine Learning Approaches to Gene Recognition", IEEE Expert, vol. 9, Issue 2, pp. 2-10, Apr. 1994.
Hirschberg et al., "Kestrel: A Programmable Array for Sequence Analysis", Proceedings of the International Conference on Applications Specific Systems, Architectures and Processors, pp. 25-34, Aug. 1996.
Yokomori et al., "Learning Local Languages and their Application to DNA Sequence Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, Issue 10, pp. 1067-1079, Oct. 1998.
Krogh et al., "Hidden Markov Models in Computational Biology: Application to Protein Modeling", J. Mol. Biol. 235, pp. 1501-1531, Feb. 1994.
Kawahara et al., "HMM Based on Pair-wise Bayes Classifiers", IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 1, pp. 365-368, Mar. 1992.
Henikoff, S., "Comparative Methods for Identifying Functional Domains in Protein Sequences", Biotechnology Annual Review, pp. 129-147, 1995.
Eddy et al., "Maximum Discrimination Hidden Markov Models of Sequence Consensus", Journal of Computational Biology, vol. 2, No. 1, pp. 9-23, Spring 1995.
Parker et al., "HomologyPlot: Searching for Homology to a Family of Proteins Using a Database of Unique Conserved Patterns", Jour. of Comp. Aided Molecular Design, vol. 8, No. 2, pp. 193-210, Apr. 1994.
Asai et al., "Prediction of Protein Secondary Structure by the Hidden Markov Model", Comp. Applications in the Biosciences, vol. 9, No. 2, pp. 141-146, Apr. 1993.
Hughey et al., "Hidden Markov Models for Sequence Analysis: Extension and Analysis of the Basic Method", Comp. Applications in the Biosciences, vol. 12, No. 2, pp. 95-107, Apr. 1996.
Sergent Douglas W.
Teska Kevin J.
The Regents of the University of California