Abstract
Candidate gene identification is the task of associating genes with underlying biological phenomena, such as diseases and specific disorders. It has been shown that classes of diseases with similar phenotypes are caused by functionally related genes. A fair amount of knowledge about the functional characterization of genes is available across several public databases; however, functional descriptors can be ambiguous, domain specific, and context dependent. To cope with these issues, the Gene Ontology (GO) project developed a bio-ontology of broad scope and wide applicability. The structured, controlled vocabulary of terms that GO provides to describe the biological roles of gene products can therefore be very helpful in candidate gene identification. The method presented here uses GO annotation data to identify the most meaningful functional aspects occurring in a given set of related gene products. It measures this meaningfulness by calculating an e-value based on the frequency of annotation of each GO term in the set of gene products versus the total frequency of annotation. Then, after the user selects a GO term related to the biological phenomenon being studied, the method uses semantic similarity to rank the given gene products that are annotated to the term. This enables the user to further narrow down the list of gene products and identify those that are more likely to be of interest.
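As a rough illustration of the first step, the sketch below scores each term by comparing its annotation frequency in a set of gene products with its frequency in a background. The abstract does not give the exact statistic, so the choice here is an assumption: a hypergeometric upper-tail probability multiplied by the number of terms tested. The function names (`hypergeom_tail`, `term_evalues`), the gene product identifiers, and the term labels are all hypothetical, and the toy annotations are invented for the example.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n gene products are drawn from N, K of which carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def term_evalues(study, background):
    """Return {term: e-value} for every term annotated in the study set.

    study, background: dict mapping gene product id -> set of term labels.
    Assumption: e-value = hypergeometric tail probability * number of terms tested.
    """
    N = len(background)          # gene products in the background
    n = len(study)               # gene products in the study set
    terms = set().union(*study.values())
    evalues = {}
    for term in terms:
        k = sum(term in anns for anns in study.values())       # frequency in the set
        K = sum(term in anns for anns in background.values())  # total frequency
        evalues[term] = hypergeom_tail(k, n, K, N) * len(terms)
    return evalues

# Toy background of 10 hypothetical gene products (P1..P10).
# "T_broad" annotates P1..P8; "T_specific" annotates only P1..P3.
background = {f"P{i}": set() for i in range(1, 11)}
for i in range(1, 9):
    background[f"P{i}"].add("T_broad")
for i in range(1, 4):
    background[f"P{i}"].add("T_specific")

# Study set: the three gene products that share the specific term.
study = {gp: background[gp] for gp in ("P1", "P2", "P3")}

evalues = term_evalues(study, background)
for term, ev in sorted(evalues.items(), key=lambda item: item[1]):
    print(f"{term}\t{ev:.4f}")   # lowest e-value (most meaningful term) first
```

In this toy set both terms annotate all three gene products, but the term that is rare in the background receives the lower e-value, which is the behaviour the frequency comparison is meant to capture.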
Copyright information
© 2011 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Bastos, H.P., Tavares, B., Pesquita, C., Faria, D., Couto, F.M. (2011). Application of Gene Ontology to Gene Identification. In: Yu, B., Hinchcliffe, M. (eds) In Silico Tools for Gene Discovery. Methods in Molecular Biology, vol 760. Humana Press. https://doi.org/10.1007/978-1-61779-176-5_9
DOI: https://doi.org/10.1007/978-1-61779-176-5_9
Publisher Name: Humana Press
Print ISBN: 978-1-61779-175-8
Online ISBN: 978-1-61779-176-5
eBook Packages: Springer Protocols