Our team at the Office of Public Health Genomics, CDC, has developed the HuGE Navigator (http://www.hugenavigator.net), an integrated set of applications that makes the subset of PubMed abstracts related to human genome epidemiology more accessible and useful to interdisciplinary researchers. PubMed abstracts are the core data source; since 2001 the subset has been updated weekly through a combination of automated (machine learning) screening and expert curation. We have developed data and text mining algorithms to build a knowledge base for exploring genetic associations, selecting candidate genes and mapping investigator networks. NCBI E-utilities and UMLS are used to automate information extraction and retrieval in the system. Genetic information can be displayed on demand from major gene-centered databases (for example, Entrez Gene, Swiss-Prot, OMIM and GeneCards), as well as from databases of genetic variation and prevalence (for example, dbSNP and HapMap Project), pathways (for example, CGAP, KEGG and BioCarta), and other resources (for example, Gene Ontology and Gene Clinics). The HuGE Navigator is built on the principles of open source, standardization, interoperability and extensibility, so that new applications can be easily incorporated.
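
To illustrate the kind of automated retrieval that NCBI E-utilities makes possible, here is a minimal sketch of a weekly PubMed query built against the public ESearch endpoint. It is not the HuGE Navigator's actual code: the search term, the function names and the sample response (with placeholder PMIDs) are assumptions made for illustration, and no network request is sent.

```python
import json
from urllib.parse import urlencode

# Public NCBI ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_weekly_query_url(term, days=7, retmax=100):
    """Build an ESearch URL for PubMed records added in the last `days` days.

    Hypothetical helper; the real system's query is not described in the text.
    """
    params = {
        "db": "pubmed",       # search the PubMed database
        "term": term,         # query string (an assumed example, see below)
        "reldate": days,      # restrict to the last N days
        "datetype": "edat",   # use the Entrez date for that restriction
        "retmax": retmax,     # maximum number of PMIDs to return
        "retmode": "json",    # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


def extract_pmids(response_text):
    """Pull the list of PMIDs out of an ESearch JSON response."""
    payload = json.loads(response_text)
    return payload["esearchresult"]["idlist"]


# Assumed example term; a curated system would use a far richer query.
url = build_weekly_query_url('"genetic association"[Title/Abstract]')

# Hand-written stand-in for a server response; the PMIDs are placeholders.
sample_response = json.dumps(
    {"esearchresult": {"count": "2", "idlist": ["111", "222"]}}
)
pmids = extract_pmids(sample_response)
```

In a setup like this, the returned PMIDs would then be passed to a classifier and to curators for review, in line with the combined automated and expert curation described above.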