Data Sources

Monarch integrates a variety of externally curated data sources, primarily focused on genotype-phenotype and disease-phenotype associations. The following is a list of the sources currently integrated and available for browsing on our website. We are continually adding new sources. If there is a data source you would like to see incorporated into our site, please contact us.

Each entry below gives the source name, a description, how we use it, its data categories, the ontologies/vocabularies involved, and the date updated.
Mouse Genome Informatics
MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human health and disease.
We list genotype-phenotype associations and asserted disease models. We use MGI identifiers as the main hook into mouse data. Additionally, we use the Mammalian Phenotype Ontology (MP), which is developed by MGI, in our cross-species ontology to link all mouse phenotype annotations from various sources.
Data categories: genotype-phenotype association, disease-model association
Ontologies/Vocabularies: ECO, MA, MP, SO
Date Updated: March 1, 2017
Zebrafish Information Network
The Zebrafish Information Network (ZFIN) is the community database resource for the laboratory use of zebrafish. It develops and supports integrated zebrafish genetic, genomic, and developmental information and maintains the definitive reference data sets of zebrafish research information, to facilitate the use of zebrafish as a model for human biology.
We integrate the curated genotype-phenotype data, including experimentally derived fish (such as via application of morpholinos), and links to the literature as evidence.
Data categories: genotype-phenotype association, experimental reagents (morpholinos)
Ontologies/Vocabularies: ECO, PATO, SO, ZFA, ZFS, ZP
Date Updated: March 1, 2017
WormBase database of nematode biology
WormBase is an international consortium dedicated to providing the research community with accurate, current, accessible information concerning the genetics, genomics and biology of C. elegans and related nematodes.
WormBase curates variant (allele)-phenotype associations. The variants are both genetic (intrinsic) and induced through application of reagents such as RNAi (extrinsic). We list the variant-phenotype associations. Some data are pulled from WormBase directly; other data are routed via WormMine.
Data categories: allele-phenotype association
Ontologies/Vocabularies: WBbt, WBls, WBPhenotype
Date Updated: March 1, 2017
FlyBase
FlyBase is the model organism database providing integrated genetic, genomic, phenomic, and biological data for Drosophila melanogaster.
We integrate the genotype-phenotype associations.
Data categories: genotype-phenotype association
Ontologies/Vocabularies: FBbt, FBcv, FBdv
Date Updated: March 1, 2017
International Mouse Phenotyping Consortium
The International Mouse Phenotyping Consortium (IMPC) is generating a knockout mouse strain for every protein coding gene by using the embryonic stem cell resource generated by the International Knockout Mouse Consortium (IKMC). Systematic broad-based phenotyping is performed by each IMPC center using standardized procedures found within the International Mouse Phenotyping Resource of Standardised Screens (IMPReSS) resource. Gene-to-phenotype associations are made by a versioned statistical analysis.
We use the allele-phenotype associations recorded by the consortium. In addition, we map their allele+zygosity+background to MGI genotype identifiers. Where they do not map to MGI genotype identifiers, we create temporary identifiers for navigation purposes.
Data categories: genotype-phenotype association
Ontologies/Vocabularies: MP
Date Updated: March 1, 2017
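The mapping described for IMPC can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: the function name `resolve_genotype_id`, the lookup table, the `TMP:` prefix, and the hashing scheme are hypothetical and are not taken from Monarch's actual pipeline. The sketch only shows the general idea of falling back to a stable temporary identifier when no MGI genotype identifier is known.

```python
import hashlib


def resolve_genotype_id(allele, zygosity, background, mgi_lookup):
    """Return a known MGI genotype ID, or a deterministic temporary ID.

    mgi_lookup maps (allele, zygosity, background) tuples to MGI genotype
    identifiers. The "TMP:" prefix and SHA-1 scheme are illustrative only.
    """
    key = (allele, zygosity, background)
    if key in mgi_lookup:
        return mgi_lookup[key]
    # No MGI genotype is known: derive a stable placeholder from the triple
    # so that the same combination always yields the same temporary ID.
    digest = hashlib.sha1("|".join(key).encode("utf-8")).hexdigest()[:12]
    return f"TMP:{digest}"


# Made-up example data; "MGI:0000000" is a placeholder, not a real record.
lookup = {("Abc1<tm1>", "homozygote", "C57BL/6N"): "MGI:0000000"}
known = resolve_genotype_id("Abc1<tm1>", "homozygote", "C57BL/6N", lookup)
temporary = resolve_genotype_id("Xyz2<tm1>", "heterozygote", "C57BL/6N", lookup)
```

Deriving the placeholder from the triple itself, rather than from a counter, keeps the temporary identifier stable across repeated loads, which matters if it is used for navigation.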
Mouse Phenome Database
The Mouse Phenome Database is a collaborative standardized collection of measured data on laboratory mouse strains, and includes: baseline phenotype data sets, studies of drug, diet, disease and aging effect, protocols, projects, and publications, and SNP, variation and gene expression studies. MPD collects data for classical inbred strains, other fixed-genotype strains, derived lines and populations that are openly acquirable (strain panel examples). Strains can be from JAX-Mice or from any other vendor that's a recognized breeding source.
We compute the strain-phenotype associations for extreme outliers (>3 s.d.) based on comparison to all strains tested. We link out to MPD where more detail can be found for each of the experimental protocols used to quantify the results and map the measurements to the resulting qualitative phenotypes.
Data categories: genotype (strain)-phenotype association
Ontologies/Vocabularies: MP
Date Updated: March 1, 2017
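As a rough illustration of the ">3 s.d." criterion for MPD, the following Python sketch flags strains whose measurement lies more than three standard deviations from the mean over all strains tested. It is a sketch under stated assumptions, not the actual computation: the function name `extreme_outliers` is hypothetical, one summary value per strain is assumed, and the sample standard deviation is used because the source does not say which estimator applies.

```python
from statistics import mean, stdev


def extreme_outliers(strain_values, threshold=3.0):
    """Return {strain: z_score} for strains beyond `threshold` s.d.

    strain_values maps a strain name to one summary measurement.
    The z-score is taken against the mean and sample s.d. of all strains.
    """
    values = list(strain_values.values())
    if len(values) < 2:
        return {}  # a standard deviation needs at least two values
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return {}  # all strains identical, so nothing can be an outlier
    z_scores = {s: (v - mu) / sigma for s, v in strain_values.items()}
    return {s: z for s, z in z_scores.items() if abs(z) > threshold}


# Made-up measurements: nineteen similar strains and one extreme strain.
measurements = {f"strain_{i}": 10.0 for i in range(1, 20)}
measurements["strain_20"] = 50.0
outliers = extreme_outliers(measurements)
```

With a sample standard deviation, a single strain can only exceed 3 s.d. when enough strains are compared, so the criterion is meaningful mainly for measurements taken across a large strain panel.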
Online Mendelian Inheritance in Animals
Online Mendelian Inheritance in Animals (OMIA) is a catalogue/compendium of inherited disorders, other (single-locus) traits, and genes in 215 (non-model) animal species.
Animal species and breeds with a high incidence of disease, and with links to human OMIM diseases, are listed as animal models for that disease.
Data categories: gene-disease association
Ontologies/Vocabularies: OMIM
Date Updated: March 1, 2017
ClinVar
ClinVar archives and aggregates information about relationships among variation and human health. ClinVar collects reports of variants found in patient samples, assertions made regarding their clinical significance, information about the submitter, and other supporting data.
We utilize the asserted disease-gene, variant-disease, and variant-gene associations, together with their evidence.
Data categories: disease-gene association, variant-disease association, variant definitions
Ontologies/Vocabularies: UMLS
Date Updated: March 1, 2017
Mendelian Inheritance in Man
OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes with full-text, referenced overviews that contains information on all known mendelian disorders and over 12,000 genes. OMIM focuses on the relationship between phenotype and genotype.
We use curated disease-gene, disease-locus, and variant-disease associations, together with their annotated references. Most OMIM diseases are further curated by the HPO group. Most OMIM diseases are integrated into the Disease Ontology.
Data categories: gene-disease association, variant-disease association
Ontologies/Vocabularies: OMIM
Date Updated: March 1, 2017
ORPHANET
Orphanet provides reference information on rare diseases and orphan drugs to help improve the diagnosis, care and treatment of patients with rare diseases.
We use the Orphanet disease-gene associations.
Data categories: disease-gene association
Ontologies/Vocabularies: HPO
Date Updated: March 1, 2017
Protein ANalysis THrough Evolutionary Relationships Classification System
The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System was designed to classify proteins (and their genes) according to evolutionary family/subfamily, molecular function, biological process, and pathway. The PANTHER Classifications are the result of human curation as well as sophisticated bioinformatics algorithms.
We currently utilize the 12 RefGenome species, as well as HUGO HCOP species, to seed the orthology calls. Species currently include: arabidopsis, budding yeast, chicken, chimp, dog, fission yeast, fruitfly, green lizard, horse, human, macaque, mouse, opossum, pig, platypus, rat, slime mold, worms, zebrafish. We use the orthology calls to populate the orthologs tabs for genes, as well as to infer disease-model associations via homology.
Data categories: orthology
Ontologies/Vocabularies: RO
Date Updated: March 1, 2017
Coriell Institute for Medical Research
The Coriell Cell Repositories provide essential research reagents to the scientific community by establishing, verifying, maintaining, and distributing cell cultures and DNA derived from cell cultures. These collections, supported by funds from the National Institutes of Health (NIH) and several foundations, are extensively utilized by research scientists around the world. NINDS and NIGMS cell line catalog. NIGMS samples represent a variety of disease states, chromosomal abnormalities, apparently healthy individuals and many distinct human populations. NINDS samples are drawn from subjects with cerebrovascular disease, epilepsy, motor neuron disease, Parkinsonism and Tourette Syndrome, as well as controls.
We link pertinent cell lines to any diseases for which they are asserted models.
Data categories: disease-model association
Ontologies/Vocabularies: OMIM
Date Updated: March 1, 2017
Comparative Toxicogenomics Database
CTD promotes understanding about the effects of environmental chemicals on human health by integrating data from curated scientific literature to describe chemical interactions with genes and proteins, and associations between diseases and chemicals, and diseases and genes/proteins.
We integrate the asserted (curated) disease-gene associations, and their evidence.
Data categories: disease-gene association
Ontologies/Vocabularies: MESH, OMIM
Date Updated: March 1, 2017
HPO
A curated database of human hereditary syndromes from OMIM, Orphanet, and DECIPHER mapped to classes of the human phenotype ontology. Various meta-attributes such as frequency, references and negations are associated with each annotation. These are presently limited to rare mendelian diseases.
We use the HPO disease-phenotype annotations as the primary atomic description of a disease, and list them on the disease pages, together with their references. The Human Phenotype Ontology is integrated into our cross-species phenotype ontology.
Data categories: disease-phenotype association
Ontologies/Vocabularies: ECO, HPO
Date Updated: March 1, 2017
Kyoto Encyclopedia of Genes and Genomes
KEGG is an integrated database resource consisting of seventeen main databases, including systems, genomic, chemical, and health information.
We list disease-gene associations and gene-pathway associations. We utilize the KEGG Ortholog (KO) gene-pathway associations, and infer a specific organism's participation in that pathway based on the gene-KO links.
Data categories: gene-pathway association, disease-gene association, orthology
Ontologies/Vocabularies: MESH, OMIM
Date Updated: March 1, 2017
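The KO-based inference for KEGG amounts to a join across two mappings: an organism's gene is linked to a KO group, the KO group is linked to pathways, and the gene is therefore inferred to take part in those pathways. The Python sketch below illustrates that join only. The function name `infer_gene_pathways` and all identifiers in the example data are placeholders invented for illustration, and the sketch does not reflect how Monarch actually stores or processes KEGG data.

```python
from collections import defaultdict


def infer_gene_pathways(gene_to_kos, ko_to_pathways):
    """Infer gene-pathway links by joining gene-KO and KO-pathway links.

    gene_to_kos:    {gene_id: iterable of KO ids}
    ko_to_pathways: {ko_id: iterable of pathway ids}
    Returns {gene_id: set of pathway ids}, omitting genes with no pathway.
    """
    inferred = defaultdict(set)
    for gene, kos in gene_to_kos.items():
        for ko in kos:
            # A KO with no recorded pathways contributes nothing.
            inferred[gene].update(ko_to_pathways.get(ko, ()))
    return {gene: paths for gene, paths in inferred.items() if paths}


# Placeholder identifiers, not real KEGG records.
gene_to_kos = {"geneA": ["KO_1"], "geneB": ["KO_1", "KO_2"], "geneC": ["KO_3"]}
ko_to_pathways = {"KO_1": ["path_X"], "KO_2": ["path_X", "path_Y"]}
gene_pathways = infer_gene_pathways(gene_to_kos, ko_to_pathways)
```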
MyGene
MyGene.info provides simple-to-use REST web services to query and retrieve gene annotation data.
We use MyGene.info's REST services to fetch and display curated RefSeq gene descriptions.
Data categories: gene definition
Date Updated: March 1, 2017
National Center for Biotechnology Information
Gene integrates information from a wide range of species, and includes nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide. Taxon lists the taxonomic organization of organisms. Pub2Gene serves links between genes and the PubMed identifiers where they are mentioned.
We use NCBIGene ids and symbols as the primary identifier and label for human genes in our system, and NCBITaxon identifiers and scientific names for species-specific labeling. For any given gene, we also list the annotated PMIDs from Pub2Gene.
Data categories: gene definition, taxon definition, gene-publication association
Date Updated: March 1, 2017
BioGRID
BioGRID is a curated gene and protein interaction repository for major model organism species.
Monarch indicates gene-gene/protein-protein interactions on gene pages. We also use many of the id mappings to resolve ids in our own site.
Data categories: protein-protein interaction
Ontologies/Vocabularies: MI
Date Updated: March 1, 2017
GWAS Catalog
The NHGRI-EBI Catalog of published genome-wide association studies.
Monarch links the variants recorded here to the curated EFO classes.
Data categories: variant-phenotype, variant-disease associations
Ontologies/Vocabularies: RO, EFO, ECO
Date Updated: March 1, 2017
AnimalQTLdb
The Animal Quantitative Trait Loci (QTL) database (Animal QTLdb) is designed to house all publicly available QTL and single-nucleotide polymorphism/gene association data on livestock animal species.
Monarch uses the QTL genetic maps and their computed genomic locations to create associations between the QTLs and their traits. The traits are expressed in AnimalQTLdb's internal Animal Trait ontology vocabulary, which is further mapped to the [Vertebrate Trait](http://bioportal.bioontology.org/ontologies/VT), Product Trait, and Clinical Measurement Ontology vocabularies.
Data categories: qtl-trait associations
Ontologies/Vocabularies: RO, ECO
Date Updated: March 1, 2017
Ensembl database of automatically annotated genomic data
Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation.
Monarch obtains equivalencies between Ensembl gene IDs and NCBI gene IDs.
Data categories: NCBI gene to Ensembl gene ID mappings
Date Updated: March 1, 2017
Gene Ontology Database
The GO defines concepts/classes used to describe gene function, and relationships between these concepts.
Monarch processes gene-process/function/subcellular/location associations.
Data categories: gene-process/function/subcellular/location
Ontologies/Vocabularies: RO
Date Updated: March 1, 2017
Gene Reviews
GeneReviews, an international point-of-care resource for busy clinicians, provides clinically relevant and medically actionable information for inherited conditions in a standardized journal-style format, covering diagnosis, management, and genetic counseling for patients and their families.
Monarch processes the GeneReviews mappings to OMIM and inspects the GeneReviews (HTML) books to pull the clinical descriptions, which populate the definitions of the terms in the ontology. We define the GeneReviews items as classes that are either grouping classes over OMIM disease ids (gene ids are filtered out) or subclasses of DOID:4 (generic disease).
Data categories: Disease ID mapping
Date Updated: March 1, 2017
HUGO Gene Nomenclature Committee
A curated online repository of HGNC-approved gene nomenclature, gene families and associated resources.
Monarch creates equivalences between HGNC identifiers and Ensembl and NCBIGene identifiers. We also add the links to cytogenetic locations for the gene features.
Data categories: Gene ID mapping
Date Updated: March 1, 2017
Mutant Mouse Resource and Research Centers
A repository of mouse stocks and ES cell line collections serving the world-wide genetics and biomedical research community for the benefit of human health.
Monarch processes the Mutant Mouse Resource and Research Center strain data, which includes: strains and their mutant alleles, phenotypes of the alleles, and descriptions of the research uses of the strains.
Data categories: Strain-phenotype associations
Ontologies/Vocabularies: MP
Date Updated: March 1, 2017
Reactome - a curated knowledgebase of biological pathways
Reactome is a free, open-source, curated and peer reviewed pathway database.
Monarch processes Ensembl gene-to-pathway associations.
Data categories: gene-pathway associations
Ontologies/Vocabularies: RO
Date Updated: March 1, 2017
Undiagnosed Diseases Program (UDP)
The National Institutes of Health (NIH) Undiagnosed Diseases Program (UDP) is part of the Undiagnosed Disease Network (UDN), an NIH Common Fund initiative that focuses on the most puzzling medical cases referred to the NIH Clinical Center in Bethesda, Maryland.
Monarch stores phenotypes for each case and variants of interest.
Data categories: case-variant, case-phenotype associations
Ontologies/Vocabularies: RO
Date Updated: March 1, 2017
STRING
STRING is a database of known and predicted protein-protein interactions.
Monarch stores protein-protein interactions with experimental/assay evidence.
Data categories: protein-protein interactions
Ontologies/Vocabularies: RO
Date Updated: March 1, 2017