Smithsonian bioinformatician using high performance computing to annotate genomes

The Smithsonian’s National Museum of Natural History (NMNH) plays an important role in biodiversity genomics, a field at the forefront of innovation and discovery in the 21st century. The Global Genome Initiative (GGI) represents the Museum’s commitment to lead the transformation of this growing field. Through GGI’s sequencing projects, the Museum works to understand Earth’s biodiversity.

Exploratory Science

GGI aims to increase the understanding of biodiversity genomics by advancing third-generation sequencing with technology partners. As of December 2018, GGI had directed a total of $350,000 to 19 projects through its exploratory science program. These projects use third-generation sequencing to answer important questions about biodiversity, such as how specific species groups relate to one another at the molecular level, or how biogeographic and diversification patterns compare among ancient lineages of organisms.

The 19 exploratory science projects are providing novel genomic resources across a broad range of the diversity of life to aid in the understanding of evolutionary relationships. These projects have generated targeted genomic resources (genomic markers) for 151 families, representing 569 species. Publication of these results began in 2018 and will continue through 2019.

Computational Resources

GGI supports computational resources and infrastructure by participating in the system administration and management of Smithsonian centralized computing for genomics research. Centralized computing at SI (Hydra-4; OCIO) encompasses 4,000 CPUs; 17 TB of total RAM, including 1 node with 2 TB of RAM, 2 nodes with 1 TB of RAM, 15 nodes with 512 GB of RAM, and 45 nodes with 256 GB of RAM; and 165 TB of disk storage. It also includes two new Intel nodes dedicated to genome assembly (1 TB of RAM, 72 CPUs, and 2-4 TB of solid-state disk). GGI computational research support has resulted in more than 26 publications since January 2015. Genomics work on Smithsonian shared resources used more than 1.8 million computational hours in 2018 alone.

Bioinformatics Workshops and Training

As members of the Smithsonian Bioinformatics Working Group, GGI employees participate in and lead genomics, data analysis, and Carpentries workshops for Smithsonian affiliates throughout the year. To see a list of past workshops and the materials used, click here:

Global Impact

The impact of GGI-supported genome sequencing, analysis, and publication is maximized through partnerships. As of September 2018, a total of 947 eukaryotic families and 1,779 eukaryotic genera had been published to the National Center for Biotechnology Information (NCBI) Assembly database by principal investigators worldwide.