The steering committee of the Global Alliance for Genomics and Health (GA4GH) has approved five new standards for genomic and phenotypic data sharing. Four of the standards were released in October, during the GA4GH annual meeting.

GA4GH represents more than 50 institutions that are international leaders in genomics. As part of the GA4GH product approval process, all approved standards have undergone detailed security and ethics reviews and have been implemented at two or more leading genomics institutions. The four new standards and their uses are described below.

  • Variant Representation v1. This standard provides a flexible framework of computational models, schemas, and algorithms to precisely and consistently exchange genetic variation data across communities.
  • Phenopackets v1. This standard is for sharing disease and phenotype information in order to improve the ability to understand, diagnose, and treat both rare and common diseases.
  • Crypt4GH v1. This is a standard container file format that enables genomic data to remain secure throughout their lifetime, from initial sequencing to sharing with professionals at external organizations.
  • Tool Registry Service API v2. This standard supports the portable exchange of tools and workflows, enabling different repositories to communicate with one another while accommodating the many valid approaches and design decisions that can go into a registry project.

In addition to the four released standards, GA4GH approved a new version of the organization’s genomics and health technology guidelines. This document provides a foundation of security technology recommendations to those developing and implementing GA4GH standards, with a goal of engendering responsibility and trust across the genomics community.

The newly approved standards join GA4GH’s growing list, which includes standards for streaming genomic data and for accessing the reference genome, the Beacon API for discovering genomic data around the globe, and a machine-readable data use ontology to help automate data access. Widely used file formats for storing DNA sequence reads (SAM/BAM/CRAM) and variant information (VCF/BCF) are also on GA4GH’s list.

For more information, visit the Global Alliance for Genomics and Health.

Featured image: Illustration by Stephanie Li courtesy GA4GH.