Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology

Brody, Jennifer A; Morrison, Alanna C; Bis, Joshua C; O'Connell, Jeffrey R; Brown, Michael R; Huffman, Jennifer E; Ames, Darren C; Carroll, Andrew; Conomos, Matthew P; Gabriel, Stacey; Gibbs, Richard A; Gogarten, Stephanie M; Gupta, Namrata; Jaquish, Cashell E; Johnson, Andrew D; Lewis, Joshua P; Liu, Xiaoming; Manning, Alisa K; Papanicolaou, George J; Pitsillides, Achilleas N; Rice, Kenneth M; Salerno, William; Sitlani, Colleen M; Smith, Nicholas L; Heckbert, Susan R; Laurie, Cathy C; Mitchell, Braxton D; Vasan, Ramachandran S; Rich, Stephen S; Rotter, Jerome I; Wilson, James G; Boerwinkle, Eric; Psaty, Bruce M; Cupples, L Adrienne

doi:10.1038/ng.3968

Commentary
Published: 27 October 2017

Jennifer A Brody ORCID: orcid.org/0000-0001-8509-148X¹^na1,
Alanna C Morrison²^na1,
Joshua C Bis¹^na1,
Jeffrey R O'Connell³,
Michael R Brown²,
Jennifer E Huffman ORCID: orcid.org/0000-0002-9672-2491⁴,
Darren C Ames⁵,
Andrew Carroll⁵,
Matthew P Conomos⁶,
Stacey Gabriel⁷,
Richard A Gibbs⁸,
Stephanie M Gogarten ORCID: orcid.org/0000-0002-7231-9745⁶,
Namrata Gupta⁷,
Cashell E Jaquish⁹,
Andrew D Johnson⁴,
Joshua P Lewis³,
Xiaoming Liu ORCID: orcid.org/0000-0001-8285-5528²,
Alisa K Manning^10,11,12,
George J Papanicolaou⁹,
Achilleas N Pitsillides⁴,
Kenneth M Rice ORCID: orcid.org/0000-0002-3071-7278⁶,
William Salerno⁸,
Colleen M Sitlani¹,
Nicholas L Smith^1,13,14,15,
NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium,
The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium,
TOPMed Hematology and Hemostasis Working Group,
CHARGE Analysis and Bioinformatics Working Group,
Susan R Heckbert^1,15,
Cathy C Laurie⁶,
Braxton D Mitchell^3,16,
Ramachandran S Vasan^4,17,18,
Stephen S Rich¹⁹,
Jerome I Rotter²⁰,
James G Wilson²¹,
Eric Boerwinkle^2,8^na2,
Bruce M Psaty^1,13,22^na2 &
L Adrienne Cupples^4,23^na1^na2

Nature Genetics volume 49, pages 1560–1563 (2017)

Abstract

The increasing volume of whole-genome sequence (WGS) and multi-omics data requires new approaches for analysis. As one solution, we have created the cloud-based Analysis Commons, which brings together genotype and phenotype data from multiple studies in a setting that is accessible by multiple investigators. This framework addresses many of the challenges of multicenter WGS analyses, including data-sharing mechanisms, phenotype harmonization, integrated multi-omics analyses, annotation and computational flexibility. In this setting, the computational pipeline facilitates a sequence-to-discovery analysis workflow illustrated here by an analysis of plasma fibrinogen levels in 3,996 individuals from the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) WGS program. The Analysis Commons represents a novel model for translating WGS resources from a massive quantity of phenotypic and genomic data into knowledge of the determinants of health and disease risk in diverse human populations.

Main

The Analysis Commons, which relies on a new team-science model for genetic epidemiology, integrates multi-omic data and rich phenotypic and clinical information from diverse population studies into a single shared analytic platform that leverages the resources of a cloud-computing environment and allows for distributed access. The number of WGS studies with large sample sizes is rapidly expanding. Projects such as the NHLBI TOPMed Program, the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium^1,2 and the Centers for Common Disease Genomics (CCDG)³, among others, have already conducted WGS in more than 100,000 individuals, and the Precision Medicine Initiative⁴ promises whole-genome sequencing in over a million samples. These programs span a diverse set of studies and institutions, many of which lack the computational infrastructure to store and compute on this scale of data. Genomic, epigenomic, metabolic and proteomic data derived from expensive assays often do not exist in large numbers in any single study but represent a powerful discovery resource when they are combined across studies and integrated with phenotypic data.

In aggregate, many population-based studies have collected data on tens of thousands of variables over a period of decades, and the addition of WGS data to cohorts with long-term prospective follow-up provides a powerful resource for immediate discovery. However, analysis of WGS data for large samples presents formidable computational and administrative challenges. Evaluation of rare genetic variation in WGS data requires manipulation of data sets that are tens to hundreds of terabytes in size and are prohibitively large for exchange between analysis sites. In contrast, pooled data sets that include genotype and phenotype data from all participants in the contributing individual studies provide for practical and efficient WGS analysis. Even so, the creation of such large pooled data sets containing harmonized multi-omic, phenotype and clinical data with appropriate metadata (for example, parent-study information and use permissions) is difficult and time-consuming.

Because it can provide extensive computational resources and can host many users, the cloud-computing environment serves as an excellent platform and infrastructure for the Analysis Commons. Instead of distributing copies of excessively large data sets to many analysts, the Analysis Commons uses a cloud-computing infrastructure providing both data and tools to many analysts. This setting, which incorporates collaborative resources and a team-science approach to discovery, permits nimble analyses and methodological developments.

Although existing studies provide valuable data, a major hurdle to the Analysis Commons is that these same studies have legacy data-sharing policies that were not developed with complex data sharing in mind. The Analysis Commons requires the ability not only to combine data across studies and institutions but also to share pooled data among participating investigators from multiple institutions. In addition, mechanisms must be in place to ensure that sensitive participant data are both accessible to authorized investigators and simultaneously protected by robust security protocols. To bring data and researchers from multiple studies together into the Analysis Commons, we implemented two models for data sharing. The first involves individual studies' securing institutional approval to share data with a consortium through a single 'consortium agreement' rather than through the typical series of bilateral agreements. Under this model, the individual studies retain oversight over their shared data by way of a steering committee. The second model leverages the National Center for Biotechnology Information (NCBI) database of Genotypes and Phenotypes (dbGaP) system of controlled access to coordinate authorization and data sharing across the set of approved external collaborators. Both models build upon well-used approval mechanisms but extend them to enable sharing among a broad group of investigators from multiple institutions.

In most cohort studies, some phenotypes have multiple repeated measures and may require several data types. For example, ascertainment of type 2 diabetes and its date of onset represent a combination of longitudinal glucose measures, medication use, self-reported measures and, in some cases, review of diagnostic codes from medical records. The Analysis Commons can accommodate multiple approaches to phenotype harmonization. The Working Group model, which has facilitated discovery in other settings¹, convenes investigators from multiple institutions who have content knowledge of a related set of phenotypes along with analytic or biostatistical expertise to develop analysis plans and consensus definitions for key analytic variables. Analysis plans often require harmonization of primary outcomes as well as eligibility criteria and exclusions. This approach leverages the knowledge of investigators from the contributing studies.

In the Analysis Commons, harmonized genomic and phenotypic data are available to authorized researchers to conduct genotype–phenotype analyses that require 'bursts' of intense computing. Implementing this workflow in a cloud environment can efficiently use on-demand computing capacity and thus avoid a costly build-out of local computing clusters at multiple institutions. The Analysis Commons also provides analysts with access to mature pipelines that represent the methods that have been tested and debugged, and are likely to become a standard in the field. Access is possible either through a web interface or through command-line batch processing. The logging of parameters and data-file identifiers used in analyses provides the provenance of results files and facilitates the reproducibility of analyses.
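As an illustration of the provenance logging described above, the following minimal Python sketch records the parameters and data-file identifiers of one analysis run. It is not the Analysis Commons or DNAnexus implementation: the function name `provenance_record`, the record fields and the example file identifiers are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(app_name, parameters, input_file_ids):
    """Build a JSON-serializable record describing one analysis run.

    All field names here are hypothetical, chosen only for illustration.
    """
    record = {
        "app": app_name,
        "parameters": dict(sorted(parameters.items())),
        "input_file_ids": sorted(input_file_ids),
    }
    # Fingerprint only the app, parameters and inputs, so that two runs
    # with identical settings share the same digest regardless of when
    # they were logged.
    canonical = json.dumps(record, sort_keys=True).encode("utf-8")
    record["run_digest"] = hashlib.sha256(canonical).hexdigest()
    # The timestamp is added after hashing and is not part of the digest.
    record["logged_at"] = datetime.now(timezone.utc).isoformat()
    return record


# Example with made-up identifiers and parameters
example = provenance_record(
    "single_variant_test",
    {"min_mac": 5, "test": "score"},
    ["file-B", "file-A"],
)
print(json.dumps(example, indent=2))
```

Storing such a record next to each results file would let an analyst check whether two result sets came from the same inputs and settings by comparing digests.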

The Analysis Commons is designed to support a variety of software applications that have particular strengths, such as familial adjustment, analysis of time-to-event outcomes and computational optimization. Available applications for genetic association analyses currently include Genetic Estimation and Inference in Structured Samples (GENESIS)⁵, Mixed Model Analysis for Pedigrees and Populations (MMAP), Efficient and Parallelizable Association Container Toolbox (EPACTS) and seqMeta. Applications support the analysis of both related and unrelated individuals. The multiple-variant tests are flexibly designed so that variants can be aggregated by genes, by regulatory regions, by sliding windows or by user-defined motifs. Variants can be filtered or weighted according to annotations (for example, WGS Annotator (WGSA)⁶ or Cassandra⁷), which build on a base of common information such as conservation and functional protein predictions as well as extensive tissue-specific assays from projects such as the Encyclopedia of DNA Elements (ENCODE)⁸. By focusing on those variants with higher likelihoods to be functional for a given phenotype, these tools allow researchers to leverage their specific expertise in trait biology to improve power.
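The filtering and aggregation of variants described above can be sketched in a few lines of Python. This is a simplified illustration, not code from GENESIS, MMAP, EPACTS or seqMeta: the function name `aggregate_by_window`, the input layout and the example values are assumptions, and the sketch uses non-overlapping fixed-width windows on a single chromosome rather than true sliding windows.

```python
from collections import defaultdict


def aggregate_by_window(variants, window_size, min_score):
    """Group variant positions into fixed-width windows after filtering.

    variants: iterable of (position, annotation_score) pairs for one
    chromosome, with 1-based positions (a hypothetical input layout).
    """
    groups = defaultdict(list)
    for position, score in variants:
        if score < min_score:
            continue  # drop variants below the annotation threshold
        # Window k covers positions k*window_size + 1 .. (k + 1)*window_size
        window_start = ((position - 1) // window_size) * window_size + 1
        groups[window_start].append(position)
    return dict(groups)


# Made-up positions and annotation scores, for illustration only
toy_variants = [
    (100, 12.0),
    (49_999, 3.0),
    (50_000, 15.0),
    (50_001, 22.5),
    (120_000, 10.0),
]
windows = aggregate_by_window(toy_variants, window_size=50_000, min_score=10.0)
for start, positions in sorted(windows.items()):
    print(start, positions)
```

Each resulting group would then be passed to an aggregate test; weighting by annotation, rather than hard filtering, would replace the threshold check with a per-variant weight.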

The setting of the Analysis Commons has the flexibility to serve phenotype-driven research as well as to aid investigators in developing and testing new statistical methods and computational algorithms. Although analysis is more complicated to execute than a model that provides users with the results of predefined point-and-click analysis tools, methods development is made possible by full direct access to the combined data sets. Importantly, these new methods, which will be essential to leverage a growing collection of WGS data sets, can be readily benchmarked against established methods in a controlled environment and then rapidly distributed. For example, fastSKAT⁹, a methodological advance that greatly decreases the computational burden of the sequence kernel association test (SKAT)¹⁰ with large numbers of variants, was developed and validated in the Analysis Commons and benefited from access to sample data sets and benchmarking against standard SKAT implementations. This collaborative 'sandbox' assures the availability of the latest methods to interested investigators and provides researchers with full access to the individual-level data needed to drive discovery.

Modular analysis applications (apps) implement particular operations and are chained together into pipelines (Fig. 1). As an example, we implemented one such pipeline for a sequence-to-discovery workflow, including (i) conversion of variant call format to a binary random-access genetic-storage format by using the SeqArray R package, (ii) single-variant and aggregate tests implemented through the GENESIS R package and (iii) visualization for quality control and display of the results. Apps for each step in the workflow were contributed by users at different institutions and coordinated through the Apps Development Working Group, thus demonstrating that the Analysis Commons allows for greater collaboration in both development and analysis. This pipeline is publicly available on DNAnexus (Supplementary Note). All analyses were performed in parallel in an independently developed MMAP pipeline, which allowed not only for validation of the methods and results but also for benchmarking of computing parameters.
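The idea of chaining modular apps into a pipeline can be sketched generically in Python. The runner and the three toy steps below are hypothetical stand-ins that only mirror the three-stage shape of the workflow (convert, test, summarize); they do not call SeqArray, GENESIS or DNAnexus, and the variant lines are made up.

```python
def run_pipeline(steps, payload):
    """Run each (name, function) step in order, feeding outputs forward."""
    history = []
    for name, step in steps:
        payload = step(payload)
        history.append(name)  # keep the order of executed steps
    return payload, history


def convert(lines):
    # Stand-in for format conversion: parse tab-separated variant lines
    # into (chromosome, position) tuples, skipping header lines.
    return [tuple(line.split("\t")[:2]) for line in lines if not line.startswith("#")]


def count_per_chromosome(records):
    # Stand-in for an analysis step: count variants per chromosome.
    counts = {}
    for chrom, _pos in records:
        counts[chrom] = counts.get(chrom, 0) + 1
    return counts


def summarize(counts):
    # Stand-in for a reporting step: one text line per chromosome.
    return [f"{chrom}: {n} variants" for chrom, n in sorted(counts.items())]


toy_lines = ["#CHROM\tPOS", "4\t2000", "4\t3000", "1\t1000"]  # made-up input
steps = [("convert", convert), ("count", count_per_chromosome), ("summarize", summarize)]
report, history = run_pipeline(steps, toy_lines)
print(history)
print(report)
```

Because each step only depends on the output of the previous one, a step contributed by one group can be swapped for another implementation without changing the rest of the chain.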

The Analysis Commons is currently implemented in DNAnexus, which is built on Amazon Web Services. Data from 12 studies from 2 large WGS efforts, CHARGE and TOPMed, are combined and made accessible to authorized study investigators. Data sets are held securely within the DNAnexus platform for genomic-data management and analysis, which is independently certified as compliant with relevant research and clinical regulations (including ISO 27001, HIPAA, CLIA, CAP and GCP). For the purpose of illustration, we integrated data from 2 of the 12 studies with measured plasma fibrinogen levels (the Old Order Amish Study and the Framingham Heart Study) to analyze genetic association with fibrinogen levels in 3,996 study participants (Supplementary Note). The participating studies and the analysts received institutional approval via a consortium agreement to share phenotype and genotype data and perform analyses within the Analysis Commons. The analyses used linear mixed models that were adjusted for family structure through an empirical kinship matrix. Single-variant regression analyses assessed associations with common variants (i.e., those with a minor allele count ≥5). After correction for the number of variants tested (n = 13,742,969), we identified a low-frequency variant with a two-tailed score test (rs148685782[G>C] (p.Ala108Gly); P = 2.51 × 10⁻⁹, MAF = 0.34%), a previously identified¹¹ nonsynonymous variant in FGG (Fig. 2), the gene encoding the gamma chain of the fibrinogen glycoprotein. Rare variants (MAF <5%) were limited to those with a Combined Annotation-Dependent Depletion (CADD)¹² phred score ≥10 and were tested in aggregate within sequential 50-kb windows (Fig. 2b). No windows were genome-wide significant after Bonferroni correction. These analyses benefited from the extensive computing resources. For example, the GENESIS SKAT analyses that used 380 CPU hours were run in approximately 1 hour of wall-clock time. The analyses were validated by having analysts from separate institutions run both the GENESIS and MMAP applications.
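The arithmetic behind the figures reported above can be checked with a short Python sketch. The variant count, P value, sample size, MAF and CPU hours are taken from the text; the family-wise error rate of 0.05 is an assumption (the text does not state it), and the expected minor allele count treats the rounded MAF as exact and ignores relatedness.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05               # assumption: conventional family-wise error rate
N_VARIANTS = 13_742_969    # variants tested (from the text)
P_VALUE = 2.51e-9          # reported P for rs148685782 (from the text)

threshold = bonferroni_threshold(ALPHA, N_VARIANTS)
is_significant = P_VALUE < threshold

N_PARTICIPANTS = 3_996     # from the text
MAF = 0.0034               # 0.34%, rounded as reported
# Each participant carries two alleles at an autosomal site.
expected_mac = 2 * N_PARTICIPANTS * MAF

CPU_HOURS = 380            # from the text
WALL_CLOCK_HOURS = 1       # approximate, from the text
average_parallelism = CPU_HOURS / WALL_CLOCK_HOURS

print(f"Bonferroni threshold: {threshold:.3g}")
print(f"Reported P below threshold: {is_significant}")
print(f"Expected minor allele count: {expected_mac:.1f}")
print(f"Average number of CPUs in use: {average_parallelism:.0f}")
```

Under these assumptions the threshold is roughly 3.6 × 10⁻⁹, so the reported P value falls below it, and the wall-clock figure implies that on the order of several hundred CPUs were working concurrently.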

**Figure 2: Plasma fibrinogen association results.**

We present a model that builds a collaboration among researchers with the common goal of multicenter genomic epidemiology research. The oversight of the Analysis Commons requires the management of four activities: (i) data access, (ii) phenotype harmonization, (iii) app development and (iv) analysis. The management is shared among several committees and Working Groups. These components of the Analysis Commons are designed to flexibly accommodate teams that may work on subprojects with distinct permissions, data sets and analytic approaches. Team members participate in the Analysis Committee, wherein researchers present work in progress focusing on ongoing challenges in analytic methods and discuss data-set curation and availability, as well as annotation resources. Similarly, the membership of the Apps Development Working Group is drawn from the phenotype-driven Working Groups and focuses on the development and testing of software for use across the Analysis Commons and eventual release to the broader scientific community. Although project teams primarily work independently on their research aims, communication among investigators through joint teleconferences, real-time messaging and in-person training seminars is key to successful collaboration. Large multistudy collaborations and big-data efforts are the next stage in contemporary genetics. With the Analysis Commons, we present a blueprint for how to navigate the practical issues of both large-scale computing and collaboration that are facing many studies, and the analytic code and data-sharing mechanisms that can be adopted by other investigators. The Analysis Commons is a resource for many research groups, through direct collaboration, established committees or parallel adoption of the governance model and the developed apps.

The Analysis Commons is one model for the translation of WGS resources from a massive quantity of raw data into a better understanding of the determinants of health in diverse human populations. Strong infrastructure support is needed for analysis of these WGS data in a setting that allows for phenotype, analytic and computational experts to convene and address these questions. This environment should enable and accelerate the promise of precision medicine to provide the right treatment at the right time and to tailor treatments to patients' individual needs.

URLs. GENESIS, http://bioconductor.org/packages/release/bioc/html/GENESIS.html; MMAP, https://github.com/MMAP/; EPACTS, http://genome.sph.umich.edu/wiki/EPACTS; seqMeta, https://cran.r-project.org/web/packages/seqMeta/index.html; SeqArray, https://www.bioconductor.org/packages/release/bioc/html/SeqArray.html; DNAnexus, https://www.dnanexus.com/; Analysis Commons GitHub, https://github.com/AnalysisCommons/; Analysis Commons analysis tools, https://platform.dnanexus.com/projects/F2KK1b80zzK7vb0G0qb8fJvk/; Analysis Commons public site, http://analysiscommons.com/.

Data availability. All data generated or analyzed during this study are included in this published article and its supplementary information.

Author Contributions

S.R.H., C.C.L., B.D.M., J.R.O., R.S.V., S.S.R., J.I.R., J.G.W., J.A.B., A.C.M., J.C.B., E.B., B.M.P., L.A.C. and K.M.R. formed the management team of the Analysis Commons. J.A.B., A.C.M., J.C.B., E.B., B.M.P., L.A.C., S.R.H. and X.L. drafted the manuscript. W.S., R.A.G., A.C. and D.C.A. managed the computing infrastructure. A.K.M., J.R.O., M.R.B., D.C.A., A.C., M.P.C., S.M.G. and A.N.P. were responsible for the implementation and design of the applications. S.G. and N.G. oversaw the sequence generation. J.A.B., J.E.H., J.P.L., A.D.J., J.C.B., C.M.S., N.L.S., C.E.J. and G.J.P. conceived, designed and implemented the example data analyses. All coauthors reviewed and edited the manuscript before approving its submission.

References

1. Psaty, B.M. et al. Circ. Cardiovasc. Genet. 2, 73–80 (2009).
2. Morrison, A.C. et al. Nat. Genet. 45, 899–901 (2013).
3. Fuchsberger, C. et al. Nature 536, 41–47 (2016).
4. Sankar, P.L. & Parker, L.S. Genet. Med. 19, 743–750 (2017).
5. Zheng, X. et al. Bioinformatics 33, 2251–2257 (2017).
6. Liu, X. et al. J. Med. Genet. 53, 111–112 (2016).
7. Reid, J.G. et al. BMC Bioinformatics 15, 30 (2014).
8. ENCODE Project Consortium. Nature 489, 57–74 (2012).
9. Lumley, T., Brody, J.A., Peloso, G.M. & Rice, K. Preprint at https://www.biorxiv.org/content/early/2016/11/04/085639/ (2016).
10. Wu, M.C. et al. Am. J. Hum. Genet. 89, 82–93 (2011).
11. Huffman, J.E. et al. Blood 126, e19–e29 (2015).
12. Kircher, M. et al. Nat. Genet. 46, 310–315 (2014).

Acknowledgements

TOPMed. WGS for the TOPMed program was supported by the NHLBI. WGS for 'NHLBI TOPMed: Whole Genome Sequencing and Related Phenotypes in the Framingham Heart Study' (phs000974.v1.p1) and 'NHLBI TOPMed: Genetics of Cardiometabolic Health in the Amish' (phs000956.v1.p1) was performed at the Broad Institute of MIT and Harvard (HHSN268201500014C and 3R01HL121007-01S1 (NHLBI, B.D.M.)). Centralized read mapping and genotype calling, along with variant quality metrics and filtering were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1). Phenotype harmonization, data management, sample-identity quality control and general study coordination were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1 (NHLBI, B.M.P., K.M.R. and S.S.R.)). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. The infrastructure for the Analysis Commons is additionally supported by R01HL105756 (NHLBI, B.M.P.), U01HL130114 (NHLBI, B.M.P.) and 5RC2HL102419 (NHLBI, E.B.).

Old Order Amish Study. This investigation was supported by National Institutes of Health grants R01 HL121007 (NHLBI, B.D.M.), U01 GM074518, U01 HL084756 (NHLBI, J.R.O.), U01 HL137181 (NHLBI, J.R.O.) and K23 GM102678 (NIGMS, J.P.L.), as well as Mid-Atlantic Nutrition and Obesity Research Center grant P30 DK072488 (NIDDK, B.D.M.). We also gratefully acknowledge our Amish liaisons and field workers and the extraordinary cooperation and support of the Amish community.

Framingham Heart Study. The Framingham Heart Study was supported by the NHLBI Framingham Heart Study (contract no. N01-HC-25195 and HHSN268201500001I (NHLBI, R.S.V. and L.A.C.)). Fibrinogen measurement was supported by NIH R01-HL-48157. J.E.H. and A.D.J. were supported by NHLBI Intramural Research Program funds. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the NHLBI, the National Institutes of Health or the US Department of Health and Human Services.

Author information

Jennifer A Brody, Alanna C Morrison, Joshua C Bis and L Adrienne Cupples: These authors contributed equally to this work.
Eric Boerwinkle, Bruce M Psaty and L Adrienne Cupples: These authors jointly directed this work.

Authors and Affiliations

Department of Medicine, Cardiovascular Health Research Unit, University of Washington, Seattle, Washington, USA
Jennifer A Brody, Joshua C Bis, Colleen M Sitlani, Nicholas L Smith, Susan R Heckbert & Bruce M Psaty
Department of Epidemiology, Human Genetics Center, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, USA
Alanna C Morrison, Michael R Brown, Xiaoming Liu & Eric Boerwinkle
Department of Medicine, Division of Endocrinology, Diabetes, and Nutrition, University of Maryland, Baltimore, Maryland, USA
Jeffrey R O'Connell, Joshua P Lewis & Braxton D Mitchell
Framingham Heart Study, National Heart, Lung, and Blood Institute and Boston University, Framingham, Massachusetts, USA
Jennifer E Huffman, Andrew D Johnson, Achilleas N Pitsillides, Ramachandran S Vasan & L Adrienne Cupples
DNAnexus, Inc., Mountain View, California, USA
Darren C Ames & Andrew Carroll
Department of Biostatistics, University of Washington, Seattle, Washington, USA
Matthew P Conomos, Stephanie M Gogarten, Kenneth M Rice & Cathy C Laurie
Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA
Stacey Gabriel & Namrata Gupta
Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, USA
Richard A Gibbs, William Salerno & Eric Boerwinkle
Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA
Cashell E Jaquish & George J Papanicolaou
Center for Human Genetics Research, Massachusetts General Hospital, Boston, Massachusetts, USA
Alisa K Manning
Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
Alisa K Manning
Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
Alisa K Manning
Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
Nicholas L Smith & Bruce M Psaty
Department of Veterans Affairs Office of Research and Development, Seattle Epidemiologic Research and Information Center, Seattle, Washington, USA
Nicholas L Smith
Department of Epidemiology, University of Washington, Seattle, Washington, USA
Nicholas L Smith & Susan R Heckbert
Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, Maryland, USA
Braxton D Mitchell
Department of Medicine, Sections of Preventive Medicine and Epidemiology, and of Cardiology, Boston University School of Medicine, Boston, Massachusetts, USA
Ramachandran S Vasan
Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts, USA
Ramachandran S Vasan
Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, USA
Stephen S Rich
Departments of Pediatrics and Medicine, Institute for Translational Genomics and Population Sciences, LABioMed at Harbor-UCLA Medical Center, Torrance, California, USA
Jerome I Rotter
Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, Mississippi, USA
James G Wilson
Departments of Medicine, Epidemiology, and Health Services, University of Washington, Seattle, Washington, USA
Bruce M Psaty
Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA
L Adrienne Cupples

Consortia

NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium

The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium

TOPMed Hematology and Hemostasis Working Group

CHARGE Analysis and Bioinformatics Working Group

Corresponding authors

Correspondence to Jennifer A Brody or L Adrienne Cupples.

Ethics declarations

Competing interests

B.M.P. reports serving on the data and safety monitoring board for a clinical trial funded by the manufacturer Zoll LifeCor and on the Steering Committee for the Yale Open Data Access Project funded by Johnson & Johnson. J.R.O. has a consulting agreement with Regeneron Pharmaceuticals that focuses on development of statistical analysis and software tools. A.C. and D.C.A. are employed by DNAnexus.

Additional information

A full list of members and affiliations appears in the Supplementary Note.

Supplementary information

Supplementary Text and Figures

Supplementary Note (PDF 158 kb)

About this article

Cite this article

Brody, J., Morrison, A., Bis, J. et al. Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology. Nat Genet 49, 1560–1563 (2017). https://doi.org/10.1038/ng.3968

Issue Date: 01 November 2017
DOI: https://doi.org/10.1038/ng.3968
