Skip to main content

1000 Genomes Phase 3 WGS

We have collected and processed the complete 1000G Phase 3 WGS dataset for core and public use.

Quick data access:

  • Project: A1290-220303-VAR-Masked_1KG_Phase3_WGS
  • Mosaic
  • Ubox
  • UCGD Data path: /scratch/ucgd/lustre/UCGD_Datahub/IRBs/proj_UCGDCollab/A1290-220303-VAR-Masked_1KG_Phase3_WGS/UCGD

Data collection

Following this documentation page the 1000G Phase3 data was downloaded using awscli and completely recalled using the UCGD variant calling pipeline from original sequence reads.

List of available data

The below data is available for all 3,202 individuals unless otherwise noted.

FileProcessing Level
CRAMIndividual
GVCFsIndividual
MantaIndividual
SmooveIndividual
alignstatsIndividual (JSON)

Please note: CRAM file retention is NOT guaranteed and files could be removed at any time do to space consideration. If you plan to process these files, or have questions, please let us know.

Note

  • This data has currently only been called to GRCh38.
  • This data was generated using the masked version of GRCh38 (for questions, please contact us).