Rare germline copy number variants (CNVs) and breast cancer risk

Institution: Department of Public Health and Primary Care, University of Cambridge
Corresponding Researcher: Joe Dennis
Email: jgd29@cam.ac.uk
Publication Link(s): https://doi.org/10.1038/s42003-021-02990-6
Data Link(s): The majority of the OncoArray dataset analysed in this study is available in the dbGaP repository, Study ID: phs001265.v1.p1 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001265.v1.p1). The iCOGS dataset and the complete OncoArray dataset cannot be made publicly available due to restrictions imposed by the ethics committees of individual studies; requests for data can be made to the corresponding author or the Data Access Coordination Committee (DACC) of BCAC (http://bcac.ccge.medschl.cam.ac.uk/).
Keyword(s): germline, BRCA1, copy number variants, genome-wide association studies, European ancestry

Summary

Germline copy number variants (CNVs) are pervasive in the human genome, but potential disease associations with rare CNVs have not been comprehensively assessed in large datasets. We analysed rare CNVs in genes and non-coding regions for 86,788 breast cancer cases and 76,122 controls of European ancestry with genome-wide array data. Gene burden tests detected the strongest association for deletions in BRCA1 (P = 3.7E-18). Nine other genes were associated with a p-value < 0.01, including the known susceptibility genes CHEK2 (P = 0.0008), ATM (P = 0.002) and BRCA2 (P = 0.008). Outside the known genes, we detected associations with p-values < 0.001 for either overall or subtype-specific breast cancer at nine deletion regions and four duplication regions. Three of the deletion regions were in established common susceptibility loci. To the best of our knowledge, this is the first genome-wide analysis of rare CNVs in a large breast cancer case-control dataset. We detected associations with exonic deletions in established breast cancer susceptibility genes. We also detected suggestive associations with non-coding CNVs in known and novel loci with large effect sizes. Larger sample sizes will be required to reach robust levels of statistical significance.