Loh Lab

New preprint on hidden protein-altering variants with strong influences on common diseases

We are excited to share a new preprint, “Hidden protein-altering variants influence diverse human phenotypes” (Hujoel et al.), which finds that some of the largest effects of genetic variation on human phenotypes arise from protein-coding variants in previously hidden structural polymorphisms. Structural variants (SVs) modify 50 bp to megabases of DNA yet are difficult to detect from high-throughput DNA sequencing, so their impact on common human phenotypes is mostly unknown. We developed and applied a computational strategy to extract information about protein-altering SVs from abundant exome-sequencing and SNP-array data in UK Biobank. These analyses uncovered many protein-coding SVs predicted to cause loss of gene function, including a low-frequency partial deletion of RGL3 exon 6 that confers one of the strongest protective effects on hypertension of all coding variants in the human genome (OR = 0.86 [0.82–0.90]). Protein-coding variation in rapidly evolving gene families within segmental duplications, which represent >100 kb gaps in genetic association analyses, generates some of the human genome’s largest contributions to variation in type 2 diabetes (T2D) risk, chronotype, and blood cell traits. In particular, copy number of a segmentally duplicated missense variant in RASA4 appears to strongly influence T2D risk (1.3-fold [1.2–1.4] range) and drive the strongest association with chronotype genome-wide. These results illustrate the potential for new genetic insights from genomic variation that has escaped large-scale analysis to date.