Statistical Challenges in Genetic Association Studies
It is now well recognized that the genetic architecture of most common
diseases, such as cancer, autism, and schizophrenia, is complex, with many
different genes influencing disease risk. The most widely used
approach to identifying genetic factors underlying complex diseases has been
the genome-wide association study. With the recent progress in
massively parallel sequencing technologies, sequence-based association
studies have become more affordable and are likely to contribute
significantly to our knowledge of complex disease genetics. A
potential limitation of both genome-wide and sequence-based association
studies is that they focus on one variant or one gene at a time, and do not
take advantage of existing biological knowledge. We will discuss the basics
of genome-wide and sequence-based association studies, and statistical
methodology that incorporates prior information on biological networks to
(1) improve the power to identify disease-related genes, and (2) identify
new pathways involved in disease. The methodology will be illustrated using
relevant examples from genetic studies of complex traits.
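
As a rough illustration of how network information can enter a model, the sketch below evaluates a network-constrained penalty of the kind described in reference 1: an L1 term that encourages sparsity plus a quadratic term built from the normalized graph Laplacian, which favors coefficients that vary smoothly across connected genes. The function names, the four-gene toy network, and the tuning values lam1 and lam2 are assumptions made for illustration, not material from the talk. The adjacency matrix is assumed to be symmetric with a zero diagonal.

```python
import numpy as np


def normalized_laplacian(adjacency):
    """Normalized graph Laplacian of an undirected network.

    Assumes a symmetric adjacency matrix with no self-loops.
    Isolated nodes (degree 0) get an all-zero row and column.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=1)
    connected = degree > 0
    inv_sqrt = np.zeros_like(degree)
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    # Off-diagonal entries: -a_uv / sqrt(d_u * d_v)
    lap = -adjacency * np.outer(inv_sqrt, inv_sqrt)
    # Diagonal entries: 1 for connected nodes, 0 for isolated ones
    lap[np.diag_indices_from(lap)] = connected.astype(float)
    return lap


def network_penalty(beta, laplacian, lam1, lam2):
    """L1 sparsity term plus Laplacian smoothness term."""
    beta = np.asarray(beta, dtype=float)
    return lam1 * np.abs(beta).sum() + lam2 * (beta @ laplacian @ beta)


# Hypothetical toy network: genes 0-1-2 form a chain, gene 3 is isolated.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])
L = normalized_laplacian(adjacency)

# Two coefficient vectors with the same L1 norm. "smooth" agrees across
# linked genes after degree scaling; "rough" flips sign along each edge.
smooth = np.array([1.0, np.sqrt(2.0), 1.0, 0.0])
rough = np.array([1.0, -np.sqrt(2.0), 1.0, 0.0])

smooth_penalty = network_penalty(smooth, L, lam1=1.0, lam2=0.5)
rough_penalty = network_penalty(rough, L, lam1=1.0, lam2=0.5)
```

Because the two vectors have the same L1 norm, any difference between smooth_penalty and rough_penalty comes from the Laplacian term alone. That term is what lets prior network structure steer the fit toward sets of connected genes rather than isolated ones.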
References
1. Network-constrained regularization and variable selection for analysis of
genomic data. http://www.ncbi.nlm.nih.gov/pubmed/18310618
Li C, Li H.
Bioinformatics. 2008 May 1;24(9):1175-82. doi:
10.1093/bioinformatics/btn081. Epub 2008 Mar 1.
2. Five years of GWAS discovery. http://www.ncbi.nlm.nih.gov/pubmed/22243964
Visscher PM, Brown MA, McCarthy MI, Yang J.
Am J Hum Genet. 2012 Jan 13;90(1):7-24. doi: 10.1016/j.ajhg.2011.11.029.
Review.
3. Estimation and testing for the effect of a genetic pathway on a disease
outcome using logistic kernel machine regression via logistic mixed models. http://www.ncbi.nlm.nih.gov/pubmed/18577223
Liu D, Ghosh D, Lin X.
BMC Bioinformatics. 2008 Jun 24;9:292. doi: 10.1186/1471-2105-9-292.
4. Optimal tests for rare variant effects in sequencing association studies. http://www.ncbi.nlm.nih.gov/pubmed/22699862
Lee S, Wu MC, Lin X.
Biostatistics. 2012 Sep;13(4):762-75. doi: 10.1093/biostatistics/kxs014.
Epub 2012 Jun 14.
5. Scan-statistic approach identifies clusters of rare disease variants in
LRP2, a gene linked and associated with autism spectrum disorders, in three
datasets. http://www.ncbi.nlm.nih.gov/pubmed/22578327
Ionita-Laza I, Makarov V; ARRA Autism Sequencing Consortium, Buxbaum JD.
Am J Hum Genet. 2012 Jun 8;90(6):1002-13. doi: 10.1016/j.ajhg.2012.04.010.
Epub 2012 May 10.
6. Incorporating network structure in integrative analysis of cancer
prognosis data. http://www.ncbi.nlm.nih.gov/pubmed/23161517
Liu J, Huang J, Ma S.
Genet Epidemiol. 2013 Feb;37(2):173-83. doi: 10.1002/gepi.21697. Epub 2012
Nov 17.