RFQTL: Random Forest QTL Mapping
CDS members associated with the software: Prof. Dr. Andreas Beyer
The analysis of expression quantitative trait loci (eQTL) is a potentially powerful way to detect transcriptional regulatory relationships at the genomic scale. However, eQTL data sets are often underexploited because legacy QTL methods are used to map the relationship between expression traits and genotype. These methods are often inappropriate for complex traits such as gene expression, particularly when epistasis (interaction between loci) is present.
We developed and evaluated QTL mapping methods based on the Random Forest (RF) machine learning approach. RF can capture much more complex relationships among predictors (genetic markers) than legacy methods can. Using simulated and real QTL data, we demonstrated in multiple publications that RFQTL outperforms methods that assume simpler genotype-phenotype relationships.
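The point about epistasis can be illustrated with a toy simulation. The sketch below is not the RFQTL implementation; it is a minimal, standard-library-only example under assumed settings (two hypothetical binary markers, a trait driven purely by their interaction, and made-up function names such as `simulate`, `marginal_effect` and `joint_effect`). It shows that a marker-by-marker scan sees almost no effect, while a comparison that considers both markers jointly, as tree-based methods such as RF are able to do, recovers the signal.

```python
import random
from statistics import mean


def simulate(n=2000, seed=0):
    """Simulate n individuals with two binary markers and a purely epistatic trait.

    Assumption for illustration: the trait is high only when the two
    markers carry different alleles (an XOR-like interaction), plus noise.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.randint(0, 1)  # genotype at hypothetical marker A
        b = rng.randint(0, 1)  # genotype at hypothetical marker B
        trait = float(a ^ b) + rng.gauss(0.0, 0.1)
        rows.append((a, b, trait))
    return rows


def marginal_effect(rows, marker):
    """Difference in mean trait between the two alleles of a single marker."""
    carriers = [r[2] for r in rows if r[marker] == 1]
    others = [r[2] for r in rows if r[marker] == 0]
    return mean(carriers) - mean(others)


def joint_effect(rows):
    """Difference in mean trait between discordant and concordant genotypes."""
    discordant = [t for a, b, t in rows if a != b]
    concordant = [t for a, b, t in rows if a == b]
    return mean(discordant) - mean(concordant)


rows = simulate()
effect_a = marginal_effect(rows, 0)
effect_b = marginal_effect(rows, 1)
effect_ab = joint_effect(rows)

print(f"single-marker effect of A: {effect_a:+.3f}")
print(f"single-marker effect of B: {effect_b:+.3f}")
print(f"joint (A, B) effect:       {effect_ab:+.3f}")
```

In this simulation, both single-marker effects are close to zero while the joint effect is close to one, which is the kind of relationship a method that assumes independent, additive marker effects would miss.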
Related publications:
- Michaelson JJ, Alberts R, Schughart K, Beyer A. (2010) Data-driven assessment of eQTL mapping methods. BMC Genomics. 11:502.
- Ackermann M, Clément-Ziza M, Michaelson JJ, Beyer A. (2012) Teamwork: improved eQTL mapping using combinations of machine learning methods. PLoS One. 7(7):e40916.
- Stephan J, Stegle O, Beyer A. (2015) A random forest approach to capture genetic effects in the presence of population structure. Nat Commun. 6:7432. doi: 10.1038/ncomms8432