Defence of Kaido Lepik's PhD thesis

On 4 May at 14:15, Kaido Lepik will defend his doctoral thesis “Inferring causality between transcriptome and complex traits” for the degree of Doctor of Philosophy (in Computer Science). The defence will be held via Zoom.

Supervisors:
Sen. Res. Fellow Hedi Peterson, University of Tartu
Prof. Jaak Vilo, University of Tartu

Opponents:
Prof. Jack Bowden, University of Exeter (England)
Prof. Samuli Ripatti, University of Helsinki (Finland)

The “chicken and egg” dilemma in bioinformatics: do genes cause disease, or do diseases cause changes in the function of genes?

A prerequisite for understanding and curing disease is the identification of genes active in disease processes, since drugs could be developed to target the proteins encoded by such causal genes. The gold standard for discovering causal relationships between traits is provided by lab experiments and randomized clinical trials, but these can be time-consuming and expensive to undertake. In this dissertation, we show that functionally relevant genes in the development of diseases and other complex traits can be identified more effectively using statistical methods.

Causal statistical analysis in genetics has only recently gained momentum, propelled by the vast amounts of data collected by national biobanks. Because the field is new and its projected impact is large, the corresponding mathematical theory is still evolving rapidly. We devote considerable attention to introducing this theory systematically and then expand on it in practical applications.

We apply the principles of causal analysis to develop methodology for identifying causal genes in small samples (n ≈ 500), ascertaining the function of the inflammatory biomarker C-reactive protein in immune response. By utilizing domain knowledge, we create an algorithm, robust to the assumptions of causal models, for hypothesis-free identification of causal genes for arbitrary complex traits across the entire genome. Furthermore, we take an in-depth look at a specific disease-associated genomic region (16p11.2) and pinpoint genes responsible for reproductive health. In the context of personalized medicine, we study whether the causal genes differ between the sexes. Finally, we investigate whether the popular association studies between gene expression and complex traits identify causal genes, disease-induced changes in gene expression, or simply random noise. We validate our primary research results with lab experiments.