Title | : | A method for non-linear causal discovery |
Speaker | : | Aravind Easwar (IITM) |
Details | : | Thu, 27 Jul, 2023 4:00 PM @ MR - I (SSB 233) |
Abstract | : | Testing whether two correlated variables are causally related is a fundamental problem in many sciences, including the biological sciences. Addressing this problem requires separating causality from confounding, either by using data from interventions (e.g., randomized controlled trials) or by applying mediation tests to data observed in the absence of interventions. Statistical tests of mediation or conditional independence within the Mendelian Randomization (MR) framework allow us to infer causal relations between two variables that are each associated with a third, instrumental variable (e.g., two gene expression or clinical traits A and B associated with a genetic variant L, with all variables observed in the same population). Most existing MR methods determine the causal direction and effect assuming a linear relationship between the traits. We propose NLCD, a method for Non-Linear Causal Discovery that extends a linear causal discovery technique. NLCD is based on non-linear regression modeling and conditional feature importance scores. We show that NLCD also handles scenarios where the variance of a trait is not equal across genotype values. Compared with a baseline linear causal discovery method on simulated data, NLCD performs significantly better for traits with non-linear relations across varying sample sizes, and also in unequal-variance cases. We also perform a comparative evaluation of NLCD using yeast genomic data. Applying NLCD to human genomic data (specifically, Genotype-Tissue Expression (GTEx) skeletal muscle tissue data), we discovered a human muscle gene network involving transcription factors and pseudogenes, supported by current literature. |
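
The conditional-independence idea mentioned in the abstract can be illustrated with a small simulation. The sketch below is not NLCD itself: it swaps in a plain quadratic regression and Pearson correlations in place of the non-linear models and conditional feature importance scores the talk describes. The causal chain L -> A -> B, the quadratic link, the noise levels, and all function names (`simulate`, `residualize`, `pearson`) are assumptions made only for illustration. If A fully mediates the effect of L on B, then B should carry no remaining association with L once the dependence of B on A is removed; a purely linear adjustment may leave a spurious residual association when the true relation is non-linear.

```python
import random

def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residualize(y, x):
    """Residuals of y after a least-squares fit on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def simulate(n=5000, seed=0):
    """Hypothetical data: genotype L -> trait A -> trait B (quadratic link)."""
    rng = random.Random(seed)
    # Genotype coded 0/1/2 as the count of minor alleles (assumed frequency 0.3).
    L = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]
    A = [l + rng.gauss(0, 0.5) for l in L]
    B = [a * a + rng.gauss(0, 0.5) for a in A]
    return L, A, B

L, A, B = simulate()

# Both traits are marginally associated with the instrument L.
r_LA = pearson(L, A)
r_LB = pearson(L, B)

# Linear adjustment: remove only the straight-line effect of A from B.
B_lin = residualize(B, A)

# Quadratic adjustment: additionally remove the part of A^2 not explained
# by A, which yields the residuals of regressing B on (1, A, A^2).
A_sq = residualize([a * a for a in A], A)
B_quad = residualize(B_lin, A_sq)

r_lin = pearson(L, B_lin)    # association left after linear adjustment
r_quad = pearson(L, B_quad)  # association left after quadratic adjustment

print(f"corr(L, A)               = {r_LA:.3f}")
print(f"corr(L, B)               = {r_LB:.3f}")
print(f"corr(L, B | A linear)    = {r_lin:.3f}")
print(f"corr(L, B | A quadratic) = {r_quad:.3f}")
```

In this toy setting the quadratic adjustment should leave B essentially unassociated with L, consistent with A mediating the effect, while the linear adjustment should leave a visible residual association. A real analysis would replace the correlations with a formal test (for example, a permutation test) and would also check the reverse direction B -> A.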