Event Detail

Event Type: Mathematical Biology Seminar
Date/Time: Wednesday, June 5, 2019 - 12:00 to 12:50
Location: Kidder 237

Speaker Info

Institution: Oregon State University, Microbiology
Abstract: 

Large-scale microbiome studies that seek to understand microbial roles and disease-inducing dysbiosis base their findings on differences between microbial community structures in contrasting environments (e.g., healthy vs. diseased tissues). Such efforts are impeded by the curse of dimensionality, whereby biological effects are obscured by large numbers of confounding variables (the small-n, large-p problem). Here, we use NLP-inspired embedding algorithms and large public microbiome sequencing datasets to reduce the dimensionality of query datasets in a biologically informed way. We show that a random forest model trained on the new feature space produces more accurate and robust predictions of inflammatory bowel disease (IBD) than models trained on state-of-the-art data transformations. Additionally, we find strong correlations between dimensions in the reduced feature space and annotated metabolic pathways.