Integration of Bioinformatics Data Using Deep Learning
Tuesday, September 15, 2020 — 2:45PM - 3:30PM
Much data exists that could answer certain research questions, but it takes different forms, or modalities, and originates from different sources, which renders it unusable in its present state. The data therefore needs to be integrated. We are working on long non-coding ribonucleic acids (lncRNAs), which are important because they are implicated in many diseases, such as Alzheimer’s disease (AD). Our research is motivated by the fact that wet-lab work has shed light on lncRNAs, but such work, while highly accurate, is expensive and time-consuming. A computational model such as ours would be faster and less costly. We aim to discover how lncRNA patterns differ between individuals with and without AD. We also aim to integrate different data modalities related to AD, including lncRNA data and messenger RNA (mRNA) sequencing data, the latter obtained by single-cell RNA sequencing. Thereafter, given a lncRNA pattern from an individual, we aim to predict whether or not AD is present. To accomplish this, we propose a convolutional neural network (CNN) deep learning model whose input will be labeled and unlabeled data from individuals with and without AD and whose output will be this prediction.
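As a rough illustration of the kind of model described, the sketch below runs one forward pass of a tiny one-dimensional CNN (convolution, ReLU, global max pooling, logistic output) over a single expression profile. It is a minimal sketch under stated assumptions, not the authors' model: the abstract does not specify the architecture, the input encoding, or how unlabeled data would be used, so the function names, the layer choices, the profile length, and the random untrained weights are all hypothetical.

```python
import math
import random


def conv1d(signal, kernel, bias=0.0):
    """Valid (no padding) 1D cross-correlation of a signal with one kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(x):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_ad_probability(expression, kernels, out_weights, out_bias):
    """Forward pass: conv -> ReLU -> global max pool -> logistic output."""
    # One pooled feature per convolutional kernel.
    pooled = [max(relu(conv1d(expression, k))) for k in kernels]
    logit = sum(w * p for w, p in zip(out_weights, pooled)) + out_bias
    return sigmoid(logit)


# Hypothetical stand-in data: a random vector in place of one individual's
# lncRNA expression profile, and random (untrained) weights. The resulting
# probability is therefore meaningless; only the data flow is illustrated.
rng = random.Random(0)
profile = [rng.random() for _ in range(32)]
kernels = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(4)]
out_weights = [rng.uniform(-1, 1) for _ in range(4)]

prob = predict_ad_probability(profile, kernels, out_weights, out_bias=0.0)
print(f"P(AD) = {prob:.3f}")
```

In practice such a model would be trained on labeled profiles with a deep learning framework, and integrating mRNA data would require an additional input channel or branch; neither step is shown here.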