Cancers MDPI: Transformer for Gene Expression Modeling (T-GEM): An Interpretable Deep Learning Model for Gene Expression-Based Phenotype Predictions (Chen Lab)

Ting-He Zhang, Md Musaddaqul Hasib, Yu-Chiao Chiu, Zhi-Feng Han, Yu-Fang Jin, Mario Flores, Yidong Chen and Yufei Huang

Simple Summary

Cancer is the second leading cause of death worldwide. Predicting phenotypes and understanding the markers that define them are important tasks. We propose an interpretable deep learning model called T-GEM that can predict cancer-related phenotypes and reveal phenotype-related biological functions and marker genes. We demonstrated the capability of T-GEM on cancer-type prediction using TCGA data and immune cell type identification using scRNA-seq data. The code and detailed documents are provided to facilitate easy implementation of the model in other studies.

Abstract

Deep learning (DL) has been applied in precision oncology to address a variety of gene expression-based phenotype predictions. However, the unique characteristics of gene expression data challenge the computer vision-inspired design of popular DL models such as convolutional neural networks (CNNs) and call for interpretable DL models tailored to transcriptomics studies. To address these challenges, we propose a novel interpretable deep learning architecture called T-GEM, or Transformer for Gene Expression Modeling. We describe the T-GEM model for modeling gene-gene interactions in detail and demonstrate its utility for gene expression-based predictions of cancer-related phenotypes, including cancer type prediction and immune cell type classification. We carefully analyzed the learning mechanism of T-GEM and showed that the first layer has broader attention while higher layers focus more on phenotype-related genes. We also showed that T-GEM’s self-attention could capture important biological functions associated with the predicted phenotypes. We further devised a method to extract the regulatory network that T-GEM learns by exploiting the attributions of self-attention weights for classifications. We also showed that the hub genes of this network were likely markers for the predicted phenotypes.
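
To make the idea of self-attention over genes more concrete, here is a minimal NumPy sketch. It is not the T-GEM implementation: it assumes each gene is treated as one token, that a gene's scalar expression value is scaled by hypothetical per-gene projection weights (w_q, w_k, w_v), and it uses standard scaled dot-product attention. Ranking genes by the total attention they receive is only a rough stand-in for the attribution-based network extraction described above.

```python
import numpy as np

def gene_self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over genes.

    x             : (n_genes,) expression values, one token per gene
    w_q, w_k, w_v : (n_genes, d) per-gene projection weights (hypothetical)
    Returns the attended values (n_genes, d) and the attention matrix
    (n_genes, n_genes), where row i says how much gene i attends to each gene.
    """
    q = x[:, None] * w_q                       # queries, (n_genes, d)
    k = x[:, None] * w_k                       # keys,    (n_genes, d)
    v = x[:, None] * w_v                       # values,  (n_genes, d)
    scores = q @ k.T / np.sqrt(q.shape[1])     # pairwise gene-gene scores
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)         # row-wise softmax
    return attn @ v, attn

# Toy data: random expression values and random (untrained) weights.
rng = np.random.default_rng(0)
n_genes, d = 6, 4
x = rng.normal(size=n_genes)
w_q, w_k, w_v = (rng.normal(size=(n_genes, d)) for _ in range(3))

out, attn = gene_self_attention(x, w_q, w_k, w_v)

# Crude "hub" ranking: genes ordered by the total attention they receive.
received = attn.sum(axis=0)
hub_order = np.argsort(received)[::-1]
```

With trained weights, the rows of attn could be inspected layer by layer to see whether attention is broad or concentrated on a few genes; with the random weights used here, the ranking carries no biological meaning.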

_________________________________________________________

Since 2004, the mission of the Greehey Children’s Cancer Research Institute (Greehey CCRI) at UT Health San Antonio has been to advance scientific knowledge relevant to childhood cancer, contribute to understanding its causes, and accelerate the translation of knowledge into novel therapies. Greehey CCRI strives to have a national and global impact on childhood cancer by discovering, developing, and disseminating new scientific knowledge. Our mission consists of three key areas: research, clinical, and education.

Stay connected with the Greehey CCRI on Facebook, Twitter, LinkedIn, and Instagram.
