iScience: Predicting and interpreting protein and phosphoprotein abundance from pan-cancer and single-cell transcriptomes (Chen Lab)

Hui-Mei Tsai^1,2,3,12 ∙ Tzu-Hung Hsiao^2,4,5,12 ∙ Yu-Chiao Chiu^6,7 ∙ Yufei Huang^6,7 ∙ Eric Y. Chuang^1,8,9,10 chuangey@ntu.edu.tw ∙ Yidong Chen^3,11,13 cheny8@uthscsa.edu

Highlights

• DeepGxP provides a framework to translate transcriptomes into proteomic insight

• DeepEnrich links RNA predictors to functional protein pathways and activities

• DeepEnrich reveals cancer type-specific EGFR and HER2 phosphorylation patterns

• Mutation effects are captured even without mutation data in model training

Summary

Proteins that impact phenotype and disease are often inferred from RNA expression, which poorly reflects protein abundance. We developed DeepGxP, a deep learning model trained on The Cancer Genome Atlas pan-cancer data, to predict protein abundance from transcriptomic profiles. DeepGxP outperformed conventional models, achieving a median Pearson’s correlation of 0.68 (n = 187) and predictive performance of 0.74 and 0.64 for proteins with high (≥0.31) and low (<0.31) self-gene/protein correlation, respectively. We also developed DeepEnrich, an integrated gradient-based interpretation framework that identifies predictor genes and enriched functions. For example, predictors of cyclin B1 and E2 are enriched in mitotic chromatid segregation and G2/M transition, respectively. In lung adenocarcinoma, we uncovered distinct EGFR/HER2 phosphorylation patterns in alveolar cells. In breast cancer, p53 protein, but not TP53 mRNA, correlated with survival. DeepGxP also accurately predicted the abundance of single-cell surface proteins, confirming cell identification. Our findings underscore DeepGxP’s potential in decoding gene-to-protein relationships for cancer biomarker discovery.
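The performance figures in the summary (a median Pearson's correlation across proteins, then separate values for proteins with high and low self-gene/protein correlation) can be illustrated with a small sketch. This is not the authors' evaluation code. The function names, the synthetic data, and the assumption that the stratified values are medians are all illustrative. Only the 0.31 threshold comes from the summary.

```python
import numpy as np

def per_protein_pearson(measured, predicted):
    """Pearson correlation per protein (column), computed across samples (rows)."""
    m = measured - measured.mean(axis=0)
    p = predicted - predicted.mean(axis=0)
    denom = np.sqrt((m ** 2).sum(axis=0) * (p ** 2).sum(axis=0))
    return (m * p).sum(axis=0) / denom

def stratified_medians(pred_r, self_r, threshold=0.31):
    """Median prediction correlation overall and for proteins whose own
    mRNA/protein correlation is >= threshold (high) or < threshold (low)."""
    high = self_r >= threshold
    return {
        "all": float(np.median(pred_r)),
        "high_self": float(np.median(pred_r[high])),
        "low_self": float(np.median(pred_r[~high])),
    }

# Synthetic stand-in data (hypothetical): rows are samples, columns are proteins.
rng = np.random.default_rng(0)
n_samples, n_proteins = 200, 6
measured = rng.normal(size=(n_samples, n_proteins))

# Hypothetical model output: measured abundance plus noise.
predicted = measured + rng.normal(scale=0.8, size=measured.shape)

# Hypothetical mRNA of each protein's own gene, with noise growing per protein
# so that some proteins have low self-gene/protein correlation.
noise_scale = np.linspace(0.5, 5.0, n_proteins)
self_mrna = measured + rng.normal(scale=noise_scale, size=measured.shape)

pred_r = per_protein_pearson(measured, predicted)
self_r = per_protein_pearson(measured, self_mrna)
print(stratified_medians(pred_r, self_r))
```

Splitting by self-gene/protein correlation separates proteins whose abundance roughly tracks their own mRNA from those where a model has to rely on other genes, which is the harder case the summary highlights.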


Since 2004, the mission of the Greehey Children’s Cancer Research Institute (Greehey CCRI) at UT Health San Antonio has been to advance scientific knowledge relevant to childhood cancer, contribute to the understanding of its causes, and accelerate the translation of knowledge into novel therapies. Through the discovery, development, and dissemination of new scientific knowledge, Greehey CCRI strives to have a national and global impact on childhood cancer. Our mission consists of three key areas: research, clinical care, and education.
