Chifeng Ma, Konduru S. Sastry, Mario Flore, Salah Gehani, Issam Al-Bozom, Yusheng Feng, Erchin Serpedin, Lotfi Chouchane, Yidong Chen & Yufei Huang
Abstract
Background
We considered the prediction of cancer classes (e.g., subtypes) from patient gene expression profiles that contain both systematic and condition-specific biases relative to the training reference dataset. Conventional normalization-based approaches cannot guarantee that the gene signatures in the reference and prediction datasets have the same distribution under all conditions, because the class-specific gene signatures change with the condition. As a result, a trained classifier may work well under one condition but not under another.

