Inamul Hasan Madar, Wonyeop Lee, Xiaojing Wang, Seung-Ik Ko, Hokeun Kim, Dong-Gi Mun, Bing Zhang, Eunok Paek, Sang-Won Lee
Proteogenomics provides opportunities for proteomic validation of gene structures, genomic alterations, and the functional relevance of novel findings obtained from genomic data analysis. However, effective proteogenomic data integration requires extensive proteome profiling that approaches the gene coverage of genomic data. Here, we developed a multi-stage database search method for comprehensive proteomics data analysis to complement whole transcriptome sequencing data. The method uses two complementary database search engines, MS-GF+ and MODa/MODi, in tandem. The MS/MS data were first subjected to an MS-GF+ database search (1st stage search); the MS/MS data left unidentified by the 1st stage search were then analyzed with MODa and MODi in combination (2nd stage search), tools for blind and unrestrictive modification search, respectively. When combined with m …
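
The routing between the two search stages can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `multi_stage_search`, the `SearchFn` type, and the representation of a search engine as a callable that returns a peptide string or `None` are all assumptions made for illustration. No real MS-GF+, MODa, or MODi interface is invoked, and score thresholds and FDR control are omitted.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical stand-in for a search engine: maps a spectrum ID to a
# peptide identification, or None when the spectrum is not identified.
SearchFn = Callable[[str], Optional[str]]


def multi_stage_search(
    spectra: Iterable[str],
    first_stage: SearchFn,
    second_stage: SearchFn,
) -> Tuple[Dict[str, Tuple[int, str]], List[str]]:
    """Run stage 1 on all spectra, then stage 2 only on the leftovers.

    Returns (identified, unidentified), where identified maps each
    spectrum ID to (stage number, peptide).
    """
    identified: Dict[str, Tuple[int, str]] = {}
    leftovers: List[str] = []

    # 1st stage: every spectrum is searched (MS-GF+ in the text above).
    for spectrum_id in spectra:
        hit = first_stage(spectrum_id)
        if hit is not None:
            identified[spectrum_id] = (1, hit)
        else:
            leftovers.append(spectrum_id)

    # 2nd stage: only spectra unidentified in stage 1 are searched again
    # (the modification searches with MODa/MODi in the text above).
    unidentified: List[str] = []
    for spectrum_id in leftovers:
        hit = second_stage(spectrum_id)
        if hit is not None:
            identified[spectrum_id] = (2, hit)
        else:
            unidentified.append(spectrum_id)

    return identified, unidentified
```

In this sketch a spectrum identified in the 1st stage is never passed to the 2nd stage, which mirrors the description that only unidentified MS/MS data are re-analyzed.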