Bioinformatics

Greehey CCRI’s Computational Biology and Bioinformatics (CBBI) team, led by Dr. Yidong Chen, is designated to support the Genome Sequencing Facility.

Yidong Chen, Ph.D.
Office: GCCRI 4.100.06
Phone: (210) 562-9163
Fax: (210) 562-9135
Email: cheny8@uthscsa.edu

Our bioinformatics services are highly customizable, so we will work with you to analyze your NGS data and produce the reports you are looking for.

Bioinformatics in NGS data analysis includes two major areas:

NGS data quality assurance and initial genome alignment
Customized NGS Analysis

NGS data quality assurance and initial genome alignment:

We have developed extensive tools to monitor sequence quality and accuracy. Every sequencing run performed by the GSF undergoes a quality control evaluation, delivered as a report that reviews read output and overall quality metrics, including the Q30 score, percentage of undetermined reads, FastQC results, duplicate rate, and mappable rate, among others. These mechanisms allow the GSF to maintain the highest level of sequence quality, which simplifies subsequent analyses.
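As a rough illustration of two of the metrics named above, the sketch below computes a Q30 fraction and a duplicate rate from reads already held in memory. It is not the GSF's pipeline: the function names are hypothetical, quality strings are assumed to use Phred+33 encoding, and a duplicate is assumed to mean an exact copy of another read's sequence.

```python
def q30_fraction(quality_strings, offset=33):
    """Fraction of base calls with Phred quality >= 30.

    Assumes FASTQ quality strings in Phred+33 encoding (an assumption,
    not something stated by the facility).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= 30:
                passing += 1
    return passing / total if total else 0.0


def duplicate_rate(sequences):
    """Fraction of reads that are exact copies of another read.

    Simplification: real tools usually mark duplicates by alignment
    position rather than by raw sequence identity.
    """
    seqs = list(sequences)
    if not seqs:
        return 0.0
    return 1 - len(set(seqs)) / len(seqs)


if __name__ == "__main__":
    print(q30_fraction(["IIII", "II##"]))
    print(duplicate_rate(["ACGT", "ACGT", "TTTT", "GGGG"]))
```

Both functions return a value between 0 and 1; multiply by 100 to report a percentage.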

Customized NGS Analysis:

CBBI can analyze almost all types of sequencing data generated by the Illumina HiSeq 2000 platform. These NGS data include ChIP-Seq, mRNA-Seq, small RNA-Seq, MBDCap-Seq, and exome-cap-Seq. The following are examples of our bioinformatics capabilities for common NGS applications:

RNA-Seq:

CBBI’s RNA-Seq services include counts for all known mRNAs, differential expression analysis, heatmaps, and other standard RNA-Seq processing. We can also provide intron-exon junction sites, non-coding RNA counts, SNPs within transcripts, and other analyses. The following files will be provided with your whole-transcriptome results:

Alignment report (total mappable reads, etc.)
Alignment results (.BAM file; optional .SAM file)
Counts file containing the number of reads matching annotated genes
Differential expression report (optional)
Functional analysis (optional)
Non-coding counts report (optional)
SNP report (optional)
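
To give a sense of what can be done with a counts file, the sketch below normalizes per-gene read counts to counts per million (CPM) and computes a log2 fold change between two samples. It is only an illustration under stated assumptions: the function names, the gene names, and the pseudocount are hypothetical, and an actual differential expression analysis relies on statistical models with replicates rather than a simple ratio.

```python
import math


def counts_per_million(counts):
    """Scale raw per-gene read counts to counts per million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated CPM / control CPM) for genes present in both samples.

    The pseudocount (an illustrative choice) avoids division by zero
    for genes with no reads.
    """
    cpm_c = counts_per_million(control)
    cpm_t = counts_per_million(treated)
    return {
        gene: math.log2((cpm_t[gene] + pseudocount) / (cpm_c[gene] + pseudocount))
        for gene in control
        if gene in treated
    }


if __name__ == "__main__":
    # Hypothetical counts for two genes in two samples.
    control = {"geneA": 100, "geneB": 100}
    treated = {"geneA": 300, "geneB": 100}
    for gene, lfc in log2_fold_change(control, treated).items():
        print(gene, round(lfc, 3))
```

A positive value means the gene has a larger share of reads in the treated sample; a negative value means a smaller share.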

ChIP-Seq:

For ChIP-Seq data, in addition to aligning the sequences to the reference genome (using the Burrows-Wheeler Aligner, or BWA), the CBBI will further analyze your data using tools such as Model-based Analysis of ChIP-Seq (MACS) to identify binding sites within the genome. Users can load the results into the UCSC Genome Browser or IGV to view regions in the context of the genome. We will also help users run motif identification software, such as the MEME motif-based sequence analysis suite, to discover common binding motifs. Example files provided with your ChIP-Seq results are:

Alignment file (.SAM or .BAM file)
Peaks file (in .BED format)
Peak annotation file
Binding peak characteristics (percent in promoter regions, intronic regions, intergenic regions).
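
The binding peak characteristics listed above amount to a percentage breakdown of peaks by genomic region. A minimal sketch of that summary follows, assuming each peak has already been assigned a single region label by an annotation step; the function name and the labels are hypothetical and do not reflect the CBBI's actual report format.

```python
from collections import Counter


def region_percentages(labels):
    """Percentage of peaks falling in each annotated region category."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {region: 100.0 * n / total for region, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical region labels, one per peak.
    peak_labels = ["promoter", "intron", "intergenic", "promoter"]
    for region, pct in sorted(region_percentages(peak_labels).items()):
        print(f"{region}: {pct:.1f}%")
```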