Interdisciplinary Bio Central
 
Full Report (Bioinformatics/Computational biology/Molecular modeling)

Improved Statistical Testing of Two-Class Microarrays with a Robust Statistical Approach
Hee-Seok Oh1, Dongik Jang1, Seungyoon Oh2 and Heebal Kim2,3,*
1Department of Statistics, Seoul National University, Seoul 151-742, Korea
2Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 151-742, Korea
3Department of Agricultural Biotechnology, Seoul National University, Seoul 151-742, Korea
*Corresponding author
  Received : March 17, 2010
  Revised : April 30, 2010
  Accepted : May 31, 2010
  Published : June 01, 2010
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Synopsis

The most common type of microarray experiment has a simple design that compares data obtained from two different groups or conditions. A typical method for identifying differentially expressed genes (DEGs) between two conditions is the conventional Student's t-test. The t-test estimates the population variance for a gene simply from the sample variance of its expression levels. The empirical Bayes approach improves on the t-statistic by not giving a high rank to genes merely because they have a small sample variance, but it rests on the same basic assumption as the ordinary t-test, namely the equality of variances across experimental groups. Both the t-test and the empirical Bayes approach can suffer from low statistical power because they assume normal and unimodal distributions for the microarray data. We propose a method that addresses these problems and is robust to outliers and skewed data, while maintaining the advantages of the classical t-test and modified t-statistics. The method transforms the data to fit the normality assumption, which increases the statistical power of these statistics for identifying DEGs.
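
As a minimal sketch of the conventional statistic described in the synopsis (not the proposed robust method), the equal-variance two-sample t-statistic for one gene can be computed from the two group means and a pooled sample variance. The function name `pooled_t_statistic`, the gene labels, and the expression values below are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Equal-variance (Student) two-sample t-statistic for a single gene."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled estimate of the common variance, built from the two sample variances.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical log-expression values: gene -> (condition 1, condition 2).
expression = {
    "gene_1": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
    "gene_2": ([5.0, 5.1, 4.9], [5.0, 5.2, 4.8]),
}

# One t-statistic per gene, then rank genes by absolute t-statistic.
t_stats = {gene: pooled_t_statistic(a, b) for gene, (a, b) in expression.items()}
ranked = sorted(t_stats, key=lambda gene: abs(t_stats[gene]), reverse=True)
```

Because the sample variance sits in the denominator, a gene with a very small sample variance can receive a large statistic even when its mean difference is small, which is the ranking problem that the empirical Bayes modification is meant to reduce.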

Keywords: microarray, t-test, empirical Bayes, pseudo data
IBC   ISSN : 2005-8543