Interdisciplinary Bio Central
Etc. (Bioinformatics/Computational biology/Molecular modeling)

Statistical Analysis for the Feature Subset Selection in a cDNA Microarray using Hotelling's T square Statistic
Inyoung Kim1, Sunho Lee2, Sang-Cheol Kim3 and Byung Soo Kim3,*
1Department of Epidemiology and Public Health, Yale University, New Haven, CT 06520-8034, USA
2Dept. of Applied Mathematics, Sejong University, Seoul, 143-747, Korea
3Dept. of Applied Statistics, Yonsei University, Seoul, 120-749, Korea
*Corresponding author
Published: August 31, 2006
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

In this paper we propose using Hotelling's T² statistic to detect a set of differentially expressed genes in colorectal cancer from microarray-based gene expression levels of tumors compared with normal tissues. We also evaluate the predictivity of the selected set, which allows us to rank genes for the development of biomarkers for population screening of colorectal cancer. We compared the prediction rate obtained from the differentially expressed genes selected by Hotelling's T² statistic with the rates obtained from genes selected by several univariate statistics, including the t statistic. We used various classification methods, including regularized discriminant analysis and a support vector machine. The results show that Hotelling's T² statistic performs better than the univariate statistics in terms of the prediction rate. This implies that it may not be sufficient to examine each gene in isolation, and that evaluating the correlation among genes reveals information that would not otherwise be discovered.
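
To make the contrast between a multivariate and a univariate statistic concrete, the following is a minimal sketch of a two-sample Hotelling's T² computed on a candidate gene subset. It is not taken from the paper. It assumes unpaired tumor and normal samples and a pooled covariance matrix, whereas the study design may call for a paired version. The function name `hotelling_t2` and the simulated data are hypothetical.

```python
import numpy as np


def hotelling_t2(x, y):
    """Two-sample Hotelling's T^2 with pooled covariance (illustrative sketch).

    x: array of shape (n1, p), expression of p genes in group 1 (e.g. tumor)
    y: array of shape (n2, p), expression of the same p genes in group 2
    Requires n1 + n2 - 2 >= p so that the pooled covariance is invertible.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]

    # Difference of the group mean vectors.
    diff = x.mean(axis=0) - y.mean(axis=0)

    # Pooled sample covariance; atleast_2d handles the single-gene case.
    s_x = np.atleast_2d(np.cov(x, rowvar=False))
    s_y = np.atleast_2d(np.cov(y, rowvar=False))
    s_pooled = ((n1 - 1) * s_x + (n2 - 1) * s_y) / (n1 + n2 - 2)

    # T^2 = (n1 n2 / (n1 + n2)) * diff' S^{-1} diff
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.solve(s_pooled, diff))


# Simulated example (hypothetical data): two strongly correlated genes,
# each shifted only slightly, and in opposite directions, between groups.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
normal = rng.multivariate_normal([0.0, 0.0], cov, size=30)
tumor = rng.multivariate_normal([0.4, -0.4], cov, size=30)

t2_pair = hotelling_t2(tumor, normal)
t2_gene1 = hotelling_t2(tumor[:, :1], normal[:, :1])  # equals the squared t statistic
t2_gene2 = hotelling_t2(tumor[:, 1:], normal[:, 1:])
print(f"T^2 for the pair: {t2_pair:.2f}")
print(f"T^2 for gene 1 alone: {t2_gene1:.2f}; gene 2 alone: {t2_gene2:.2f}")
```

With a single gene, this statistic reduces to the square of the pooled two-sample t statistic. With several genes, the inverse covariance term lets the correlation among genes contribute to the score, which is the property the abstract appeals to.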

Keywords: cDNA Microarray, Feature Subset Selection
IBC   ISSN: 2005-8543