Volume 1 Number 2 (Jun. 2009)
IJCEE 2009 Vol.1 (2): 155-164 ISSN: 1793-8163
DOI: 10.7763/IJCEE.2009.V1.24

Gene Expression Analysis Using Clustering

Kumar Dhiraj and Santanu Kumar Rath

Abstract—Data mining has become an important topic in the effective analysis of gene expression data because of its wide application in the biomedical industry. In this paper, the k-means clustering algorithm is studied extensively for gene expression analysis. To demonstrate the effectiveness of the k-means algorithm on a wide variety of data sets, two pattern recognition data sets and thirteen microarray data sets, with both overlapping and non-overlapping class boundaries, were studied; the number of features/genes ranges from 4 to 7129 and the number of samples ranges from 32 to 683. The number of clusters ranges from two to eleven. For pattern recognition, we use the IRIS and WBCD data; for microarray data, we use serum data (Iyer et al.), yeast data (Cho et al.), leukemia data (Golub et al.), breast data (Golub et al.), lymphoma data (Alizadeh et al.), lung cancer data (Bhattacharjee et al.), and St. Jude leukemia data (Yeoh et al.). To identify common subtypes in independent disease data, four different types of breast data (Golub et al.) and four Diffuse Large B-cell Lymphoma (DLBCL) data sets were used. Clustering error rate (equivalently, clustering accuracy) is used as the evaluation metric to measure the performance of the k-means algorithm.
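As an illustration of the two ingredients named in the abstract, the following is a minimal sketch of k-means clustering and of a clustering error rate. It is not the authors' implementation. The function names kmeans and clustering_error_rate, the init_centroids parameter, and the six toy points are hypothetical and do not come from the paper or its data sets. The error rate is assumed here to be the fraction of samples left misassigned under the best one-to-one matching of clusters to known classes; the abstract does not state the exact definition used.

```python
import itertools
import math
import random


def kmeans(points, k, init_centroids=None, n_iter=100, seed=0):
    """Plain Lloyd-style k-means; returns one cluster index per point."""
    if init_centroids is None:
        # Assumption: random initialisation from k of the data points.
        centroids = random.Random(seed).sample(points, k)
    else:
        centroids = list(init_centroids)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return labels


def clustering_error_rate(true_labels, cluster_labels):
    """Fraction misassigned under the best one-to-one cluster-to-class map.

    Brute force over permutations, so only practical for a small number of
    clusters; assumes there are no more clusters than classes.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    best_correct = 0
    for perm in itertools.permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        correct = sum(
            mapping[c] == t for t, c in zip(true_labels, cluster_labels)
        )
        best_correct = max(best_correct, correct)
    return 1 - best_correct / len(true_labels)


# Toy data invented for this sketch: two well-separated groups in 2-D.
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
              (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
toy_classes = ["a", "a", "a", "b", "b", "b"]
toy_clusters = kmeans(toy_points, k=2,
                      init_centroids=[(1.0, 1.0), (5.0, 5.0)])
print(clustering_error_rate(toy_classes, toy_clusters))
```

Because cluster indices are arbitrary, the error rate is computed only after matching clusters to classes; clustering accuracy is then one minus this value.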

Index Terms—Bio-informatics, Cancer-Genomics, Clustering, Cluster validation, Data-mining, Gene-expression, K-means algorithm, Microarray.

Kumar Dhiraj is with the Department of Computer Science and Engineering, National Institute of Technology Rourkela, Orissa, 769008, India. He is a research scholar working in the area of data mining and bioinformatics (phone: +919853388520; fax: 0661-2464356).
Santanu Kumar Rath is with the Department of Computer Science and Engineering, National Institute of Technology Rourkela, Orissa, 769008, India. He is a senior professor at NIT Rourkela, India.

Cite: Kumar Dhiraj and Santanu Kumar Rath, "Gene Expression Analysis Using Clustering," International Journal of Computer and Electrical Engineering, vol. 1, no. 2, pp. 155-164, 2009.
