Multiclass Sequential Feature Selection and Classification Method for Gene Expression Data

  • W.B. Yahya
  • G.T. Aremu
  • M.K. Garba
Keywords: Support Vector Machines, Microarray, Misclassification Error Rate, Correct Classification Rate


A multiclass sequential feature selection and classification (mk-SS) method was examined using gene expression signatures. Classification efficiency was assessed by both the misclassification error rate and the correct classification rate under 10-fold cross-validation to ensure stable estimates. The performance of mk-SS was compared with that of Support Vector Machines (SVM) on five multiclass microarray data sets. The mk-SS method efficiently selected the informative gene biomarkers needed to classify the biological groups of the tissue samples, and it competed favourably with SVM in prediction accuracy, outperforming SVM in 80% of the cases considered. The quality of the features selected by the mk-SS algorithm was validated by embedding its feature selection mechanism in the standard SVM algorithm to form the SVM-kSS classifier. SVM-kSS predicted better than SVM, which confirms the quality of the features selected by the mk-SS method, but the mk-SS classifier predicted better than SVM-kSS. Therefore, mk-SS classification of leukemia and of breast, lung, ovarian, colon, prostate, renal and other cancers using gene expression profiles is feasible, especially when the endpoint has multiple categories.
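
As an illustration of the evaluation criteria named above, the following minimal Python sketch estimates the correct classification rate (CCR) and the misclassification error rate (MER = 1 - CCR) under 10-fold cross-validation. It rests on several assumptions: the nearest-centroid classifier is only a placeholder and is not the mk-SS or SVM algorithm, the three-class data are synthetic, and all function names are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_centroid_fit(X, y):
    """Placeholder classifier (an assumption, not mk-SS): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def nearest_centroid_predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[c]))
    return min(centroids, key=dist2)


def cross_validated_rates(X, y, k=10, seed=0):
    """Return (CCR, MER) pooled over the k held-out folds."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(nearest_centroid_predict(model, X[i]) == y[i] for i in test_idx)
    ccr = correct / len(X)
    return ccr, 1.0 - ccr


def make_toy_data(n_per_class=30, n_features=5, seed=1):
    """Synthetic three-class data standing in for expression profiles."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 * label, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


X, y = make_toy_data()
ccr, mer = cross_validated_rates(X, y, k=10)
print(f"CCR = {ccr:.3f}, MER = {mer:.3f}")
```

Pooling the correct predictions over all held-out folds, rather than averaging per-fold rates, is one design choice among several; the two agree when the folds have equal size.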