ISSN 2063-5346

Optimized Diagnosis of Central Nervous System (CNS) Cancer using Gene Expression Microarray & Machine Learning (ML) Methods

Deepak Painuli*1, Suyash Bhardwaj2, Utku Kose3
doi: 10.48047/ecb/2023.12.10.690

Abstract

Central nervous system (CNS) cancer is among the top 11 causes of cancer-related fatalities worldwide. Classification and early diagnosis of the various tumor forms are crucial in CNS cancer analysis to reduce patient mortality. Conventional diagnostic methods can suffer from a high misdiagnosis rate because of the inter- and intra-observer variability introduced by human intervention during the diagnostic process. The higher efficiency and lower error rate of machine learning (ML) methods on complex, high-dimensional data make them a suitable choice for gene expression based diagnosis of CNS cancer. This study's primary goal is to demonstrate CNS cancer diagnosis from gene expression data using various ML models, combined with two efficient feature selection (FS) methods, recursive feature elimination (RFE) and maximum relevance - minimum redundancy (mRMR), and with model optimization by grid search. The study investigates 12 ML models, i.e., Logistic Regression (LR), SVM (Linear/RBF), K-nearest Neighbor (KNN), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), Extra Tree (ET), Gradient Boost (GbBoost), Extreme Gradient Boost (XgBoost), Adaptive Boost (AdaBoost) and Multi-layer Perceptron (MLP), to identify the model that most accurately classifies CNS cancer subjects on a CNS cancer gene expression dataset. Experimental results and a comparison with previously published research show that the LR-based model outperformed all other applied models, with a classification accuracy of 99.6%, precision of 0.99, recall of 0.98, F1-score of 0.99 and AUC score of 1.0 on the RFE-100 feature subset; this was observed to be the highest among the ML models applied to similar gene expression datasets in the recent past.
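
The RFE, logistic regression and grid search combination named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn, uses a synthetic matrix from `make_classification` in place of the CNS gene expression dataset, and the grid over `C`, the fold count and the RFE step size are hypothetical choices made only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a microarray matrix (few samples, many features).
# This is NOT the CNS dataset used in the study.
X, y = make_classification(
    n_samples=60, n_features=500, n_informative=20, random_state=0
)

# RFE sits inside the pipeline so features are re-selected on each
# training fold, which keeps the held-out fold out of feature selection.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(LogisticRegression(max_iter=1000),
                n_features_to_select=100,  # analogous to an "RFE-100" subset
                step=0.1)),                # drop 10% of the features per round
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid: only the regularization strength of the final classifier.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

n_selected = int(search.best_estimator_.named_steps["rfe"].support_.sum())
print("best C:", search.best_params_["clf__C"])
print("selected features:", n_selected)
print("mean CV accuracy:", round(search.best_score_, 3))
```

Scores obtained on the synthetic data say nothing about the results reported above; the sketch only shows how the three steps fit together.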
