Prediction Model for Gastric Cancer via Class Balancing Techniques


Danish Jamil, Sellappan Palaniappan, Sanjoy Kumar Debnath, Muhammad Naseem, Susama Bagchi, and Asiah Lokman


Vol. 23  No. 1  pp. 53-63


Many researchers are working to reduce the incidence of cancers, particularly Gastric Cancer (GC). For GC, the five-year survival rate is generally 5-25%, but for Early Gastric Cancer (EGC) it is almost 90%. Predicting the onset of stomach cancer from risk factors allows for earlier diagnosis and more effective treatment. Although several models for predicting stomach cancer exist, most are built on unbalanced datasets, which biases them towards the majority class. It is nevertheless imperative to correctly identify cancer patients, who form the minority class. This research applies three class-balancing approaches to the NHS dataset before developing supervised learning strategies: Oversampling (Synthetic Minority Oversampling Technique, or SMOTE), Undersampling (SpreadSubsample), and a Hybrid System (SMOTE + SpreadSubsample). The study uses Naive Bayes, Bayesian Network, Random Forest, and Decision Tree (C4.5) classifiers. We measured the classifiers' efficacy using their Receiver Operating Characteristic (ROC) curves, sensitivity, and specificity. The balancing approaches were compared on the validation data, and the final prediction model was built on the approach that performed best overall.
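The hybrid balancing idea and two of the reported metrics can be illustrated with a minimal sketch in plain Python. It is not the authors' pipeline. The function names, parameters, and toy data are hypothetical, the oversampler is a simplified SMOTE-style interpolation, and the undersampler is a plain random subsample capped at a chosen majority-to-minority ratio. It does not reproduce the Weka filter named SpreadSubsample.

```python
import random

def smote_like(minority, n_new, k=3, rng=None):
    """Simplified SMOTE-style oversampling (illustrative, not a library API).
    Each synthetic point lies on the segment between a random minority point
    and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the remaining minority points by squared Euclidean distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

def undersample(majority, max_size, rng=None):
    """Randomly keep at most max_size majority points."""
    rng = rng or random.Random(0)
    if len(majority) <= max_size:
        return list(majority)
    return rng.sample(majority, max_size)

def hybrid_balance(majority, minority, oversample_ratio=2.0, max_spread=1.0, rng=None):
    """Oversample the minority class, then cap the majority class at
    max_spread times the new minority size (both ratios are assumptions)."""
    rng = rng or random.Random(0)
    n_new = int(len(minority) * (oversample_ratio - 1))
    new_minority = list(minority) + smote_like(minority, n_new, rng=rng)
    cap = int(len(new_minority) * max_spread)
    return undersample(majority, cap, rng=rng), new_minority

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Label 1 marks the positive (cancer) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy two-feature data: 20 majority points and 4 minority points (made up).
toy_majority = [(float(i), float(i % 5)) for i in range(20)]
toy_minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (2.5, 2.5)]
balanced_majority, balanced_minority = hybrid_balance(toy_majority, toy_minority)
sens, spec = sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

With these toy settings the minority class is doubled from 4 to 8 points and the majority class is reduced from 20 to 8, so a classifier trained afterwards sees both classes equally often. Sensitivity is the metric that shows whether the minority (cancer) cases are actually being detected.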


Gastric Cancer, Imbalanced dataset, Synthetic Minority Oversampling Technique, Receiver Operating Characteristic (ROC)