Lung Cancer stage classification using random forest and artificial neural network

Date
2019
Authors
Stanley T. Dalagan
Abstract
Lung cancer is considered the most common and the deadliest cancer type. The survival of lung cancer patients is affected by a variety of factors, including lung cancer stage, which measures the spread of cancer within the body. To analyze the occurrence of the four lung cancer stages, Random Forest (RF) and Artificial Neural Network (ANN) were used to develop classification models. The predictor variables were age at diagnosis, average number of cigarettes consumed per day, number of years of smoking, primary diagnosis, site of origin, gender, viral status and year of birth. Both methods can be used to predict values of an ordinal dependent variable. Based on model selection, the final RF model used 100 decision trees with 2 candidate variables per split (mtry). The final ANN model for lung tumor stage classification was a 17-1-4 network (17 input nodes, 1 hidden node and 4 output nodes). Based on accuracy rate and Kappa statistic, the ANN model was found to be better at classifying lung cancer stage than the RF model.
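
The two comparison criteria named in the abstract, accuracy rate and the Kappa statistic, can be illustrated with a minimal sketch. It assumes that Kappa here means Cohen's kappa computed on actual versus predicted stage labels. The function names and the stage labels in the usage example are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Share of cases whose predicted stage equals the actual stage."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement between actual and predicted labels, corrected for chance."""
    n = len(actual)
    observed = accuracy(actual, predicted)
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: probability that both label sets pick the same class
    # if they were independent, summed over all classes.
    expected = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    if expected == 1.0:
        # Only one class appears in both label sets; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical labels for eight patients (illustration only, not study data).
    actual_stage = ["I", "I", "II", "II", "III", "III", "IV", "IV"]
    predicted_stage = ["I", "I", "II", "III", "III", "III", "IV", "I"]
    print("accuracy:", accuracy(actual_stage, predicted_stage))
    print("kappa:", cohen_kappa(actual_stage, predicted_stage))
```

Under this reading, a model that scores higher on both measures classifies more cases correctly and does so beyond what matching the class frequencies by chance would give.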