Abstract Lung cancer is consistently ranked among the deadliest forms of cancer. Patients whose disease is detected and diagnosed early, for example through a low-dose CT scan, have a far better chance of survival. Nonetheless, this screening approach has drawbacks. Advances in DNA microarray technology now make it possible to measure the expression levels of hundreds of genes within a tissue sample. Machine learning (ML) can be applied to such gene expression data to predict whether a patient has lung cancer, and ML is increasingly used in the medical field for this purpose, but the lack of interpretability of these models remains a significant hurdle. In this work, ensemble Random Forest and Adaptive Boosting (AdaBoost) classifiers were employed to identify patients with lung cancer. Kernel principal component analysis (KPCA) was used for feature reduction, and the correlation between each feature and the target was calculated using the statistical parameters provided by KPCA. Classification accuracy was measured as the proportion of correct predictions on a given data set. The proposed technique was validated on the GSE4115 lung cancer dataset and its expression profiles from the Gene Expression Omnibus (GEO) database. The findings demonstrate the potential of the Identification of Lung Cancer (IOLC) model to detect lung cancer, with an accuracy of 81%, precision of 81.2%, recall of 78.9%, F-Measure of 77.7%, and a reported error rate of 0.29%.
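The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary labelling (1 = lung cancer, 0 = no cancer) and the usual confusion-matrix definitions, with the error rate taken as 1 − accuracy. The function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are hypothetical and are not taken from the paper or from GSE4115.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float
    error_rate: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary classification metrics from two label lists.

    Assumption: the textbook definitions are used; the paper may define
    its error rate differently.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Count the four confusion-matrix cells.
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total  # proportion of correct predictions
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f_measure, 1.0 - accuracy)


# Toy labels (illustrative only): 4 cancer samples, 6 non-cancer samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On these toy labels there are 3 true positives, 1 false negative, 2 false positives, and 4 true negatives, so accuracy is 7/10, precision 3/5, and recall 3/4.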
Field: Engineering
Journal Type: International