ReliefF and Association Rule Mining to Determine Cervical Cancer Causes
DOI:
https://doi.org/10.29304/jqcm.2021.13.3.823
Keywords:
Apriori algorithm, Association rules mining, cervical cancer, cervical cancer causes, ReliefF
Abstract
To this day, cancer patients suffer from the inability of science to predict the causes of the disease before it occurs. Cervical cancer is a concern for many women because its diagnosis is often delayed as a result of its multiple and unclear causes, so scientists and researchers need to search for the most causative factors. Machine learning approaches have become one of the best and fastest ways to find associations between symptoms and causes of disease. Association rule mining (AR) is very effective once the diagnostic features have been determined. In this work, a feature selection (FS) algorithm named ReliefF is used to identify the most correlated factors; the Apriori algorithm is then modified to reduce the time and space it uses and to detect features that are closely related to the class attribute, in order to find the main factors that cause cervical cancer. The experimental results indicate a number of cervical cancer risk factors that, when combined, indicate a woman's likelihood of developing cervical cancer: 15 or more years of hormonal contraceptive use; having any type of cancer, HPV, syphilis, or HIV; more than 10 years of IUD use; first sexual intercourse before the age of 13; and more than 5 sexual partners. The outcomes of this work can help both doctors and women to prevent cancer.
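The abstract does not spell out how the Apriori algorithm was modified, so the following is only a minimal sketch of the general idea of mining rules whose consequent is the class attribute. It assumes a level-wise search in which an antecedent is kept only if it is frequent together with the class item. The toy records, the item names, the thresholds, and the function names `support` and `class_rules` are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical toy records (NOT data from the paper): each record is a set of
# discretized risk-factor items plus one class item.
RECORDS = [
    {"hc_years>=15", "hpv=yes", "cancer=yes"},
    {"hc_years>=15", "hpv=yes", "partners>5", "cancer=yes"},
    {"hc_years>=15", "partners>5", "cancer=no"},
    {"hpv=yes", "partners>5", "cancer=yes"},
    {"partners>5", "cancer=no"},
    {"hc_years>=15", "hpv=yes", "cancer=yes"},
]
CLASS_ITEM = "cancer=yes"


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= r for r in records) / len(records)


def class_rules(records, class_item, class_prefix,
                min_support=0.3, min_confidence=0.8):
    """Apriori-style search for rules of the form antecedent -> class_item.

    An antecedent survives a level only if (antecedent + class_item) meets
    min_support; this joint support is anti-monotone, so the usual Apriori
    subset pruning still applies. Thresholds are illustrative assumptions.
    """
    items = sorted({i for r in records for i in r
                    if not i.startswith(class_prefix)})
    level = [frozenset([i]) for i in items
             if support(frozenset([i, class_item]), records) >= min_support]
    rules = []
    while level:
        # Emit every antecedent at this level whose confidence is high enough.
        for antecedent in level:
            joint = support(antecedent | {class_item}, records)
            confidence = joint / support(antecedent, records)
            if confidence >= min_confidence:
                rules.append((sorted(antecedent), joint, confidence))
        # Join step: merge pairs that differ in exactly one item.
        frequent = set(level)
        size = len(level[0]) + 1
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == size}
        # Prune step: every (size-1)-subset must itself have been frequent.
        level = sorted(
            (c for c in candidates
             if all(frozenset(s) in frequent
                    for s in combinations(c, size - 1))
             and support(c | {class_item}, records) >= min_support),
            key=sorted,
        )
    return rules


for antecedent, sup, conf in class_rules(RECORDS, CLASS_ITEM, "cancer="):
    print(antecedent, "->", CLASS_ITEM, round(sup, 2), round(conf, 2))
```

On this toy data, only antecedents containing the hypothetical item `hpv=yes` reach the confidence threshold. Restricting the consequent to the class item keeps the candidate set small, which is one plausible way to save time and space, though the paper's actual modification may differ.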