Models in Review for the Analysis of Phishing Website URLs

Authors

  • Ali Salam Al-jaberi, College of Computer Sciences and Information Technology, University of Al-Qadisiyah, Al-Qadisiyah, Iraq
  • Sura Fadhil Rahman, Computer Techniques Engineering, Imam AL-Kadhun College, Al-Qadisiyah
  • Ihsan Faisal Raheem, Nizam College, Osmania University, Al-Qadisiyah, Iraq

DOI:

https://doi.org/10.29304/jqcsm.2024.16.31641

Keywords:

Machine Learning, Phishing, Websites, XGBoost, URLs, Cybersecurity.

Abstract

In this paper, we compare our method on DEFRAUD with current online API services and a state-of-the-art machine learning model for defending against phishing websites. Traditional rule-based methods tend to become outdated quickly and cannot keep pace with the new tactics that malicious software adopts continually. Over time, proactive approaches enabled by machine learning (ML) have become more important because they are flexible and adaptable, able to scan modern data breaches for patterns across millions of datasets. In this study, we explore several machine learning algorithms for phishing website detection: Logistic Regression (LR), K-Nearest Neighbors (KNN), Decision Trees (DT), Random Forest (RF), Support Vector Classifiers (SVC), and XGBoost. Ensemble methods such as Random Forest and XGBoost achieve better accuracy, precision, and recall. Although XGBoost is resource-hungry, it is well known for its out-of-the-box support for high-dimensional data, its use alongside deep learning frameworks, and its resistance to overfitting. The study underscores the importance of integrating machine learning models into practical cybersecurity applications. Future research should focus on improving these models and extending their application to other domains to strengthen cybersecurity defenses.
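The abstract compares classifiers by accuracy, precision, and recall. As a minimal sketch of how those three metrics are derived from binary predictions, the following uses only the Python standard library. The function name `classification_metrics`, the labels `model_a` and `model_b`, and the toy label vectors are hypothetical illustrations, not data or code from the paper.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for binary labels (1 = phishing, 0 = legitimate)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing flagged as phishing
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed as legitimate
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged as phishing
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a model never predicts the positive class
        # or when the sample contains no positive labels.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical ground truth and predictions for eight URLs (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = {
    "model_a": [1, 1, 0, 0, 0, 0, 0, 1],
    "model_b": [1, 1, 1, 0, 0, 0, 0, 0],
}

results = {name: classification_metrics(y_true, y_pred) for name, y_pred in predictions.items()}
for name, m in results.items():
    print(f"{name}: accuracy={m['accuracy']:.3f} "
          f"precision={m['precision']:.3f} recall={m['recall']:.3f}")
```

In phishing detection, recall reflects how many phishing URLs are caught, while precision reflects how often a flagged URL is truly malicious, so reporting both alongside accuracy gives a fuller picture than accuracy alone.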

Published

2024-09-30

How to Cite

Salam Al-jaberi, A., Fadhil Rahman, S., & Faisal Raheem, I. (2024). Models in Review for the Analysis of Phishing Website URLs. Journal of Al-Qadisiyah for Computer Science and Mathematics, 16(3), Comp Page 35–43. https://doi.org/10.29304/jqcsm.2024.16.31641

Section

Computer Articles