Enhanced Phishing URL Identification through Recurrent Neural Networks: A Comparative Study of LSTM and BiLSTM
DOI:
https://doi.org/10.29304/jqcsm.2025.17.32380
Keywords:
Phishing Detection, URL Classification
Abstract
Phishing attacks continue to pose a serious threat to cybersecurity, underscoring the need for effective and scalable detection methods. This study evaluates the performance of Recurrent Neural Network (RNN) architectures—specifically Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM)—for detecting phishing websites based on the sequential patterns in URL structures and webpage content. The LSTM model achieved an Area Under the Curve (AUC) of 0.92, with an overall accuracy of 98.4%, precision of 98.9%, and recall of 97.1%. These results indicate a strong ability to identify phishing URLs with a low false positive rate, although performance declined when detecting sophisticated or zero-day phishing attempts. The BiLSTM model, which incorporates bidirectional context, achieved a higher AUC of 0.95 and improved precision of 91% at a recall of 89%. However, it exhibited a slightly lower overall accuracy of 97.9% and a higher false negative rate. Both models effectively differentiated phishing from legitimate URLs, with BiLSTM offering improved context awareness but at the cost of reduced recall. The results suggest that while BiLSTM enhances contextual understanding, the LSTM model offers better generalization and computational efficiency for real-time deployment. This work highlights the potential of RNN-based models in phishing detection and the importance of balancing sensitivity and specificity in cybersecurity applications.
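The accuracy, precision, recall, and false positive/negative rates quoted above all derive from the four cells of a binary confusion matrix. The following minimal sketch shows how those quantities relate to one another. The names ConfusionCounts and classification_metrics and the example counts are hypothetical, chosen only for illustration; they are not the study's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix, with phishing as the positive class."""
    tp: int  # phishing URLs correctly flagged as phishing
    fp: int  # legitimate URLs wrongly flagged as phishing
    tn: int  # legitimate URLs correctly passed
    fn: int  # phishing URLs that were missed


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard rates from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),        # also called sensitivity
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
        "false_negative_rate": _ratio(c.fn, c.fn + c.tp),
    }


# Hypothetical counts for illustration only (not taken from the study).
example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because recall and the false negative rate sum to one, as do specificity and the false positive rate, a model can reach high overall accuracy on an imbalanced dataset while still missing a meaningful share of phishing URLs. This is why the trade-off between sensitivity and specificity matters when comparing the two models.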
License
Copyright (c) 2025 Wadhah Sata Kathum Ajjam, Abdullahi Abdu Ibrahim

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.