Addressing Linguistic Challenges in Arabic NLP: A Comprehensive Study on Content-Based and Collaborative Filtering Techniques for
DOI:
https://doi.org/10.29304/jqcsm.2025.17.42561
Keywords:
BERT, AARS, CBF, Recommendation system.
Abstract
Efficient algorithms are needed to identify Arabic-speaking users and provide them with content that promotes cultural exchange and knowledge acquisition. This study proposes an enhanced Arabic Article Recommendation System (AARS) grounded in semantic analysis and strengthened by the BERT model. The system aims to improve the relevance and accuracy of recommendations, a task complicated by the dialectal variation of Arabic and the presence of diacritics. To mitigate these challenges, the method first applies a light normalization step that removes noise and redundant characters, then a morphological dialect stemmer that extracts root forms and improves semantic representation. The proposed system also integrates cultural, personal, and contextual dimensions, such as user interests and preferences, to make context-aware recommendations. Empirical results confirm the effectiveness of the approach in retrieving and presenting relevant content, thereby enhancing user satisfaction and engagement. Beyond its theoretical significance, the system offers economic benefits to content providers and advertisers by enabling personalized, targeted suggestions. Future work will refine the model and explore advanced methods to further improve stability and adaptability in the face of future computational demands and challenges.
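The light normalization and content-based matching described in the abstract can be illustrated with a minimal sketch. The abstract does not list the exact normalization rules, so the ones below (stripping diacritics and tatweel, unifying alef variants, collapsing elongated letters) are assumptions based on common Arabic preprocessing practice. The morphological dialect stemmer is omitted, and plain term-count vectors stand in for BERT embeddings so that the example runs with the standard library alone. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed rule: strip Arabic diacritics (U+064B..U+0652), the superscript
# alef (U+0670), and the tatweel stretch character (U+0640).
_MARKS = re.compile("[\u064B-\u0652\u0670\u0640]")
# Assumed rule: map alef with madda or hamza (U+0622, U+0623, U+0625) to bare alef.
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")
# Assumed heuristic: three or more repeats of one character are elongation noise.
_ELONGATION = re.compile(r"(.)\1{2,}")


def light_normalize(text: str) -> str:
    """Remove noise and redundant characters without stemming."""
    text = _MARKS.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)
    text = _ELONGATION.sub(r"\1", text)
    return " ".join(text.split())  # collapse runs of whitespace


def vectorize(text: str) -> Counter:
    """Term-count vector; a stand-in for the BERT embedding used in the study."""
    return Counter(light_normalize(text).split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(profile_text: str, articles: dict, top_k: int = 2) -> list:
    """Rank article titles by similarity to a user's interest text."""
    profile = vectorize(profile_text)
    scored = [(cosine(profile, vectorize(body)), title)
              for title, body in articles.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only articles that share at least one term with the profile.
    return [title for score, title in scored[:top_k] if score > 0]
```

In a fuller system, `vectorize` would be replaced by a sentence embedding from a BERT model and the stemmer would run between normalization and vectorization; the ranking step would stay the same.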
License
Copyright (c) 2025 Rasha Falah Kadhem

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.








