A Systematic Review and Experimental Evaluation of Classical and Transformer-Based Models for Arabic Extractive Text Summarization

Authors

  • Hind R. Almayyali, University of Kufa, College of Computer Science and Mathematics

DOI:

https://doi.org/10.29304/jqcsm.2025.17.42562

Keywords:

Arabic text summarization; Extractive text summarization; Abstractive summarization; Natural language processing; Transformer models; Arabic NLP.

Abstract

Arabic text summarization has become an active research area due to the rapid growth of Arabic digital content. Developing effective summarization models, however, faces many challenges arising from the language's linguistic richness, complex morphology, flexible syntax, and diverse writing styles. This study focuses on Arabic extractive summarization, where the goal is to select the most relevant sentences from a text to create a concise version that still captures the original meaning. Existing techniques were analyzed and synthesized together with the available data sets, and open problems and gaps were identified, in order to trace the history of the area and set the direction of the research. The metrics used to evaluate the output of Arabic text summarization systems are also covered. The review traced a clear evolution from classical statistical and graph-based extractive methods to current transformer-based approaches.
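The core idea of extractive summarization described above can be sketched with a minimal TF-IDF sentence scorer: each sentence is scored by the average TF-IDF weight of its words, and the top-ranked sentences are kept in their original order. This is an illustrative sketch only, using toy English input in place of Arabic text; a real Arabic system would add language-specific tokenization, normalization, and stemming.

```python
# Minimal extractive summarization sketch: score each sentence by the
# mean TF-IDF weight of its words, keep the top-k sentences, and emit
# them in their original document order.
import math
from collections import Counter

def summarize(sentences, k=2):
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    # document frequency: in how many sentences each term appears
    df = Counter(t for d in docs for t in set(d))

    def score(doc):
        tf = Counter(doc)
        # average TF-IDF weight over the sentence's distinct terms
        return sum((tf[t] / len(doc)) * math.log(n / df[t]) for t in tf) / len(tf)

    ranked = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]  # restore original order

text = [
    "Arabic summarization research is growing quickly.",
    "Extractive methods select the most relevant sentences.",
    "The weather was pleasant that day.",
]
print(summarize(text, k=2))
```

Classical graph-based methods such as the modified PageRank approaches surveyed in the references replace the per-sentence score with a centrality score over a sentence-similarity graph, but the select-and-reorder skeleton is the same.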

Transformer architectures have been adopted rapidly across the field, and Arabic-specific pre-trained models have consistently outperformed their multilingual counterparts. Nevertheless, significant gaps remain in multi-document summarization, dialect handling, and standardized evaluation. Future work should focus on building larger Arabic corpora, improving dialectal coverage, and establishing comprehensive evaluation benchmarks.
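The standardized evaluation the review calls for typically rests on n-gram overlap metrics such as ROUGE. As a hedged illustration of what such a metric computes, the following sketches ROUGE-1 recall (unigram overlap between a candidate summary and a reference); production work would use an established ROUGE implementation rather than this toy version.

```python
# ROUGE-1 recall sketch: fraction of reference unigrams that also
# appear in the candidate summary (with clipped counts).
from collections import Counter

def rouge1_recall(candidate, reference):
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[t], ref[t]) for t in ref)
    return overlap / sum(ref.values())

print(rouge1_recall("the cat sat", "the cat sat on the mat"))  # 3/6 = 0.5
```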


References

M. Al-Maleh and S. Desouki, “Arabic text summarization using deep learning approach,” J Big Data, vol. 7, no. 1, pp. 1–17, Dec. 2020, doi: 10.1186/S40537-020-00386-7/TABLES/6.

A. M. Al-Numai and A. M. Azmi, “Arabic Abstractive Text Summarization Using an Ant Colony System,” Mathematics, vol. 13, no. 16, p. 2613, Aug. 2025, doi: 10.3390/MATH13162613.

H. Shakil, A. Farooq, and J. Kalita, “Abstractive text summarization: State of the art, challenges, and improvements,” Neurocomputing, vol. 603, p. 128255, Oct. 2024, doi: 10.1016/J.NEUCOM.2024.128255.

M. Gamal, M. A. Salam, H. F. A. Hamed, and S. Sweidan, “ACOSUM: Ant Colony Optimized Multi-Level Semantic Graph Summarization,” vol. 1, no. 1, p. 16, 2025, doi: 10.21608/ijaici.2025.350645.1006.

J. Zhou, Z. Ye, S. Zhang, Z. Geng, N. Han, and T. Yang, “Investigating response behavior through TF-IDF and Word2vec text analysis: A case study of PISA 2012 problem-solving process data,” Heliyon, vol. 10, no. 16, p. e35945, Aug. 2024, doi: 10.1016/J.HELIYON.2024.E35945.

A. A. Aladeemy et al., “Advancements and challenges in Arabic sentiment analysis: A decade of methodologies, applications, and resource development,” Heliyon, vol. 10, no. 21, p. e39786, Nov. 2024, doi: 10.1016/J.HELIYON.2024.E39786.

S. Albitar, S. Fournier, and B. Espinasse, “An effective TF/IDF-based text-to-text semantic similarity measure for text classification,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8786, pp. 105–114, 2014, doi: 10.1007/978-3-319-11749-2_8.

M. Bugueño and G. de Melo, “Connecting the Dots: What Graph-Based Text Representations Work Best for Text Classification Using Graph Neural Networks?,” Findings of the Association for Computational Linguistics: EMNLP 2023, pp. 8943–8960, Jan. 2024, doi: 10.18653/v1/2023.findings-emnlp.600.

A. E. Martin, “A Compositional Neural Architecture for Language,” J Cogn Neurosci, vol. 32, no. 8, pp. 1407–1427, 2020, doi: 10.1162/JOCN_A_01552.

W. Antoun, F. Baly, and H. Hajj, “AraBERT: Transformer-based Model for Arabic Language Understanding,” 2020. Accessed: Sep. 28, 2025. [Online]. Available: https://aclanthology.org/2020.osact-1.2/

K. N. Elmadani, M. Elgezouli, and A. Showk, “BERT Fine-tuning For Arabic Text Summarization,” presented at AfricaNLP Workshop, ICLR 2020, Mar. 2020, Accessed: Oct. 24, 2025. [Online]. Available: https://arxiv.org/pdf/2004.14135

M. Kahla, Z. G. Yang, and A. Novák, “Cross-lingual Fine-tuning for Abstractive Arabic Text Summarization,” 2021. doi: 10.26615/978-954-452-072-4_074.

A. Safaya, M. Abdullatif, and D. Yuret, “KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media,” 14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings, pp. 2054–2059, 2020, doi: 10.18653/V1/2020.SEMEVAL-1.271.

M. Abdul-Mageed, A. R. Elmadany, and E. M. B. Nagoudi, “ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic,” ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference, vol. 1, pp. 7088–7105, 2021, doi: 10.18653/V1/2021.ACL-LONG.551.

G. Inoue, B. Alhafni, N. Baimukan, H. Bouamor, and N. Habash, “The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models,” 2021. Accessed: Oct. 24, 2025. [Online]. Available: https://aclanthology.org/2021.wanlp-1.10/

R. Elbarougy, G. Behery, and A. El Khatib, “Extractive Arabic Text Summarization Using Modified PageRank Algorithm,” Egyptian Informatics Journal, vol. 21, no. 2, pp. 73–81, Jul. 2020, doi: 10.1016/J.EIJ.2019.11.001.

A. M. A. Nada, E. Alajrami, A. A. Al-Saqqa, and S. S. Abu-Naser, “Arabic Text Summarization Using AraBERT Model Using Extractive Text Summarization Approach,” 2020.

N. Alami, M. Meknassi, N. En-nahnahi, Y. El Adlouni, and O. Ammor, “Unsupervised neural networks for automatic Arabic text summarization using document clustering and topic modeling,” Expert Syst Appl, vol. 172, Jun. 2021, doi: 10.1016/J.ESWA.2021.114652.

M. Gouiouez, “A Fuzzy Near Neighbors Approach for Arabic Text Categorization Based on Web Mining Technique,” Lecture Notes in Networks and Systems, vol. 211 LNNS, pp. 575–584, 2021, doi: 10.1007/978-3-030-73882-2_52.

Y. M. Wazery, M. E. Saleh, A. Alharbi, and A. A. Ali, “Abstractive Arabic Text Summarization Based on Deep Learning,” Comput Intell Neurosci, vol. 2022, 2022, doi: 10.1155/2022/1566890.

Y. Einieh, A. AlMansour, and A. Jamal, “Arabic Extractive Summarization Using Pre-Trained Models,” Journal of King Abdulaziz University: Computing and Information Technology Sciences, vol. 12, no. 1, pp. 63–73, Jul. 2023, doi: 10.4197/Comp.12-1.6.

A. Elsaid, A. Mohammed, L. Fattouh, and M. Sakre, “An Efficient Deep Learning Approach for Extractive Arabic Text Summarization Based on Multiple Encoders and a Single Decoder,” 1st International Conference of Intelligent Methods, Systems and Applications, IMSA 2023, pp. 1–6, 2023, doi: 10.1109/IMSA58542.2023.10217361.

M. J. Hadi, A. R. Abbas, and O. Y. Fadhil, “A Novel Gravity Optimization Algorithm for Extractive Arabic Text Summarization,” Baghdad Science Journal, vol. 21, no. 2, pp. 537–547, 2024, doi: 10.21123/BSJ.2023.7731.

G. Alselwi and T. Taşcı, “Extractive Arabic Text Summarization Using PageRank and Word Embedding,” Arab J Sci Eng, vol. 49, no. 9, pp. 13115–13130, Sep. 2024, doi: 10.1007/S13369-024-08890-1/TABLES/9.

G. Bourahouat, M. Abourezq, and N. Daoudi, “Toward an efficient extractive Arabic text summarisation system based on Arabic large language models,” Int J Data Sci Anal, vol. 20, no. 3, pp. 2445–2457, Sep. 2024, doi: 10.1007/S41060-024-00618-6/METRICS.

H. Zaiton, A. Fashwan, and S. Alansary, “Leveraging Transformer Summarizer to Extract Sentences for Arabic Text Summarization,” Procedia Comput Sci, vol. 244, pp. 353–362, Jan. 2024, doi: 10.1016/J.PROCS.2024.10.209.

E. Monir and A. Salah, “AraTSum: Arabic Twitter Trend Summarization Using Topic Analysis and Extractive Algorithms,” International Journal of Computational Intelligence Systems, vol. 17, no. 1, pp. 1–18, Dec. 2024, doi: 10.1007/S44196-024-00546-0/FIGURES/8.

C.-Y. Lin, “ROUGE: A Package for Automatic Evaluation of Summaries,” in Text Summarization Branches Out, 2004. Accessed: Oct. 24, 2025. [Online]. Available: https://aclanthology.org/W04-1013/

A. Nenkova, S. Maskey, and Y. Liu, “Automatic Summarization,” Information Retrieval, vol. 5, pp. 103–233, 2011.

K. Owczarzak, J. M. Conroy, H. T. Dang, and A. Nenkova, “An Assessment of the Accuracy of Automatic Evaluation in Summarization,” 2012. Accessed: Oct. 24, 2025. [Online]. Available: https://aclanthology.org/W12-2601/

D. Yadav, J. Desai, and A. K. Yadav, “Automatic Text Summarization Methods: A Comprehensive Review,” Mar. 2022, Accessed: Sep. 26, 2025. [Online]. Available: http://arxiv.org/abs/2204.01849

F. B. Fikri, K. Oflazer, and B. Yanıkoğlu, “Semantic Similarity Based Evaluation for Abstractive News Summarization,” GEM 2021 - 1st Workshop on Natural Language Generation, Evaluation, and Metrics, Proceedings, pp. 24–33, 2021, doi: 10.18653/V1/2021.GEM-1.3.

A. Alhamadani, X. Zhang, J. He, A. Khatri, and C. T. Lu, “LANS: Large-scale Arabic News Summarization Corpus,” ArabicNLP 2023 - 1st Arabic Natural Language Processing Conference, Proceedings, pp. 89–100, 2023, doi: 10.18653/v1/2023.arabicnlp-1.8.

Y. A. AL-Khassawneh and E. S. Hanandeh, “Extractive Arabic Text Summarization-Graph-Based Approach,” Electronics (Switzerland), vol. 12, no. 2, Jan. 2023, doi: 10.3390/ELECTRONICS12020437.

N. Burmani, H. Alami, S. Lafkiar, M. Zouitni, M. Taleb, and N. E. Nahnahi, “Graph based method for Arabic text summarization,” 2022 International Conference on Intelligent Systems and Computer Vision, ISCV 2022, 2022, doi: 10.1109/ISCV54655.2022.9806127.

T. Hasan et al., “XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages,” Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp. 4693–4703, Jun. 2021, doi: 10.18653/v1/2021.findings-acl.413.

A. Elsaid, A. Mohammed, L. Fattouh, and M. Sakre, “Abstractive Arabic Text Summarization Based on MT5 and AraBart Transformers,” 1st International Conference of Intelligent Methods, Systems and Applications, IMSA 2023, pp. 7–12, 2023, doi: 10.1109/IMSA58542.2023.10217539.

A. Atef, F. Seddik, and A. Elbedewy, “AGS: Arabic GPT Summarization Corpus,” 4th International Conference on Electrical, Communication and Computer Engineering, ICECCE 2023, 2023, doi: 10.1109/ICECCE61019.2023.10441794.

H. Rhel and D. Roussinov, “Large Language Models and Arabic Content: A Review,” May 2025, Accessed: Sep. 26, 2025. [Online]. Available: https://arxiv.org/pdf/2505.08004

M. Kurt Pehlivanoğlu, R. T. Gobosho, M. A. Syakura, V. Shanmuganathan, and L. de-la-Fuente-Valentín, “Comparative analysis of paraphrasing performance of ChatGPT, GPT-3, and T5 language models using a new ChatGPT generated dataset: ParaGPT,” Expert Syst, vol. 41, no. 11, Nov. 2024, doi: 10.1111/EXSY.13699.

B. Mousi et al., “AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs,” in Proceedings of the 31st International Conference on Computational Linguistics, Abu Dhabi, UAE, pp. 4186–4218, 2025, Accessed: Sep. 29, 2025. [Online]. Available: https://aclanthology.org/2025.coling-main.283/

A. Charfi, M. Bessghaier, A. Atalla, R. Akasheh, S. Al-Emadi, and W. Zaghouani, “Stance detection in Arabic with a multi-dialectal cross-domain stance corpus,” Soc Netw Anal Min, vol. 14, no. 1, Dec. 2024, doi: 10.1007/S13278-024-01335-5.

Published

2025-12-30

How to Cite

Hind R. Almayyali. (2025). A Systematic Review and Experimental Evaluation of Classical and Transformer-Based Models for Arabic Extractive Text Summarization. Journal of Al-Qadisiyah for Computer Science and Mathematics, 17(4), Comp 300–308. https://doi.org/10.29304/jqcsm.2025.17.42562

Issue

Vol. 17 No. 4 (2025)

Section

Computer Articles