Arabic Word Prediction for Next and Previous Word Using BERT & CBOW Algorithms

Hawraa Ali Taher

doi:10.29304/jqcsm.2024.16.41778

Authors

Hawraa Ali Taher Department of Computer Science, Faculty of Education for Girls, University of Kufa, Najaf, Iraq

DOI:

https://doi.org/10.29304/jqcsm.2024.16.41778

Keywords:

Arabic Language, Word Prediction, CBOW, BERT Algorithm, F-Measure

Abstract

Next-word prediction, also known as language modeling, is one application of Natural Language Processing (NLP). It predicts the most likely word to follow in a sentence from the preceding context. It is widely used in applications such as auto-correct, mostly in emails and messages, and it can also be used in Microsoft Word or Google Search to predict the next word based on past searches or global queries. The goal of Natural Language Generation (NLG) is to produce language that is natural and interpretable by humans. Users find text generation, and next-word prediction in particular, convenient because it makes typing faster and reduces errors. Consequently, a personalized text prediction system is an essential research topic for all languages. This paper proposes a novel approach for predicting the following word in an Arabic sentence; anticipating the next word in a sequence can reduce the total number of keystrokes a user makes. In this work, the BERT algorithm and Continuous Bag of Words (CBOW) are used to predict both the next word and the previous word in Arabic. BERT achieved a best accuracy of 90% for next-word prediction and 80% for previous-word prediction. CBOW achieved a best accuracy of 100% for both next-word and previous-word prediction.
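The abstract does not say how CBOW was adapted to the two tasks, so the following is only a minimal sketch under stated assumptions, not the author's implementation. It assumes that next-word prediction uses the words before the target as context and that previous-word prediction uses the words after it. The toy corpus, the window size, the embedding dimension, and every name (`build_pairs`, `TinyCBOW`, `train`) are hypothetical and chosen for illustration.

```python
import math
import random

# Toy corpus (hypothetical, not the paper's dataset); tokens are split on spaces.
CORPUS = [
    "ذهب الولد إلى المدرسة صباحا",
    "قرأ الطالب الكتاب في المكتبة",
    "كتب المعلم الدرس على السبورة",
]


def build_pairs(sentences, window, direction):
    """Return (context_words, target_word) pairs.

    direction="next": context is the `window` words before the target.
    direction="previous": context is the `window` words after the target.
    (Assumed adaptation of CBOW; the abstract does not specify one.)
    """
    pairs = []
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(len(tokens)):
            if direction == "next":
                context = tokens[max(0, i - window):i]
            else:
                context = tokens[i + 1:i + 1 + window]
            if context:
                pairs.append((context, tokens[i]))
    return pairs


class TinyCBOW:
    """CBOW with a full softmax: average the context embeddings, score every word."""

    def __init__(self, vocab, dim=8, seed=0):
        rng = random.Random(seed)
        self.vocab = sorted(vocab)
        self.index = {w: i for i, w in enumerate(self.vocab)}
        self.dim = dim
        size = len(self.vocab)
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]

    def _hidden(self, context):
        ids = [self.index[w] for w in context]
        hidden = [sum(self.w_in[i][d] for i in ids) / len(ids) for d in range(self.dim)]
        return ids, hidden

    def probabilities(self, context):
        _, hidden = self._hidden(context)
        scores = [sum(row[d] * hidden[d] for d in range(self.dim)) for row in self.w_out]
        top = max(scores)  # subtract the max score for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def train_step(self, context, target, lr):
        """One SGD step on the cross-entropy loss; returns the loss before the update."""
        ids, hidden = self._hidden(context)
        probs = self.probabilities(context)
        t = self.index[target]
        grad_hidden = [0.0] * self.dim
        for j, row in enumerate(self.w_out):
            err = probs[j] - (1.0 if j == t else 0.0)
            for d in range(self.dim):
                grad_hidden[d] += err * row[d]  # uses the weight before it is updated
                row[d] -= lr * err * hidden[d]
        for i in ids:
            for d in range(self.dim):
                self.w_in[i][d] -= lr * grad_hidden[d] / len(ids)
        return -math.log(probs[t])

    def predict(self, context):
        probs = self.probabilities(context)
        return self.vocab[max(range(len(probs)), key=probs.__getitem__)]


def train(pairs, epochs=300, lr=0.2):
    vocab = {w for context, target in pairs for w in context + [target]}
    model = TinyCBOW(vocab)
    history = []  # mean loss per epoch
    for _ in range(epochs):
        history.append(sum(model.train_step(c, t, lr) for c, t in pairs) / len(pairs))
    return model, history


next_pairs = build_pairs(CORPUS, window=2, direction="next")
prev_pairs = build_pairs(CORPUS, window=2, direction="previous")
next_model, next_loss = train(next_pairs)
prev_model, prev_loss = train(prev_pairs)

print(next_model.predict(["ذهب", "الولد"]))    # candidate word after this context
print(prev_model.predict(["إلى", "المدرسة"]))  # candidate word before this context
```

The two models differ only in which side of the target supplies the context, so the same training code serves both tasks. A corpus this small is simply memorized, so it says nothing about the accuracy figures reported in the abstract.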

Journal of Al-Qadisiyah for Computer Science and Mathematics (JQCSM)

ISSN 2521-3504 (Online), ISSN 2074-0204 (Print)

It is a scientific journal issued by the College of Computer Science and IT, University of Al-Qadisiyah.