Preprocessing of Drugs Reviews and Classification Techniques

Authors

  • Rosul Ibrahim Kazim, Department of Computer Science, College of Education, University of Kufa, Najaf, Iraq
  • Enas Fadhil Abdullah, Department of Computer Science, College of Education for Girls, University of Kufa, Najaf, Iraq

DOI:

https://doi.org/10.29304/jqcm.2023.15.3.1261

Keywords:

Sentiment analysis, NLP, Machine Learning.

Abstract

Natural language processing (NLP) is the computer-based study of language in textual data such as documents or reviews. It typically seeks to give otherwise unstructured natural-language text a structured representation. Sentiment analysis (SA), one of the methods used in opinion mining, typically assesses a textual review to determine whether it conveys a positive or negative impression; fields such as marketing, social media, and customer relationship management rely on it heavily. In this paper, the challenge was to convert unstructured drug-review texts into structured texts and to combine this with sentiment analysis. The adopted approach proceeds in stages. The first stage is data preprocessing with natural language processing techniques. The next stage is prediction with four classification models: logistic regression (LR), random forest (RF), Naïve Bayes (NB), and support vector machine (SVM). The best prediction accuracy, 92%, was obtained with random forest.
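
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the exact preprocessing steps, so the lowercasing, punctuation removal, and stop-word filtering below are assumptions, as are the tiny stop-word list, the hypothetical names preprocess and NaiveBayes, and the invented toy reviews. Of the four models named, only Naïve Bayes is shown, written in plain Python with add-one smoothing so the example stays self-contained.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list; the paper's list is not given.
STOP_WORDS = {"a", "an", "and", "the", "this", "it", "is", "was",
              "my", "me", "i", "for", "of", "to"}


def preprocess(text):
    """Lowercase, replace non-letters with spaces, split, drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.word_counts = defaultdict(Counter)      # token counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)    # log prior
            for tok in tokens:
                if tok in self.vocab:                # skip unseen words
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy reviews, for illustration only (not from the drug dataset).
train = [
    ("This drug worked great and relieved my pain", "positive"),
    ("Great relief, no side effects", "positive"),
    ("Terrible nausea and a bad headache", "negative"),
    ("It made my pain worse, bad experience", "negative"),
]
model = NaiveBayes().fit([preprocess(t) for t, _ in train],
                         [label for _, label in train])
prediction = model.predict(preprocess("Great relief from pain!"))
```

On real review data the same two steps would normally be followed by a held-out evaluation, since accuracy figures such as the one reported in the abstract only make sense on reviews the model has not seen during training.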



Published

2023-09-30

How to Cite

Kazim, R. I., & Abdullah, E. F. (2023). Preprocessing of Drugs Reviews and Classification Techniques. Journal of Al-Qadisiyah for Computer Science and Mathematics, 15(3), Comp Page 1–10. https://doi.org/10.29304/jqcm.2023.15.3.1261

Section

Computer Articles