Preprocessing of Drugs Reviews and Classification Techniques
DOI:
https://doi.org/10.29304/jqcm.2023.15.3.1261
Keywords:
Sentiment analysis, NLP, Machine Learning.
Abstract
Natural language processing (NLP) is the computer-based study of language in textual data such as documents or reviews. It commonly seeks to give otherwise unstructured natural-language text a structured representation. Sentiment analysis (SA), one of the methods used in opinion mining, typically assesses a textual review to determine whether it conveys a positive or negative impression; fields such as marketing, social media, and customer relationship management rely on it heavily. In this paper, the challenge was to convert unstructured texts from drug review datasets into structured texts and to couple that conversion with sentiment analysis. The adopted approach proceeds in stages. The first stage is data preprocessing with natural language processing techniques. The next stage is prediction with four classification models: logistic regression (LR), random forest (RF), Naïve Bayes (NB), and support vector machine (SVM). The best prediction accuracy, 92%, was obtained with random forest.
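The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the toy reviews, and the function names (`preprocess`, `train_naive_bayes`, `predict`) are assumptions made for illustration, and it uses raw word counts with a hand-written multinomial Naïve Bayes classifier rather than the paper's actual features or its four models.

```python
import math
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; the paper does not list one).
STOP_WORDS = {"a", "an", "and", "the", "this", "it", "is", "was",
              "my", "i", "me", "for", "of", "to"}


def preprocess(review):
    """Stage 1: lowercase, keep alphabetic tokens only, drop stop words."""
    tokens = re.findall(r"[a-z]+", review.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def train_naive_bayes(reviews, labels):
    """Stage 2 (training): collect per-class token counts for multinomial NB."""
    word_counts = {}              # label -> Counter of token frequencies
    doc_counts = Counter(labels)  # label -> number of training reviews
    vocab = set()
    for review, label in zip(reviews, labels):
        tokens = preprocess(review)
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(model, review):
    """Stage 2 (prediction): pick the label with the highest log-posterior."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # class prior
        total = sum(word_counts[label].values())
        for token in preprocess(review):
            if token in vocab:  # ignore tokens never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                count = word_counts[label][token] + 1
                score += math.log(count / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data (hypothetical, not taken from the drug review dataset).
train_reviews = [
    "This drug worked great, my pain is gone!",
    "Great relief and no side effects.",
    "Terrible nausea; the headaches got worse.",
    "It made me dizzy and the pain got worse.",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_naive_bayes(train_reviews, train_labels)
print(preprocess("This drug worked GREAT, my pain is gone!"))
print(predict(model, "great relief"))
print(predict(model, "nausea got worse"))
```

In this sketch, preprocessing turns each free-text review into a list of normalized tokens, which is the structured form the classifier consumes. A fuller pipeline would likely replace the raw counts with weighted features and compare several classifiers on held-out data.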