Malware classification using SVM and XGBoost: A study of the influence of features
DOI:
https://doi.org/10.29304/jqcsm.2025.17.42536
Keywords:
Zero-day, machine learning, SVM
Abstract
Zero-day malware poses a significant cybersecurity threat because it can evade traditional antivirus detection. This study investigates the application of machine learning techniques to detect and classify such threats. Specifically, two well-known machine learning classifiers, Support Vector Machine (SVM) and XGBoost, are used to classify malware in a static analysis setting. The data was collected from the Kaggle platform and comprises 5212 samples, each with 70 features extracted from PE (Portable Executable) headers, including the DOS Header, PE Header, and Optional Header. The samples include both malicious and benign files, and ten-fold cross-validation was used. Classifier performance was evaluated using accuracy, precision, recall, F1 score, and the ROC curve.
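To make the idea of static PE-header features concrete, the following is a minimal sketch that reads a handful of fields from the DOS Header, the PE (COFF) file header, and the Optional Header using only the Python standard library. The function names `pe_header_features` and `make_minimal_pe`, and the choice of fields, are illustrative assumptions; they are not the 70 features of the dataset used in the study, which the abstract does not list.

```python
import struct


def pe_header_features(data: bytes) -> dict:
    """Read a few illustrative fields from the DOS, PE (COFF) and Optional headers.

    Hypothetical helper: the field selection is an assumption, not the
    feature set used in the study.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ magic in DOS Header")
    # DOS Header: e_lfanew (offset 0x3C) is the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header: 20 bytes immediately after the 4-byte signature.
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    # Optional Header: its first field is Magic (0x10B = PE32, 0x20B = PE32+).
    opt_magic = None
    if opt_size >= 2:
        (opt_magic,) = struct.unpack_from("<H", data, e_lfanew + 24)
    return {
        "e_lfanew": e_lfanew,
        "Machine": machine,
        "NumberOfSections": n_sections,
        "TimeDateStamp": timestamp,
        "SizeOfOptionalHeader": opt_size,
        "Characteristics": characteristics,
        "Magic": opt_magic,
    }


def make_minimal_pe() -> bytes:
    """Build a synthetic header-only byte string for demonstration purposes."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature follows the DOS header
    coff = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 240, 0x0022)
    return bytes(dos) + b"PE\0\0" + coff + struct.pack("<H", 0x20B)


features = pe_header_features(make_minimal_pe())
print(features)
```

In practice each file would yield one such feature vector, and the vectors would form the rows of the table passed to the classifiers.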
The results show that XGBoost performed better, with an accuracy of about 99%, compared with about 96% for SVM. XGBoost's high effectiveness and low error rate make it a promising tool for detecting zero-day malware and for strengthening cybersecurity defenses against unknown threats.
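The evaluation protocol described in the abstract (ten-fold cross-validation scored with accuracy, precision, recall, and F1) can be sketched in plain Python as follows. This is a simplified illustration, not the authors' code: `kfold_indices` and `binary_metrics` are hypothetical helper names, the fold assignment is an unshuffled interleaving chosen for brevity, and the labels in the usage example are made up.

```python
def kfold_indices(n_samples: int, k: int = 10) -> list:
    """Split sample indices 0..n_samples-1 into k folds of near-equal size.

    Assumption: a simple interleaved assignment without shuffling; a real
    experiment would normally shuffle (and often stratify) first.
    """
    return [list(range(fold, n_samples, k)) for fold in range(k)]


def binary_metrics(y_true, y_pred, positive=1) -> dict:
    """Compute accuracy, precision, recall and F1 from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Fold sizes for a dataset of 5212 samples, as in the abstract.
folds = kfold_indices(5212, k=10)
print([len(fold) for fold in folds])

# Made-up labels (1 = malicious, 0 = benign) purely to exercise the metrics.
print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
```

In each round, one fold serves as the test set and the remaining nine as training data; the per-fold metrics are then averaged to give figures such as the accuracies reported above.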
Copyright (c) 2025 Mohamed Fadhil Imran

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.