Modeling Cholesterol Levels Using Machine Learning: A Study with the Framingham Heart Study Dataset
DOI: https://doi.org/10.29304/jqcsm.2025.17.42567
Keywords: Cholesterol; Predictive Model; Feature Importance; Framingham; Heart Study; Classification; Risk Factors.
Abstract
Elevated cholesterol is associated with many health risks, especially cardiovascular disease, so predicting an individual's cholesterol level is important for avoiding such complications. This paper explores the determinants of blood cholesterol level and builds a machine learning model to predict it using the Framingham Heart Study dataset. Factors such as age, body mass index, and glucose level were analyzed, and the results showed that these factors are the most influential in determining cholesterol level. Random Forest achieved the highest accuracy in both the three-level classification (78%) and the binary classification (88%). These findings indicate that machine learning can effectively identify individuals at risk of elevated cholesterol, and they highlight the value of focusing on the most influential factors in preventive healthcare and early risk assessment.
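
The three-level and binary cases mentioned in the abstract can be illustrated with a minimal sketch of how a total cholesterol reading might be turned into class labels and how accuracy is computed. The abstract does not state the thresholds used in the study, so the 200 and 240 mg/dL cut-offs, the label names, and the helper functions below are assumptions for illustration only, not the authors' method.

```python
from typing import Sequence

# Assumed cut-offs in mg/dL (a widely used clinical convention);
# the abstract does not say which thresholds the study applied.
BORDERLINE_MG_DL = 200.0
HIGH_MG_DL = 240.0


def three_level_label(total_chol: float) -> str:
    """Map a total cholesterol reading to one of three hypothetical classes."""
    if total_chol < BORDERLINE_MG_DL:
        return "desirable"
    if total_chol < HIGH_MG_DL:
        return "borderline"
    return "high"


def binary_label(total_chol: float) -> str:
    """Collapse the reading into two hypothetical classes."""
    return "elevated" if total_chol >= BORDERLINE_MG_DL else "normal"


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

With labels defined this way, a classifier's predictions on held-out records can be scored with `accuracy` separately for the three-level and binary targets, which is the kind of comparison the reported 78% and 88% figures summarize.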
Copyright (c) 2025 Yahya Albugg

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.








