Neuro-Fuzzy System for Heart Failure Prediction

Heart failure is a dangerous heart condition that can lead to death. It damages the heart muscle so that the heart can no longer pump blood through the body as well as it should. The condition of heart failure patients should therefore be predicted as early as possible, so that appropriate therapy can help them live longer. Accordingly, this paper has two aims: to use medical records to predict whether a patient with heart failure will survive, and to extract the important features that directly affect the patient's state. The paper applies a Neuro-Fuzzy System (NFS) to a dataset of 299 heart failure patients. Predictions are made by testing every pair of features in the dataset together and feeding each pair to the NFS to determine its effect on the patients' outcome. Accuracy and the confusion matrix are used to evaluate the system's performance. The experimental results show that the system yielded 100% accuracy when the two features serum creatinine and ejection fraction were tested together, so these two features can be used alone to predict whether patients with heart failure will survive.


Introduction
Heart failure is a severe disease that kills millions of people around the world [1]. According to the Centers for Disease Control and Prevention, more than 6 million people in the United States suffer from heart failure. In addition, heart failure is not limited to adults; it also affects children. Heart failure occurs when the heart muscle cannot pump blood around the body properly [2]. Its many causes include heart attack, high blood pressure (hypertension), faulty heart valves, cardiomyopathy (a disease of the heart muscle), inherited heart problems (congenital heart problems), abnormal heart rhythms (arrhythmias), and viral infection of the heart muscle [3].
Medical records are a very informative resource because they reveal hidden relationships in patients' data. The diagnosis of heart failure (HF) depends on examinations whose interpretation is time-consuming and often limited by the clinical experience of the doctors. With the development of computer science and artificial intelligence (AI) techniques, together with the availability of big data, technological innovations have reached cardiology, where AI has been used to support precise classification of the state of heart failure patients by integrating a variety of medical data [4]. In this study, we try to identify the risk factors, that is, the important features in the dataset. To do so, we take every pair of features in the dataset, test the two features together, and monitor the resulting accuracy. The features with the highest accuracy are considered the most important features in the dataset. We then feed only these important features to the Neuro-Fuzzy System (NFS) again to predict the patient's state. To our knowledge, this is the first study to predict the state of heart failure patients by applying the NFS to this dataset.

Literature Review
Several studies have been presented in the field of heart failure prediction. In this part, we present some of the studies that focused on predicting heart failure using neuro-fuzzy techniques and artificial neural networks. The authors in [5] predicted the survival of heart failure patients based on the heart failure clinical health record. They used the ANFIS system together with a Genetic Algorithm, which improves the ANFIS classification performance through attribute selection. The experimental results show that ANFIS alone reaches a highest accuracy of 94.444%, while ANFIS after feature selection with the Genetic Algorithm reaches a highest accuracy of 96.667%. The authors in [6] suggested a machine learning methodology that uses an artificial neural network (ANN) to predict heart failure. They also suggested a wrapper-based feature selection method using grey wolf optimization (GWO) to cut down the number of necessary input attributes. Their computational findings demonstrate that far fewer features are required and that a prediction accuracy of about 87% can be attained. The authors of [7] present a model for the classification of medical diseases. The crisp inputs are fuzzified using the linguistic variables of the membership function to deal with imprecise input data. The model becomes complex because of the fuzzification process. The expanded fuzzified values are passed to the model to extract the relevant features that have a significant effect on the result. The diseases are then classified by passing the reduced features again to the ANN-based model. The authors in [8] propose an Adaptive Neuro-Fuzzy Inference System (ANFIS) to diagnose heart disease. A genetic algorithm is applied to optimize the parameters of the membership functions in ANFIS. The experiment was conducted on the public UCI heart disease datasets, and the results show that 91.25% accuracy was obtained on the testing set. Table 1 summarizes the related works on heart failure prediction using various methods.

Neuro-Fuzzy System (NFS)
A Neuro-Fuzzy system is a model that combines two techniques: neural networks and fuzzy systems [9]. The NFS is a hybrid system that explicitly combines the transparency of fuzzy systems with the generalization capability and dynamic behavior of neural networks [10]. Hybrid systems of this kind are among the most commonly used for difficult real-life problems such as system identification. Researchers have therefore been driven to find new methods that simplify the design and achieve fast convergence for dynamic systems, and the Neuro-Fuzzy system is one of these methods. It can choose the network weights by means of fuzzy IF-THEN rules. The universal approximation capability of fuzzy systems explains their successful application in many complex fields that need a dynamically changing knowledge base, such as the identification and modeling of nonlinear dynamic systems and the adaptive prediction of time series [11]. One of the first works was by Keller and Hunt, who proposed combining neural network learning methods with the concepts of fuzzy systems [12]. They proposed an approach that uses fuzzy techniques to improve the stability of the learning algorithm in classification problems, introducing a fuzzy membership of the data items in the searched classes to improve the convergence of the learning algorithm. The architecture of an NFS is shown in Fig. 1.

Table 1 - Related works on heart failure prediction using various methods.

| Research | Year | Classifiers for heart failure prediction | Best accuracy |
|---|---|---|---|
| [5] | 2021 | ANFIS combined with a Genetic Algorithm, which improves the ANFIS classification performance through attribute selection. | 96.667% |
| [6] | 2021 | An artificial neural network (ANN) to predict heart failure, with a novel wrapper-based feature selection by GWO to reduce the number of features. | 87% |
| [7] | 2020 | A model for the classification of medical diseases. The crisp inputs are fuzzified using the linguistic variables of the membership function to deal with imprecise input data. The expanded fuzzified values are passed to the model to extract the relevant features that have a significant effect on the result. The diseases are classified by passing the reduced features again to the ANN-based model. | |
| [8] | | An Adaptive Neuro-Fuzzy Inference System (ANFIS) to diagnose heart disease. A genetic algorithm is applied to optimize the parameters of the membership functions in ANFIS. | 91.25% |
The NFS consists of five layers, described as follows:
1. Input Layer: This layer receives the crisp inputs, which are the features of the dataset.
2. Fuzzification Layer: The crisp values are fuzzified in this layer by transforming each input feature into its corresponding membership values. Here, the Gaussian membership function with three linguistic variables (low, medium, and high) is used to compute the membership values of each feature of the input pattern.
3. Rule Layer: The firing strength of each rule is computed in every node of this layer.
4. Normalization Layer: The normalized firing strength is computed in this layer.
5. Consequent Layer: Each consequent node reflects the contribution of one rule to the final NFS output [13].
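The five layers can be sketched as a single forward pass. This is a minimal illustration under stated assumptions, not the implementation used in this paper: it assumes two scaled inputs, three Gaussian membership functions per input (low, medium, high) with hypothetical centers 0, 0.5, 1 and width 0.2, product firing strengths, and first-order Takagi-Sugeno consequents with random parameters. The names `gaussian_mf` and `nfs_forward` are made up for the example.

```python
import itertools
import numpy as np

def gaussian_mf(x, centers, sigma):
    """Layer 2: membership of each input in each linguistic term.
    x: (n_samples, n_features) -> (n_samples, n_features, n_terms)."""
    return np.exp(-0.5 * ((x[:, :, None] - centers[None, None, :]) / sigma) ** 2)

def nfs_forward(x, centers, sigma, consequents):
    """Forward pass through the five layers for a batch of crisp inputs."""
    n_samples, n_features = x.shape              # Layer 1: crisp inputs
    mu = gaussian_mf(x, centers, sigma)          # Layer 2: fuzzification
    # Layer 3: one rule per combination of terms; firing strength = product
    rules = list(itertools.product(range(centers.size), repeat=n_features))
    firing = np.stack(
        [np.prod([mu[:, f, t] for f, t in enumerate(rule)], axis=0)
         for rule in rules],
        axis=1,
    )                                            # (n_samples, n_rules)
    # Layer 4: normalize so the firing strengths of each sample sum to 1
    norm = firing / firing.sum(axis=1, keepdims=True)
    # Layer 5: each rule contributes a linear function of the inputs
    x_aug = np.hstack([x, np.ones((n_samples, 1))])   # append bias term
    rule_out = x_aug @ consequents.T             # (n_samples, n_rules)
    return (norm * rule_out).sum(axis=1), norm

rng = np.random.default_rng(0)
centers = np.array([0.0, 0.5, 1.0])      # low, medium, high (assumed values)
x = rng.random((4, 2))                   # 4 patients, 2 scaled features
consequents = rng.normal(size=(9, 3))    # 3**2 rules, 2 weights + bias each
y, norm = nfs_forward(x, centers, 0.2, consequents)
print(y.shape, norm.shape)
```

With two inputs and three linguistic terms each, the rule layer holds 3^2 = 9 rules. In a trained system the centers, widths, and consequent parameters would be learned rather than fixed or random.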

Evaluating Metrics
The evaluation metrics used in this paper are explained as follows:

Confusion Matrix
A confusion matrix is a technique that shows the performance of a classification model. It is drawn as a table that reports the results, including the correctly and incorrectly classified samples. For binary classification, the confusion matrix is a 2×2 matrix [14]. Table 2 illustrates the confusion matrix.

Accuracy
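Accuracy is a widely used metric. It is the ratio of correctly classified samples to the total number of samples in a given test set, and it is expressed mathematically [16], [17] as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$

where TP, TN, FP, and FN are the entries of the confusion matrix.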
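As a small worked example, the four confusion-matrix counts and the accuracy, computed as (TP + TN) divided by the total number of samples, can be obtained from true and predicted labels as follows. The labels below are invented for illustration, and the helper names are hypothetical. Which class counts as positive is left as a parameter.

```python
import numpy as np

def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, FP, TN) for binary labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == positive) & (y_pred == positive)))
    fn = int(np.sum((y_true == positive) & (y_pred != positive)))
    fp = int(np.sum((y_true != positive) & (y_pred == positive)))
    tn = int(np.sum((y_true != positive) & (y_pred != positive)))
    return tp, fn, fp, tn

def accuracy(tp, fn, fp, tn):
    """Correctly classified samples divided by all samples."""
    return (tp + tn) / (tp + fn + fp + tn)

# Made-up labels for eight test samples
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(tp, fn, fp, tn)              # 3 1 1 3
print(accuracy(tp, fn, fp, tn))    # 0.75
```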

The Proposed System
This part discusses the proposed system. The system contains four steps: data collection; data pre-processing, which involves data splitting and feature scaling; feature extraction; and prediction. The following paragraphs explain each step in detail.

Data Collection Step
This step describes the dataset. This study uses the "Heart Failure dataset 2015", which is available from the University of California, Irvine (UCI). The dataset consists of 13 columns and 299 rows and represents the medical records of patients with heart failure. Table 2 describes the elements in the dataset.
It is essential to mention that the dataset is imbalanced, as there are 203 positive samples (death event = 0) and 96 negative samples (death event = 1). The target feature, death event, is a Boolean (0, 1) that indicates whether the patient died during the follow-up period.

Data pre-processing
This step includes two stages as follows:

Data Splitting
In the splitting stage, the dataset is divided into two separate sets: the training set and the testing set. The split was made by count, with 235 samples used for training and 64 samples used for testing.
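A split by count into 235 training and 64 testing rows might look like the following sketch. The text does not state whether the rows were shuffled or stratified, so the shuffle and the seed here are assumptions, and the arrays are random placeholders rather than the real records.

```python
import numpy as np

N_TOTAL, N_TRAIN = 299, 235     # counts reported in the text; 64 rows remain

rng = np.random.default_rng(42)           # arbitrary seed (assumption)
X = rng.random((N_TOTAL, 12))             # placeholder for the 12 feature columns
y = rng.integers(0, 2, size=N_TOTAL)      # placeholder for the death event target

idx = rng.permutation(N_TOTAL)            # shuffling is an assumption
train_idx, test_idx = idx[:N_TRAIN], idx[N_TRAIN:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
print(X_train.shape, X_test.shape)        # (235, 12) (64, 12)
```

Because the dataset is imbalanced, a stratified split that preserves the class ratio in both sets would be a natural alternative.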

Feature scaling
The dataset contains variables with widely varying scales, which makes feature scaling necessary. The dataset therefore needs to be converted to a scaled form that the Neuro-Fuzzy system can handle, so the feature vectors are scaled. In this work, the MinMaxScaler was used to scale each input variable separately to the range 0-1. Mathematically, it is represented by Equation (2):

$$x_{scaled} = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (2)$$

Where:
1. $x_{scaled}$ = the scaled value of the feature.
2. $x$ = the original value of the feature.
3. $x_{min}$, $x_{max}$ = the minimum and maximum values of the feature.
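Min-max scaling to the range 0-1 subtracts each feature's minimum and divides by its range. The sketch below implements this directly in NumPy instead of calling MinMaxScaler. It learns the minimum and range from the training rows only and reuses them for the test rows, which is a common practice assumed here rather than something the text specifies. The helper names and the numbers are illustrative.

```python
import numpy as np

def fit_min_max(X_train):
    """Per-feature minimum and range, learned from the training rows only."""
    x_min = X_train.min(axis=0)
    x_range = X_train.max(axis=0) - x_min
    x_range[x_range == 0] = 1.0      # constant column: avoid division by zero
    return x_min, x_range

def apply_min_max(X, x_min, x_range):
    """Scale X with previously fitted statistics: (x - x_min) / (x_max - x_min)."""
    return (X - x_min) / x_range

# Two made-up features on very different scales
X_train = np.array([[20.0, 0.5], [60.0, 1.5], [40.0, 9.5]])
X_test = np.array([[30.0, 5.0]])

x_min, x_range = fit_min_max(X_train)
X_train_scaled = apply_min_max(X_train, x_min, x_range)
X_test_scaled = apply_min_max(X_test, x_min, x_range)
print(X_test_scaled)                 # [[0.25 0.5 ]]
```

Test values outside the training range fall outside 0-1 under this scheme, which may matter when membership functions are defined on that interval.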

Feature extraction
In this step, the essential features in the dataset are selected and used to predict the patient's state. The process begins by testing every pair of features in the dataset together to determine how much the pair affects the system's accuracy, which is monitored after each test. Finally, the features that achieve the highest prediction accuracy are considered the most important features in the dataset, since they directly affect the patient's state, and these features can be used alone to predict whether patients with heart failure will survive.
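The pairwise search can be sketched as a loop over all unordered feature pairs. To keep the example self-contained, a nearest-centroid classifier stands in for the NFS and the data are synthetic, so the column names, the scorer, and the resulting accuracies are illustrative assumptions and not results from this study.

```python
from itertools import combinations
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Stand-in scorer (NOT the NFS): assign each test row to the closer class mean."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - c0, axis=1)
    d1 = np.linalg.norm(X_test - c1, axis=1)
    return float(np.mean((d1 < d0).astype(int) == y_test))

def rank_feature_pairs(X_train, y_train, X_test, y_test, names,
                       score=nearest_centroid_accuracy):
    """Evaluate every unordered pair of columns and sort by test accuracy."""
    results = []
    for i, j in combinations(range(len(names)), 2):
        cols = [i, j]
        acc = score(X_train[:, cols], y_train, X_test[:, cols], y_test)
        results.append((acc, names[i], names[j]))
    return sorted(results, reverse=True)

# Synthetic demo: the label depends on "f1" and "f3" jointly, the rest is noise.
rng = np.random.default_rng(0)
names = ["f0", "f1", "f2", "f3", "f4"]
X = rng.random((299, len(names)))
y = (X[:, 1] + X[:, 3] > 1).astype(int)

ranked = rank_feature_pairs(X[:235], y[:235], X[235:], y[235:], names)
print(len(ranked))      # 10 pairs for 5 features
print(ranked[0])        # best-scoring pair and its accuracy
```

For n features the search evaluates n(n-1)/2 pairs, so it stays cheap for a dataset of this size. Replacing `score` with a function that trains and evaluates the NFS on the two selected columns would give the procedure described above.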

Prediction
In this step, the features obtained from the feature extraction step are used to predict the patient's state. This is done by feeding these features alone as input to the NFS again.

Experimental Result
This part presents the results collected from the ANFIS model after every two features are tested together. The performance of the NFS is evaluated with the testing accuracy and the confusion matrix, which are reported only for the most important features in the dataset.

Discussion
This part discusses the experimental results of the proposed system. Across all the experiments, we found that many features give high prediction accuracy when tested together. In particular, when the features ejection fraction and serum creatinine are tested together, a prediction accuracy of 100% is obtained. As a result, these features are considered the most important features in the dataset and can be used alone to predict the patient's state. Doctors could therefore forecast patient survival merely by looking at the ejection fraction and serum creatinine values in the patient's electronic health record.

Conclusion
Heart failure is a very dangerous disease that may lead to death, and its early detection helps to save the patient's life. This paper therefore aims to design a model that manually selects the most important features from the dataset and predicts the patient's state. The prediction is made using a Neuro-Fuzzy System (NFS). The dataset used in this study represents the medical records of patients with heart failure and consists of 12 features. All features were used independently of the time feature. Every pair of features in the dataset is entered manually into the NFS model and monitored to determine which features have the major impact on the accuracy when they are tested together. The experimental results show that 100% accuracy is obtained when serum creatinine and ejection fraction are tested together. These features are therefore considered the most important features in the dataset and can be used alone to predict whether the patient will survive.

Table 2 - The confusion matrix.
• True Positive (TP): It represents the positive samples that are correctly identified by the model.
• False Negative (FN): It represents the positive samples that are wrongly classified by the model.
• False Positive (FP): It represents the negative samples mistakenly classified by the model.
• True Negative (TN): It represents the negative samples correctly classified by the model [15].

Fig. 2 - The proposed system for heart failure prediction.

Fig. 3 - The confusion matrix for testing ejection fraction and serum creatinine.

Fig. 4 - Training loss and validation loss across epochs for testing ejection fraction and serum creatinine.

Table 2 - Description of the elements in the dataset.
Table 3 - NFS accuracy results from testing every two features from the dataset.