Cardiovascular Disease Detection Using MRI Data with Deep Learning Approach

Cardiovascular disease prediction is a critical area of research. In this work we measured left ventricular volume, which plays an important role in cardiac arrest. We contributed a data pre-processing pipeline for CMR images and then applied a deep neural network. The data used in this work combined the Sunnybrook Cardiac Dataset (SCD) and the Cardiac Atlas Project (CAP), both datasets of LV CMR images. The proposed architecture applies convolutional and max-pooling layers and is trained with the Adam optimizer. The results are at a very early stage, and further enhancement could be achieved in the future by applying more efficient pre-processing techniques.


Introduction
The number of people who died of cardiovascular disease (CVD) is estimated at around 17.7 million as of 2015, equal to almost 31% of all deaths worldwide [1]. The annual death rate from CVD is increasing faster than for any other disease. Diagnosis of this disease relies on imaging technology, and developments in medical imaging have produced numerous non-invasive options for investigating CVD, including echocardiography, computed tomography (CT), and cardiovascular magnetic resonance (CMR). Each of these technologies has its advantages and disadvantages. Among them, CMR is considered the most effective because of its high image quality, good contrast for soft tissues, and absence of ionising radiation; CMR has established itself as the non-invasive gold standard for assessing cardiac chamber volume and mass for a wide range of CVD [2]-[4].
Researchers are working hard to develop systems that could reduce the number of deaths from CVD. For this purpose the left ventricle (LV) is considered the core region of interest in heart images, and applying contour detection (segmentation) techniques to this particular area helps to detect CVD successfully. The LV, the biggest chamber in the heart, is the major portion of the heart responsible for maintaining the ejection function. To extract the LV from CMR images, a few critical parameters are required: the ejection fraction (EF), systolic and diastolic volumes, and myocardial mass.
Cardiac diseases are among the leading causes of death in the world [5]. The left ventricle (LV) is the biggest chamber in the heart and plays an important role in maintaining its ejection function. Diagnosis of cardiac disease is performed with the help of a critical index, the ejection fraction (EF). The experimentation in this work is done on the left ventricular EF because it is the main basis for accurate estimation of LV volumes, which comprise the end-diastolic volume (EDV) and the end-systolic volume (ESV). Estimating the LV volume is considered the main criterion for detecting CVD, and LV segmentation technology is applied to estimate this volume on cardiac magnetic resonance (CMR) images. CMR is considered the gold-standard modality for diagnosing cardiac disease because of its high contrast between soft tissues and its non-invasive acquisition [6]. Owing to these advantages, CMR images are used in most cardiac medical image processing tasks, particularly automatic LV segmentation [7]. However, due to the inherent characteristics of CMR images, automatic LV segmentation remains an open challenge [8]: high noise, intensity inhomogeneity, partial volume effects [9], [10], complex topological structures, and great variability across slices. Analysis of cardiac function plays an important role in clinical cardiology for patient management, disease diagnosis, risk evaluation, and therapy decisions [11]. Diagnosis of cardiac disease is performed by applying image processing techniques to CMR images; digital imagery gives access to a set of matching indices calculated from various structures of the heart for accurate cardiac diagnosis. CMR images are widely used for cardiac diagnosis because they support assessment of the left and right ventricular ejection fractions (EF) and stroke volumes (SV), the left ventricular mass, and the myocardium thickness.
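The ejection fraction mentioned above is derived directly from the two LV volumes: EF = (EDV − ESV) / EDV. A minimal illustration of this standard clinical formula (the volume values are hypothetical, not taken from the paper's data):

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Ejection fraction from end-diastolic and end-systolic LV volumes (ml)."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    return (edv_ml - esv_ml) / edv_ml

# Example: EDV = 120 ml, ESV = 50 ml -> EF = 70/120, roughly 0.583 (58.3%)
ef = ejection_fraction(120.0, 50.0)
```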
To calculate the above factors, accurate delineations are needed of the left ventricular endocardium and epicardium, and of the right ventricular endocardium, for both the end-diastolic (ED) and end-systolic (ES) phase instances. Given the scarcity of such advanced techniques and the fluctuating accuracy of fully automatic cardiac segmentation methods, clinics still practise manual or semi-automatic segmentation in daily routine. Such manual and semi-automatic segmentation is time-consuming and prone to intra- and inter-observer variability [12]. The difficulties of CMR segmentation have been clearly identified [13]. Recently, machine learning algorithms have shown great potential in visual tasks such as object recognition in natural images [14]. Deep learning in particular is attaining accuracy approaching or exceeding human manual performance and is actively applied to CMR image analysis [18]-[20], Go game playing [15], skin cancer classification [16], and ocular image analysis [17].
In this work we propose a framework to address the problem of estimating the LV volume: 1) a data pre-processing method that achieves data normalization and automatic LV detection; 2) a robust deep convolutional neural network that achieves accurate LV volume estimation.
The remainder of this paper is organized as follows. Section 2 reviews the literature and the methods that have been applied to this task. Section 3 gives details of the dataset used for experimentation. Section 4 describes the relevant deep learning concepts, Section 5 briefly describes the proposed framework, Section 6 presents the model design, Section 7 reports the experimental results, Section 8 provides a discussion, and Section 9 concludes the work.

Literature Review
LV segmentation techniques developed before 2011 can be classified into four categories: 1) methods based on traditional image processing, such as thresholding [9], [18], dynamic programming [19], registration-based methods [20], and graph-based methods [21]; 2) methods based on deformable models, such as snakes and level sets [22]-[25]; 3) methods based on pixel classification, e.g., clustering [26], neural networks [27], and Gaussian mixture models [28]; 4) methods based on prior statistical knowledge, e.g., the active shape model (ASM) [29], the active appearance model (AAM) [30], [31], and the atlas model [32]. A detailed survey of all techniques developed for CMR image segmentation before 2011 is given in [7]. From 2011 onwards, new hybrid techniques evolved that combine two approaches into a new model: for example, [33] proposed an LV segmentation method based on topological stable-state thresholding combined with region-restricted dynamic programming, and [34] used a Gaussian mixture model with region-restricted dynamic programming to segment the LV in CMR images. Subsequently, deep learning was adopted for the LV segmentation task, combined with a deformable model [35]. More advanced neural network techniques followed, such as end-to-end semantic segmentation approaches, including fully convolutional networks (FCN) [36] and U-Net [37], which were also applied to a variety of medical image segmentation tasks [38]; these techniques demonstrate the power of deep learning for clinical practice. Further techniques developed for LV segmentation include the guided random walks method [39], the convex relaxed distribution matching method [40], and the mutual context information method [41]. However, most clinical diagnosis systems for LV segmentation still use semi-automatic approaches, which demand more computation, take more time, and are laborious for doctors analysing large volumes of CMR images.
Additionally, a method was developed to estimate LV volumes directly; this model, based on multi-scale deep belief networks and regression forests, was trained and validated on 100 subjects [42]. Apart from direct and automatic methods, several semi-automatic approaches were also developed, including fuzzy clustering [43], probability atlases [44], deformable models [45], dynamic programming [46], variational and level-set methods [47], active shape models [48], graph cuts [49], and active appearance models [31].

Data
The dataset used in this research combines two publicly available sources, the Sunnybrook Cardiac Dataset (SCD) [50] and the Cardiac Atlas Project (CAP) [51], for a total of 140 patients' MRI studies used for both training and evaluation of the proposed model. The SCD contains 45 MRI studies covering the following classes: healthy, hypertrophy, heart failure with infarction, and heart failure without infarction. This dataset is also known as the 2009 Cardiac MR Left Ventricle Segmentation Challenge data, and some of its images were also used in the automated myocardium segmentation challenge held at the MICCAI workshop in 2009. The complete dataset is now publicly available through the Cardiac Atlas Project. Its 45 CMR studies were acquired as cine steady-state free precession (SSFP) short-axis (SAX) images on a 1.5 T GE Signa MRI scanner. The images were obtained with a breath-hold duration of about 10-15 s and a temporal resolution of 20 cardiac phases over the heart cycle; the mean spatial resolution is 1.36 × 1.36 × 9.04 mm (255 × 255 × 11 pixels), scanned from the ED phase.
The Cardiac Atlas Project subset consists of 95 CMR studies from patients with coronary artery disease and mild-to-moderate left ventricular dysfunction. SSFP images were acquired with breath-holds of 8-15 s, typical slice thickness ≤10 mm, gap ≤2 mm, TR 30-50 ms, TE 1.6 ms, flip angle 60°, FOV 360 mm, and mean spatial resolution of 1.48 × 1.48 × 9.3 mm (245 × 257 × 12 pixels). The short-axis slices cover the whole heart in SAX orientation.

Deep Learning
Deep learning is a subclass of machine learning techniques. It performs representation learning by constructing networks in which features are stored layer by layer. Each layer of the network performs a separate task; these are called hidden layers and consist of numerous neurons. The word "deep" arose from Artificial Neural Networks (ANNs) having numerous hidden layers. The brain and its biological functioning were the inspiration for ANN models. The model, as illustrated in Fig. 5, is a structure made up of input, hidden, and output layers, with the neurons in one layer connected to those of the next layer by connection links. The components of a nerve cell are the dendrites (input), axon (output), nucleus (activation function), and synapses (weights), as shown in Fig. 1. The main part of the biological neuron is the nucleus, where the actual processing takes place; the activation function occupies the analogous position with the same role in the artificial neuron, while the input signal and the corresponding weights of the model are represented in Fig. 1 as the dendrites and synapses respectively. ANNs have a deficiency: they are sensitive to shifts and translations of the input, which degrades classification performance. To overcome these issues, an advanced version of the ANN called the Convolutional Neural Network (CNN) was developed; this architecture provides shift and translation invariance. The structure of the CNN is illustrated in Fig. 2. It is a feed-forward architecture consisting of convolution, pooling, and fully connected layers, briefly explained below.
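The dendrite/synapse/nucleus analogy above maps to a simple computation: a weighted sum of the inputs followed by an activation function. A minimal sketch of one artificial neuron (the input and weight values are illustrative):

```python
import numpy as np

def neuron(inputs, weights, bias):
    """One artificial neuron: inputs (dendrites) are combined via a weighted
    sum (synapses) and passed through an activation function (nucleus)."""
    z = np.dot(inputs, weights) + bias   # weighted sum plus bias
    return max(0.0, z)                   # ReLU activation

x = np.array([0.5, -1.0, 2.0])   # input signal (dendrites)
w = np.array([0.4, 0.3, 0.2])    # connection weights (synapses)
out = neuron(x, w, bias=0.1)     # 0.5*0.4 - 1.0*0.3 + 2.0*0.2 + 0.1, approx. 0.4
```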

Convolution Layer:
All data are first fed into this layer, where each sample is convolved with a particular kernel (weight matrix). The result of convolving the kernel over the sample is a feature map, the actual output of this layer. The stride controls how far the kernel moves across the input between positions. The actual feature extraction is performed by the convolution operation, learning from the input signal; subsequent layers take these features as input and process them further for classification.

International Journal of Computer Electrical Engineering
The fully connected layer is represented graphically in Fig. 3. All neurons in this layer are connected to the previous layer with specific weights, and the output target is estimated as a weighted sum over the activations of the previous layers (Fig. 3).

The Proposed Method
The first operation applied to the data was pre-processing, a standard step in this kind of research. It not only enhances prediction accuracy but also improves the generalization ability of the proposed framework, making it robust to large-scale data. The robustness of the system is influenced by various factors such as scanning parameters, as well as patient factors such as age, gender, and health condition. In this phase we proposed a precise and robust ROI detection technique, and applied a normalization technique to manage the variance in pixel spacing and intensity levels across the large CMR datasets.
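The paper does not spell out the exact pre-processing operations, so the following is a simplified sketch under two assumptions: intensity normalization as a per-slice z-score, and ROI detection stood in for by a fixed centre crop (the LV usually lies near the centre of a short-axis CMR slice). The function names and parameters are illustrative:

```python
import numpy as np

def normalize_intensity(slice_2d):
    """Z-score normalization to reduce intensity variation across scanners and patients."""
    mu, sigma = slice_2d.mean(), slice_2d.std()
    return (slice_2d - mu) / (sigma + 1e-8)

def center_crop(slice_2d, size=64):
    """Crude stand-in for ROI detection: crop a fixed window around the image centre."""
    h, w = slice_2d.shape
    top, left = (h - size) // 2, (w - size) // 2
    return slice_2d[top:top + size, left:left + size]

# Toy 256x256 "CMR slice" with scanner-dependent mean and scale:
slice_ = np.random.default_rng(0).normal(100.0, 20.0, (256, 256))
roi = center_crop(normalize_intensity(slice_), size=64)
print(roi.shape)  # (64, 64)
```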
Pre-processing was followed by the design of the deep learning network. In this phase we assessed the architectures of well-known convolutional networks and enhanced VGG by adopting several highly effective techniques: batch normalization, "Adam" training, and dropout. There is a known relationship between the final prediction accuracy and the number of receptive fields applied in the initial convolution layers; this relationship and its effects guided us in designing an efficient CNN. Finally, for volume prediction, the proposed method gave superior robustness to variances compared with segmentation-based techniques. The framework of the proposed method for LV volume prediction is depicted in Fig. 4a and Fig. 4b. Analogous to the usual deep learning workflow, this framework comprises three main parts: data pre-processing, model training and CNN-based volume prediction, and finally computation of the EF. Our core contributions within this framework concern the data pre-processing techniques, especially normalization and region-of-interest (ROI) detection, and the design of the deep CNN.

Model Design
Research in different application areas has led to the development of various network architectures, driven by the progress of deep learning technology. Three basic architectures underlie most advanced deep learning techniques for classification and regression: VGG-Net [52], GoogLeNet [53], and ResNet [54]. VGG-Net follows the customary CNN style, comprising convolution, pooling, and fully connected layers in succession; GoogLeNet enhances the architecture by including multi-scale convolution layers; ResNet implements an identity-mapping technique to make networks converge faster and avoid vanishing gradients. In this research we assessed these three basic networks and then proposed an enhanced architecture, shown in Fig. 5.
In this work we turned a conventional classification network into an end-to-end regression network by replacing the softmax activation in the final FC layer with rectified linear units, f(x) = max(0, x) (ReLU).
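The difference between the two output heads can be sketched as follows: a softmax head produces class probabilities, whereas the ReLU regression head used here produces a single non-negative value (the volume). The feature and weight values below are illustrative, not the network's learned parameters:

```python
import numpy as np

def softmax(z):
    """Classification head: turns scores into class probabilities."""
    e = np.exp(z - z.max())
    return e / e.sum()

def relu(z):
    """Regression head used here: f(x) = max(0, x), keeping volumes non-negative."""
    return np.maximum(0.0, z)

features = np.array([1.2, -0.7, 3.1])   # output of the last FC layer (illustrative)
w_out = np.array([10.0, 5.0, 20.0])     # single-node regression weights (illustrative)
volume_ml = relu(features @ w_out)      # one non-negative value instead of class scores
```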

Loss = √( (1/N) ∑ᵢ₌₁ᴺ (Xᵢ − Yᵢ)² )
Here N is the number of training examples, Xᵢ is the value predicted for the i-th example, and Yᵢ is the ground truth of the i-th example.
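This root-mean-square error loss translates directly into code (the volume values in the example are illustrative):

```python
import numpy as np

def rmse_loss(y_pred, y_true):
    """Root-mean-square error: sqrt((1/N) * sum_i (X_i - Y_i)^2)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Two predictions, each off by 10 ml: sqrt((100 + 100) / 2) = 10.0
print(rmse_loss([100.0, 60.0], [110.0, 50.0]))  # 10.0
```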
The model was evaluated using RMSE and MRMSE and validated against minimum loss. The dataset was divided into 100 images for training and the remaining 40 for model validation. The following settings were used in the proposed method for volume prediction: the learning rate for "Adam" was set to 0.0001, the batch size to 64, and the number of training epochs to 1000 for EDV or ESV prediction. With these parameters fixed, we evaluated the model in two phases to build an efficient design. In the initial phase we evaluated the three well-known basic CNN architectures discussed above (the 19-layer VGG [52], GoogLeNet [53], and the 50-layer ResNet-50 [54]), each with a single-node layer added at the end. Based on this first-phase evaluation, we enhanced the design of the selected CNN architecture in the second phase, and assessed the prediction outcome obtained by combining the different CNN models.

Experiments and Results
The experimental results discussed in this section show that the proposed method yielded better results in predicting the LV volume. In the experimentation, we trained the proposed CNN architecture in three runs to produce three different prediction models on the same architecture; the mean of their outputs was taken as the final prediction. Results were reported for the prediction of ESV, EDV, and EF. The proposed method attained its best accuracy, as RMSE ± AESD, of 6.3 ± 2.46 (ml) in ESV, 6.6 ± 3.2 (ml) in EDV, and 0.0337 ± 0.012 in EF (P < 0.001). The correlation graphs in Fig. 6(a), (c), and (e) show the correlation between the ground truth and the predicted results; the correlation coefficients (R) reach 0.776, 0.763, and 0.632 respectively.
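The two evaluation measures reported above (RMSE between predicted and true volumes, and the correlation coefficient R) can be computed as follows; the toy volume values here are illustrative, not the paper's data:

```python
import numpy as np

def rmse(pred, truth):
    """Root-mean-square error between predicted and ground-truth volumes."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def pearson_r(pred, truth):
    """Correlation coefficient R between predicted and ground-truth volumes."""
    return float(np.corrcoef(pred, truth)[0, 1])

truth = np.array([120.0, 95.0, 150.0, 80.0, 110.0])   # illustrative EDV ground truth (ml)
pred  = np.array([115.0, 100.0, 145.0, 85.0, 108.0])  # illustrative predictions (ml)
print(rmse(pred, truth), pearson_r(pred, truth))
```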

Discussion and Conclusion
In this research we implemented a direct LV volume assessment technique built on end-to-end deep learning. The major improvements in the proposed technique are the LV detection step and the CNN architecture. In contrast to the traditional approach, which uses segmentation as the tool for LV prediction, we proposed an end-to-end regression method for predicting the LV volume, a task considered very difficult for CMR image segmentation. Based on the results above, the proposed method improves LV volume prediction in accuracy, robustness, and efficiency. Left ventricular estimation is considered the most important factor in cardiac disease; measuring the LV volume can help in predicting the disease and guiding treatment. In this work we applied pre-processing to the Sunnybrook Cardiac Dataset (SCD) and Cardiac Atlas Project (CAP) cardiac image datasets and trained convolutional neural networks on them. The results are not yet fully satisfactory, and we expect to enhance the prediction accuracy by applying further pre-processing techniques, which could also increase processing speed.