Title: Classification of Congestive Heart Failure from ECG Data

Authors: Zeming Liu (zliu236), Jinqian Li (jli633), Yujin Chung (ychung36)

Final reflection: please see the "Additional Info" section, or read it at https://www.overleaf.com/read/bfzckdkpmfky#ef1489

Our dataset: For CHF patients: 1. https://physionet.org/content/chfdb/1.0.0/ 2. https://physionet.org/content/chf2db/1.0.0/ For the normal group: https://physionet.org/content/nsr2db/1.0.0/ The preprocessed data is included in the "Additional Info" section of this submission.

Code/Preprocessed Data link: https://github.com/yujin-ch/CHF-Classification

Note: the final reflection and the preprocessed data are both in the "Additional Info" section. Thank you!

Congestive heart failure (CHF) is a chronic cardiovascular condition associated with dysfunction of the autonomic nervous system (ANS). Heart rate variability (HRV) has been widely used to assess ANS function. Traditional diagnostic methods based on analyzing the electrocardiogram (ECG) are time-consuming and laborious, and the interpretation of the results is subjective. Our work is inspired by Zou et al. [1], whose 1D-MsNet, which consists of a Multiscale module and a Conv1D block with weight sharing to extract features within each one-minute sample, achieved excellent results in detecting sleep apnea syndrome (SAS) patients. Our contribution has three parts. First, because the code for 1D-MsNet is private, we re-implemented it in TensorFlow. Second, because the paper's dataset is also private, we changed the input from PPG data of SAS patients to ECG data of CHF patients and healthy subjects, obtained from the open-source PhysioNet website. Finally, we compared the results with a classical nonlinear screening method previously published by one of our group members (Zeming) [2].

[1] Lang Zou and Guanzheng Liu. Multiscale bidirectional temporal convolutional network for sleep apnea detection based on wearable photoplethysmography bracelet. IEEE Journal of Biomedical and Health Informatics, 2023.

[2] Zeming Liu, Tian Chen, et al. Similarity changes analysis for heart rate fluctuation regularity as a new screening method for congestive heart failure. Entropy, 23(12):1669, 2021.


Check-in 3:

Introduction

What problem are you trying to solve and why?

The problem being addressed is the diagnosis of congestive heart failure (CHF) using electrocardiogram (ECG) data. This condition involves dysfunction of the autonomic nervous system (ANS), and traditional diagnostic methods that analyze ECG data are often time-consuming, labor-intensive, and can be subjective in interpretation. Given these limitations, there is a need for more efficient and objective diagnostic approaches.

The project is inspired by previous work using the 1D-MsNet architecture, which demonstrated effective results in detecting sleep apnea syndrome (SAS) by analyzing photoplethysmogram (PPG) data. The goal is to adapt this approach to CHF diagnosis by modifying the network to work with ECG data, which is a different type of cardiovascular signal, and by validating its effectiveness on an open-source dataset from PhysioNet. This approach aims to provide a faster, automated method for diagnosing CHF that is less dependent on manual effort and subjective interpretation, potentially leading to more consistent and accessible CHF screening.

If you are implementing an existing paper, describe the paper’s objectives and why you chose this paper

Yes, we are implementing an existing paper, "Multiscale Bidirectional Temporal Convolutional Network for Sleep Apnea Detection Based on Wearable Photoplethysmography Bracelet". Its objective is to develop an unobtrusive method that uses wearable devices to diagnose and treat sleep apnea syndrome (SAS) early.

The choice of this paper is motivated by our collective interest in how technology can aid in universalizing the application of medical diagnostic techniques that might otherwise be limited by traditional, labor-intensive methods. Specifically, the use of the 1D-MsNet model represents an innovative approach in the field of medical diagnostics by potentially transforming the way congestive heart failure (CHF) is diagnosed. By leveraging this advanced neural network model, originally applied in sleep apnea syndrome, to interpret ECG data for CHF, we aim to develop a method that is not only more efficient but also more accessible and objective, aligning with our broader interest in how technology can bridge gaps in healthcare accessibility.

What kind of problem is this? Classification? Regression? Structured prediction? Reinforcement Learning? Unsupervised Learning? etc.

The problem addressed in the paper is primarily a classification problem. Because the paper's dataset is private, we changed the input from PPG data of SAS patients to ECG data of CHF patients and healthy subjects, obtained from the open-source PhysioNet website. More specifically, the task is framed as supervised learning: the model is trained on a labeled dataset in which each label denotes the presence or absence of CHF in the ECG segments. This ensures that the model learns from examples with known outcomes so that it can make predictions on new, unseen data.

Related Work

Are you aware of any, or is there any prior work that you drew on to do your project?

Our team includes two biomedical students and one neuroscience student, so we are familiar with sleep, physiological mechanisms, and neural events. More specifically, Zeming Liu, one of our group members, wrote his undergraduate thesis on screening for congestive heart failure patients (https://www.mdpi.com/1099-4300/23/12/1669). The background of that work is similar to that of the paper: both aim to evaluate autonomic nervous system (ANS) function.

Please read and briefly summarize (no more than one paragraph) at least one paper/article/blog relevant to your topic beyond the paper you are re-implementing/novel idea you are researching

The paper "Automatic Detection of Congestive Heart Failure Based on Multiscale Residual UNet++: From Centralized Learning to Federated Learning" is inspired by the outstanding performance of U-shaped networks in medical image segmentation. In this article, the authors propose a novel end-to-end classification model based on 2000 intervals between successive R-peaks of ECG signals. The proposed model integrates the outputs of encoders, decoders, and intermediate units through a unified scale operation, which not only preserves low-level details from the input signals but also extracts high-level pathology-related information.

Data

The paper's dataset is private, so we cannot obtain the same data; we only know its format. More specifically, each patient's data is separated into 1-minute segments, and each segment has a corresponding label, 0 or 1, to represent the absence (0) or presence (1) of sleep apnea events. Based on this, we used data from the open-source PhysioNet website (https://physionet.org/). The 24-h RR interval signals of 54 healthy subjects (31 males and 23 females, aged 61.38 ± 11.63 years) were collected from the Normal Sinus Rhythm RR Interval database, and those of 44 congestive heart failure (CHF) subjects (19 males and 6 females; the gender of 19 subjects was unknown; aged 55.51 ± 11.44 years) were acquired from the Beth Israel Deaconess Medical Center (BIDMC) Congestive Heart Failure database (15 subjects) and the Congestive Heart Failure RR Interval database (29 subjects).

To make this dataset usable with our network, we preprocessed it. To reduce interference from outliers, we removed the first and last RR intervals from each patient's 24-hour record and removed RR intervals greater than three seconds. In addition, to make our data conform to the network structure in the paper, we segmented the data according to the paper's method: we split the original data into 100-point segments and grouped every 25 segments into one complete batch. The final structure of the input data is (batch, 25, 100). Since the original dataset has no labels, we added them ourselves: 1 for CHF patients and 0 for healthy subjects. Each batch receives one label, so the final structure of the labels is (batch, 1). Finally, we mixed the batches of CHF patients and healthy subjects together, divided all the data equally into a training set and a test set, and used them for our network.
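The preprocessing steps described above can be sketched in a few lines of plain Python. This is a minimal illustration, not our actual pipeline: the function names (clean_rr, make_samples) and the synthetic record are hypothetical, and dropping a trailing incomplete group of intervals is an assumption.

```python
def clean_rr(rr_seconds, max_rr=3.0):
    """Drop the first and last interval, then drop intervals above max_rr seconds."""
    trimmed = rr_seconds[1:-1]
    return [rr for rr in trimmed if rr <= max_rr]


def make_samples(rr_seconds, label, seg_len=100, segs_per_sample=25):
    """Cut one record into samples of shape (segs_per_sample, seg_len), one label each."""
    rr = clean_rr(rr_seconds)
    sample_len = seg_len * segs_per_sample  # 2500 intervals per sample
    samples, labels = [], []
    # Assumption: intervals left over after the last full sample are discarded.
    for start in range(0, len(rr) - sample_len + 1, sample_len):
        block = rr[start:start + sample_len]
        segments = [block[i:i + seg_len] for i in range(0, sample_len, seg_len)]
        samples.append(segments)
        labels.append([label])
    return samples, labels


# Synthetic record: 5,002 intervals of 0.8 s plus one 3.5 s outlier.
record = [0.8] * 2000 + [3.5] + [0.8] * 3002
x, y = make_samples(record, label=1)  # label 1 = CHF, 0 = normal
print(len(x), len(x[0]), len(x[0][0]))  # 2 25 100
print(y)  # [[1], [1]]
```

Stacking the samples from all records gives the (batch, 25, 100) input and the (batch, 1) label structure.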

Methodology

The architecture comprises:

Multiscale feature extraction: Uses multiple convolutional scales to extract relevant features from the PPG signal at different temporal resolutions.

Bidirectional temporal convolution: Integrates both forward and backward temporal context to improve the prediction of sleep apnea events. (In the end, we did not implement this part because of time constraints.)
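To make the multiscale idea concrete, here is a minimal sketch in plain Python: parallel 1D convolutions with different kernel sizes are applied to the same segment, and their outputs are stacked as channels. The kernel sizes (3, 5, 7) and the averaging weights are placeholders chosen for illustration; they are not taken from the paper, and in a real network the weights are learned.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; expects an odd kernel length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def multiscale_features(segment, kernel_sizes=(3, 5, 7)):
    """Run one convolution branch per kernel size and stack the outputs as channels."""
    branches = []
    for k in kernel_sizes:
        kernel = [1.0 / k] * k  # placeholder averaging weights; a real layer learns these
        branches.append(conv1d_same(segment, kernel))
    return branches  # shape: (len(kernel_sizes), len(segment))


# Hypothetical 100-point RR segment.
segment = [0.8 + 0.01 * (i % 10) for i in range(100)]
features = multiscale_features(segment)
print(len(features), len(features[0]))  # 3 100
```

Small kernels respond to beat-to-beat changes, while larger kernels summarize slower trends, which is the intuition behind extracting features at several temporal resolutions.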

Training the Model

The model training incorporates several key strategies:

Regularized Dropout (RD): Aims to reduce overfitting by adjusting the dropout during the training phase.

Logit Adjustment: Addresses class imbalance by modifying the logits for the minority class, enhancing model sensitivity.

Loss Functions: Uses a combination of custom-tailored loss functions to handle class imbalance and improve model robustness.
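As an illustration of logit adjustment, the sketch below uses one common formulation: each class logit is shifted by tau * log(prior) before the cross-entropy is computed, so mistakes on the rarer class cost more. This is an assumption about the formulation, not necessarily the exact variant used in the paper, and the class priors are hypothetical.

```python
import math


def logit_adjusted_cross_entropy(logits, label, class_priors, tau=1.0):
    """Cross-entropy on logits shifted by tau * log(prior) for each class."""
    adjusted = [z + tau * math.log(p) for z, p in zip(logits, class_priors)]
    peak = max(adjusted)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(a - peak) for a in adjusted))
    return log_norm - adjusted[label]


priors = [0.8, 0.2]  # hypothetical: 80% normal (0), 20% CHF (1)
logits = [0.0, 0.0]  # an undecided model output
loss_minority = logit_adjusted_cross_entropy(logits, 1, priors)
loss_majority = logit_adjusted_cross_entropy(logits, 0, priors)
print(loss_minority > loss_majority)  # True
```

With tau set to 0 the function reduces to the ordinary softmax cross-entropy, which makes it easy to compare training with and without the adjustment.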

Challenges in Implementation

Implementing this model presents several challenges:

  1. Complex Feature Extraction: Balancing the multiscale extraction process without compromising the model’s efficiency.
  2. Temporal Dependencies: Effectively learning and utilizing the temporal dependencies in a noisy PPG signal environment.
  3. Class Imbalance: Managing the prevalence of non-apneic versus apneic segments and achieving high sensitivity without sacrificing specificity.
  4. Computational Resources: Handling the computational demands of training and deploying a sophisticated deep learning model.

Metrics for Success

We measure the success of the project against the following goals:

  1. Base Goal: Complete preprocessing and train a simple 1D AlexNet model for classification.
  2. Target Goal: Train the 1D-MsNet architecture and use the 1D AlexNet as a standard of comparison.
  3. Stretch Goal: Train other architectures and attempt to beat the paper's classification accuracy, also comparing with nonlinear methods.

Planned Experiments

Model Validation: Using data from 98 subjects split into training and testing sets.

Performance Comparison: Benchmark against baseline methods such as 1D AlexNet.

Cross-Validation: Implementing k-fold cross-validation to test robustness.
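A minimal sketch of how k-fold splitting could be set up for the 98 subjects is shown below. Splitting by subject rather than by segment is an assumption made here so that no subject contributes data to both the training and test side of a fold; the plan above does not specify this. The subject IDs, k = 5, and the seed are hypothetical.

```python
import random


def subject_kfold(subject_ids, k=5, seed=0):
    """Yield (train_ids, test_ids) pairs; each subject is tested exactly once."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # k nearly equal folds
    for i, test_ids in enumerate(folds):
        train_ids = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train_ids, test_ids


subjects = [f"subject_{n:02d}" for n in range(98)]  # hypothetical IDs for the 98 subjects
for train_ids, test_ids in subject_kfold(subjects):
    # Three folds hold out 20 subjects and two hold out 19.
    print(len(train_ids), len(test_ids))
```

Each fold's model would then be trained on the segments of train_ids and evaluated on the segments of test_ids, and the metrics averaged across folds.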

Appropriateness of Accuracy

While accuracy is commonly used, in medical diagnostics such as CHF detection, metrics such as sensitivity, specificity, and AUC are crucial for balancing the trade-off between missing true cases (false negatives) and raising false alarms (false positives), which makes them more appropriate than accuracy alone.
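The metrics named above can be computed as in the following sketch. The labels and scores are made-up example values, not results from our model, and the 0.5 decision threshold is an assumption. AUC is computed here as the probability that a randomly chosen positive receives a higher score than a randomly chosen negative.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); assumes both classes are present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Fraction of positive/negative pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 3 CHF samples (1) and 5 normal samples (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
print(round(sens, 3), round(spec, 3), round(auc(y_true, scores), 3))  # 0.667 0.8 0.867
```

In this example the accuracy is 0.75, yet one in three CHF samples is missed, which is the kind of gap that accuracy alone hides.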

Broader Societal Issues

The use of wearable devices for health monitoring, specifically for the detection of CHF, intersects with several broader societal issues:

1. Healthcare Access: By enabling early and more accessible diagnosis of CHF, wearable technology can potentially reduce healthcare disparities. However, access to such technology is also unevenly distributed, which could perpetuate or even widen existing health disparities.

2. Data Privacy: The collection of health data through wearable devices raises significant privacy concerns. Ensuring the security and confidentiality of sensitive health data is crucial to maintain trust and protect individuals’ privacy.

Dataset Concerns and Representation

The dataset used for training the CHF detection model raises several ethical considerations:

  1. Data Collection and Labeling: The integrity of the data collection process and the accuracy of labeling are critical. Mislabeling can lead to incorrect training of the model, affecting its reliability.

  2. Representativeness: If the dataset does not adequately represent the diverse populations affected by CHF, the model may not perform equally well across different demographic groups, leading to biased outcomes.

  3. Societal Biases: There is a risk that the dataset may reflect historical or societal biases, such as underrepresentation of minority groups in clinical trials, which could skew the model’s predictions.

Division of Labor

Zeming Liu

Responsibilities:

  1. Data Preprocessing: Extract RR intervals from raw ECG data and preprocess the data (remove the first and last RR intervals from each patient's 24-hour record and remove RR intervals greater than three seconds).

  2. Implement the 1-D convolutional block of the model.

  3. Apply the fApEn_IBS nonlinear model from his previous paper and compare the accuracy of the deep learning model with that of the nonlinear model in screening CHF patients.

Yujin Chung

Responsibilities:

  1. Double check the data and ensure the data is correctly formatted and ready for use in the model.

  2. Implement the multiscale feature extraction aspects of the model.

Jinqian Li

Responsibilities:

  1. Make the data conform to the network structure in the paper by segmenting it according to the paper's method (split the original data into 100-point segments and group every 25 segments into one complete batch).

  2. Implement the Squeeze-and-excitation block of the model.
