Site-Aware Federated Learning via Embedding and Resampling with Electrocardiograms

Modern machine learning (ML) methods perform remarkably well across a number of diagnostic tasks. Despite this performance, the integration of ML methods in healthcare is relatively limited. While there are a variety of reasons for this, it is notable that most approaches ignore additional constraints that must be met in the healthcare setting. In particular, there may be a relative paucity of data from any single institution; therefore, collaboration is necessary in order to amass a dataset suitable for ML. Furthermore, data may be heterogeneous, with different labels and different input dimensions. Finally, respecting patient privacy is paramount. In this study, we train a classifier under the assumptions of (1) data distributed across multiple institutions, (2) highly heterogeneous data, and (3) a requirement for patient privacy. We enable site-awareness by combining a global average pooling module, which captures high-level information about electrocardiogram (ECG) recording methods, with a ResNet that encodes specific features in ECGs, and we demonstrate that the proposed site-aware ResNet (SA-ResNet) outperforms other state-of-the-art approaches in cardiovascular disease diagnosis. On a highly heterogeneous dataset constructed from three independent datasets distributed unevenly across seven institutions, the proposed model achieves an accuracy, precision, recall, and F1 score of 76.3%, 69.5%, 76.8%, and 73.0%, respectively.

Despite the impressive results ML methods achieve across a wide variety of healthcare domains, there are several barriers to their implementation. When an ML model is trained, several idealized assumptions are made that may not reflect the actual environment in which a model deployed in a healthcare setting is trained. The first assumption is that of data uniformity.
Typically, a (supervised) model is trained on a dataset $D = \{(X_i, y_i)\}_{i=1}^{N} \subset \mathbb{R}^{n \times m} \times \mathbb{R}^{L}$, where $X_i \in \mathbb{R}^{n \times m}$ is a feature tensor and $y_i \in \mathbb{R}^{L}$ is a corresponding label. For instance, in the setting of electrocardiogram (ECG) interpretation, $X_i \in \mathbb{R}^{12 \times 3000}$ might be a recording of a six-second, 12-lead ECG obtained from a patient at a sampling rate of 500 Hertz (6 s at 500 Hz gives 3000 samples per lead). The label $y_i \in \mathbb{R}^{L}$ would then correspond to a diagnosis typically made by a cardiologist out of a possible $L$ diagnoses. However, in order to train a model capable of using all recorded ECG data from even a single institution, it may be necessary to handle input corresponding to variable leads, sampling rates, and duration of recording.

Another common assumption is that all data is available in a single repository. ML models are typically trained with this assumption, since publicly available datasets are often amalgamated and made available in repositories such as PhysioNet [11]. However, in order to achieve better generalizability, it is desirable to have access to a greater amount of data [10], and, depending on the institution size and type of test, creating a comparable dataset might require data from many institutions.

Finally, the training of ML models does not typically respect patient privacy. While publicly available datasets are convenient with respect to developing prototypical models, in order to make use of data from healthcare institutions, it is imperative that patient privacy be respected. The right to privacy is a key aspect of the doctor–patient relationship, and therefore respecting patient privacy during ML model training is both an ethical and legal issue [12, 13]. The lack of patient privacy in training ML models is viewed as a risk that must be mitigated before ML models can be deployed in a healthcare setting [14, 15].
In addition, failures to respect patient privacy can have negative outcomes for patients: it has been shown that privacy violations can have negative impacts on patient treatment and diagnosis [16]. Therefore, respecting patient privacy in ML is crucial not only from an ethicolegal perspective, but also with respect to actual patient health outcomes. The accompanying figure gives a graphical overview of these assumptions.

With the aim of reconciling the training of ML models with the healthcare settings in which they might be deployed, cardiology stands out as a promising domain for a variety of reasons. An ECG is a commonly ordered test, with over 100 million ECGs recorded annually in the United States [17]. An ECG uses electrodes to measure depolarizations of the heart. These depolarizations occur cyclically, and vary predictably in different pathologies. For instance, Mobitz type II heart block typically presents with a regular PR interval, but QRS complexes are dropped in a regular pattern [18]. Because of the large volume of ECGs that are available with corresponding diagnoses, and their regular variations, ML is a promising approach.

The main contributions of this study are as follows: (1) we propose a site-aware model for automated ECG diagnosis, combining a ResNet with global average pooling to achieve encoding of information at the level of each ECG (using the ResNet) and each institution (using global average pooling); (2) we drop idealized assumptions about the use of machine learning in healthcare, ensuring a model training process that generalizes over arbitrary institutions and highly heterogeneous data, and respects patient privacy; (3) we propose a general method of embedding data into higher dimensions such that it can be used to train a model, and discuss ways of reducing dimensions (specific to ECGs) to enable faster model training; and (4) we improve over previous state-of-the-art methods.

The rest of the paper is structured as follows.
In Section 2, we give an overview of recent works in automated ECG diagnosis, covering both studies that do not address data heterogeneity, patient privacy, or multi-institution data and studies that address some or all of these concerns. Subsequently, in Section 3, we give an overview of the construction of our dataset and how this dataset can be embedded into higher-dimensional space and resampled to allow for more efficient model training, and we discuss the proposed model. In Section 4, we present the results, compare them to other state-of-the-art approaches, and discuss implications. Finally, in Section 5, we conclude by framing the problem, potential approaches, and possible avenues for future work.

2. Related Works

A large variety of cardiac pathologies can be diagnosed using machine learning. For instance, in [27], the authors used an algorithmic approach based on noise filtering and a rule-based classifier to identify pathologies presenting with abnormalities in P waves, QRS complexes (including delta waves), and T waves. In [28], the authors used a convolutional neural network (CNN) to detect ECG wave anomalies via a wearable device, but the approach was limited by its inability to differentiate between different types of cardiovascular diseases. Also, in [20], the authors used a CNN-based classifier to identify three pathological beats (ventricular, supraventricular, and fusion beats) using a one-lead ECG dataset. Furthermore, in [29], a lightweight CNN architecture was used to predict a history of myocardial infarction and current myocardial infarction, along with abnormal heartbeats, with an accuracy of 98.23%. Additionally, in [30], the authors used a binary CNN-based algorithm to detect ventricular and non-ventricular ectopic beats using a one-lead ECG dataset. In [31], a deep dense neural network (DNN) was used as the model and was shown to diagnose ECGs with high accuracy.
Notably, in [31], the authors used patient ECG data from over three hundred hospitals along with ECGs from wearable devices. In [32], the authors used four neural network techniques, including a scaled conjugate gradient backpropagation neural network (SCG NN) and a Levenberg–Marquardt neural network (LMNN), to detect left ventricular hypertrophy, with a highest reported accuracy of 97.8%. In [33], the authors used machine learning classifiers to perform automatic arrhythmia classification. The types of classifiers included support vector machines (SVMs), k-nearest neighbors (kNNs), gradient boosted decision trees (GBDTs), and random forests (RFs). In [34], the authors used the Fourier decomposition method to detect atrial fibrillation using two datasets (MIT-BIH atrial fibrillation and arrhythmia). In [35], the authors used an ensembled SVM classifier to classify heartbeats with higher accuracy than a single SVM, kNNs, RF, and long short-term memory units (LSTMs). In [36], the authors used a random forest classifier to predict cardiovascular abnormalities in 2 s and 5 s segments of lead II ECGs using two datasets.

The table of recent studies in automated ECG diagnosis highlights previous work that uses machine learning for automated ECG diagnosis. Notably, all of this research uses centralized data. Most of the literature uses one dataset, with one study using three datasets. Despite the high accuracy of automated ECG diagnosis, these results may not be clinically applicable, because clinical data is distributed across sites. When data is distributed, data privacy limits how accessible the data is for training machine learning models. Furthermore, the assumption of uniform data distribution across various sites may not reflect reality. However, some recent research has addressed these concerns, to varying degrees.
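The distributed-data setting described above is commonly handled with federated averaging (FedAvg), in which each site trains locally and shares only model parameters, never raw ECGs. The following is a minimal sketch of the aggregation step only, assuming parameters are plain lists of floats keyed by name; the function name `federated_average`, the parameter names, and the toy site sizes are hypothetical and are not taken from the study's implementation.

```python
from typing import Dict, List


def federated_average(
    client_weights: List[Dict[str, List[float]]],
    client_sizes: List[int],
) -> Dict[str, List[float]]:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Dict[str, List[float]] = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Weighted mean of the j-th entry of this parameter over all clients.
        averaged[name] = [
            sum(w[name][j] * n for w, n in zip(client_weights, client_sizes)) / total
            for j in range(length)
        ]
    return averaged


# Toy example with two hypothetical sites holding 100 and 300 training samples.
site_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
site_b = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}
global_model = federated_average([site_a, site_b], [100, 300])
print(global_model)
```

Because the larger site contributes three quarters of the samples, the aggregated parameters sit three quarters of the way from site A's values to site B's. Only the parameter dictionaries cross institutional boundaries in this scheme.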
In addition to the recent studies that use machine learning on ECGs to diagnose cardiovascular abnormalities, other studies focus on using federated learning to make ECG diagnoses. In [19], an asynchronous CNN-based lightweight model was used to make an arrhythmia diagnosis in the context of federated learning. In [20], a CNN-based autoencoder and a classifier were used to denoise and classify raw ECG time series data using transfer learning. In [21], a federated learning scheme with one classifier per class was used, along with feature extraction using a one-dimensional convolutional layer. In [22], federated learning was used to train on data distributed across multiple medical centers without data sharing. In [23], heterogeneously distributed 12-lead ECG datasets from different centers were used to train AI models using federated learning. In [24], transfer learning strategies were used to predict myocardial injury based on a one-lead ECG, using a model pretrained on 12-lead ECGs. In [25], non-contact sensors were used to augment privacy in federated learning. In [26], an autoencoder was used to enable federated learning on data with disparate dimensions.

The table of recent studies using federated learning to train ECG classifiers gives an overview of these studies. Notably, the majority of studies that use federated learning assume relatively similar data across all clients. Of all studies in that table, only one [26] uses datasets with different sampling rates, numbers of leads, and different classes across clients, while still training a model that protects patient privacy. We improve on this study by proposing a model that outperforms previous approaches, while still dealing with different leads, classes, and sampling rates over clients.

4. Results and Discussion

4.1. Modeling Results

The results table gives an overview of the results. Previous approaches made use of autoencoders with various encoding dimensions [26]; for easier comparison, we have taken the best results for each tested algorithm over all encoding dimensions.
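As a quick arithmetic check on the reported metrics, the F1 score can be recomputed from precision and recall as their harmonic mean, F1 = 2PR / (P + R). The sketch below applies this to the mean precision and recall reported for each algorithm. It assumes that the F1 of the mean precision and recall is close to the mean F1 over runs, which holds only approximately; the helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (mean precision, mean recall, reported mean F1) as listed in the results table.
reported = {
    "FedAvg": (0.581, 0.715, 0.641),
    "Genetic CFL": (0.314, 0.515, 0.390),
    "FedCHT": (0.683, 0.738, 0.709),
    "SA-ResNet": (0.695, 0.768, 0.730),
}

for name, (p, r, f1) in reported.items():
    computed = f1_score(p, r)
    print(f"{name}: computed F1 = {computed:.3f}, reported F1 = {f1:.3f}")
```

For all four algorithms the recomputed value agrees with the reported F1 score to three decimal places, so the table's columns are internally consistent under this assumption.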
We note that the proposed approach achieves an improvement in each tested category, with a significantly better accuracy and recall than the previous best method. We used t-tests with a level of significance of α = 0.05 to compare the proposed model with the next most performant model. The differences in accuracy, recall, and F1 score were statistically significant (p [...] 90%, and at times <99%), which are potentially overly optimistic given idealized assumptions.

Broadly, we may frame the current approaches to this modeling problem as two sub-problems. The first is the method by which data is transformed such that it can be used in conjunction with federated learning to train a model. While we proposed embedding and resampling, other approaches have used autoencoders. Therefore, future work might investigate other methods of data transformation to arrive at a common input dimension. The second sub-problem is the modeling problem. While the current study uses ResNets and average pooling for site-awareness, there are a multitude of architectures that could be evaluated. Overall, we note that although ML approaches have achieved very impressive results on a number of tasks in the healthcare domain, current models may not be readily deployable due to a number of additional constraints that have not been considered. Thus, we hope that the current study will motivate more work in machine learning that could be better applied in the healthcare domain.

Author Contributions

W.C.: Conceptualization, Methodology, Software, Formal Analysis, Investigation, Visualization, Data Curation, Writing – First Draft, Writing – Review and Editing. S.H.L.: Data Curation, Writing – First Draft, Writing – Review and Editing. H.W.: Supervision, Writing – Review and Editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.
Institutional Review Board Statement

This study used only existing, publicly available, de-identified data hosted on PhysioNet. The database is fully anonymized and complies with the Health Insurance Portability and Accountability Act (HIPAA). Therefore, institutional review board (IRB) approval and informed consent were not required for this analysis. Access to the data was granted after completion of the required data use agreement and training as stipulated by PhysioNet.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this report is freely available via PhysioNet.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Markit, I. The Complexities of Physician Supply and Demand: Projections from 2015 to 2030; Association of American Medical Colleges: Washington, DC, USA, 2017.
Zheng, Q.; Yang, L.; Zeng, B.; Li, J.; Guo, K.; Liang, Y.; Liao, G. Artificial intelligence performance in detecting tumor metastasis from medical radiology imaging: A systematic review and meta-analysis. EClinicalMedicine 2021, 31, 100669.
Yacoub, B.; Kabakus, I.M.; Schoepf, U.J.; Giovagnoli, V.M.; Fischer, A.M.; Wichmann, J.L.; Martinez, J.D.; Sharma, P.; Rapaka, S.; Sahbaee, P.; et al. Performance of an artificial intelligence-based platform against clinical radiology reports for the evaluation of noncontrast chest CT. Acad. Radiol. 2022, 29, S108–S117.
Bai, H.X.; Wang, R.; Xiong, Z.; Hsieh, B.; Chang, K.; Halsey, K.; Tran, T.M.L.; Choi, J.W.; Wang, D.C.; Shi, L.B.; et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT. Radiology 2020, 296, E156–E165.
Lopez-Jimenez, F.; Attia, Z.; Arruda-Olson, A.M.; Carter, R.; Chareonthaitawee, P.; Jouni, H.; Kapa, S.; Lerman, A.; Luong, C.; Medina-Inojosa, J.R.; et al. Artificial intelligence in cardiology: Present and future. In Proceedings of the Mayo Clinic Proceedings; Elsevier: Amsterdam, The Netherlands, 2020; Volume 95, pp. 1015–1039.
Chorney, W.; Wang, H.; Fan, L.W. AttentionCovidNet: Efficient ECG-based diagnosis of COVID-19. Comput. Biol. Med. 2024, 168, 107743.
Chorney, W.; Wang, H.; He, L.; Lee, S.; Fan, L.W. Convolutional block attention autoencoder for denoising electrocardiograms. Biomed. Signal Process. Control 2023, 86, 105242.
Taylor, R.A.; Chmura, C.; Hinson, J.; Steinhart, B.; Sangal, R.; Venkatesh, A.K.; Xu, H.; Cohen, I.; Faustino, I.V.; Levin, S. Impact of Artificial Intelligence–Based Triage Decision Support on Emergency Department Care. NEJM AI 2025, 2, AIoa2400296.
Baker, A.; Perov, Y.; Middleton, K.; Baxter, J.; Mullarkey, D.; Sangar, D.; Butt, M.; DoRosario, A.; Johri, S. A comparison of artificial intelligence and human doctors for the purpose of triage and diagnosis. Front. Artif. Intell. 2020, 3, 543405.
Cho, J.; Lee, K.; Shin, E.; Choy, G.; Do, S. How much data is needed to train a medical image deep learning system to achieve necessary high accuracy? arXiv 2015, arXiv:1511.06348.
Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.K.; Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation 2000, 101, e215–e220.
Gerke, S.; Minssen, T.; Cohen, G. Ethical and legal challenges of artificial intelligence-driven healthcare. In Artificial Intelligence in Healthcare; Elsevier: Amsterdam, The Netherlands, 2020; pp. 295–336.
Dalton-Brown, S. The ethics of medical AI and the physician-patient relationship. Camb. Q. Healthc. Ethics 2020, 29, 115–121.
Bartoletti, I. AI in healthcare: Ethical and privacy challenges. In Proceedings of the Artificial Intelligence in Medicine: 17th Conference on Artificial Intelligence in Medicine, AIME 2019, Poznan, Poland, 26–29 June 2019, Proceedings 17; Springer: Berlin/Heidelberg, Germany, 2019; pp. 7–10.
Kumar, P.; Dwivedi, Y.K.; Anand, A. Responsible artificial intelligence (AI) for value formation and market performance in healthcare: The mediating role of patient’s cognitive engagement. Inf. Syst. Front. 2023, 25, 2197–2220.
Newaz, A.I.; Sikder, A.K.; Rahman, M.A.; Uluagac, A.S. A survey on security and privacy issues in modern healthcare systems: Attacks and defenses. ACM Trans. Comput. Healthc. 2021, 2, 27.
Smulyan, H. The computerized ECG: Friend and foe. Am. J. Med. 2019, 132, 153–160.
Barold, S.S.; Herweg, B. Mobitz type II second-degree atrioventricular block: A commonly overdiagnosed and misinterpreted arrhythmia. Front. Cardiovasc. Med. 2024, 11, 1450705.
Sakib, S.; Fouda, M.M.; Fadlullah, Z.M.; Abualsaud, K.; Yaacoub, E.; Guizani, M. Asynchronous federated learning-based ECG analysis for arrhythmia detection. In Proceedings of the 2021 IEEE International Mediterranean Conference on Communications and Networking (MeditCom); IEEE: Piscataway, NJ, USA, 2021; pp. 277–282.
Raza, A.; Tran, K.P.; Koehl, L.; Li, S. Designing ECG monitoring healthcare system with federated transfer learning and explainable AI. Knowl. Based Syst. 2022, 236, 107763.
Sun, L.; Wu, J. A scalable and transferable federated learning system for classifying healthcare sensor data. IEEE J. Biomed. Health Inform. 2022, 27, 866–877.
Goto, S.; Solanki, D.; John, J.E.; Yagi, R.; Homilius, M.; Ichihara, G.; Katsumata, Y.; Gaggin, H.K.; Itabashi, Y.; MacRae, C.A.; et al. Multinational federated learning approach to train ECG and echocardiogram models for hypertrophic cardiomyopathy detection. Circulation 2022, 146, 755–769.
Jimenez Gutierrez, D.M.; Hassan, H.M.; Landi, L.; Vitaletti, A.; Chatzigiannakis, I. Application of federated learning techniques for arrhythmia classification using 12-lead ECG signals. In Proceedings of the International Symposium on Algorithmic Aspects of Cloud Computing; Springer: Berlin/Heidelberg, Germany, 2023; pp. 38–65.
Jin, B.T.; Palleti, R.; Shi, S.; Ng, A.Y.; Quinn, J.V.; Rajpurkar, P.; Kim, D. Transfer learning enables prediction of myocardial injury from continuous single-lead electrocardiography. J. Am. Med. Inform. Assoc. 2022, 29, 1908–1918.
Hwang, T.H.; Shi, J.; Lee, K. Enhancing Privacy-Preserving Personal Identification Through Federated Learning With Multimodal Vital Signs Data. IEEE Access 2023, 11, 121556–121566.
Chorney, W.; Wang, H. Towards federated transfer learning in electrocardiogram signal analysis. Comput. Biol. Med. 2024, 170, 107984.
Monedero, I. A novel ECG diagnostic system for the detection of 13 different diseases. Eng. Appl. Artif. Intell. 2022, 107, 104536.
Alimbayeva, Z.; Alimbayev, C.; Ozhikenov, K.; Bayanbay, N.; Ozhikenova, A. Wearable ECG Device and Machine Learning for Heart Monitoring. Sensors 2024, 24, 4201.
Abubaker, M.B.; Babayiğit, B. Detection of cardiovascular diseases in ECG images using machine learning and deep learning methods. IEEE Trans. Artif. Intell. 2022, 4, 373–382.
Wong, D.L.T.; Li, Y.; John, D.; Ho, W.K.; Heng, C.H. An energy efficient ECG ventricular ectopic beat classifier using binarized CNN for edge AI devices. IEEE Trans. Biomed. Circuits Syst. 2022, 16, 222–232.
Cai, W.; Chen, Y.; Guo, J.; Han, B.; Shi, Y.; Ji, L.; Wang, J.; Zhang, G.; Luo, J. Accurate detection of atrial fibrillation from 12-lead ECG using deep neural network. Comput. Biol. Med. 2020, 116, 103378.
Jothiramalingam, R.; Jude, A.; Patan, R.; Ramachandran, M.; Duraisamy, J.H.; Gandomi, A.H. Machine learning-based left ventricular hypertrophy detection using multi-lead ECG signal. Neural Comput. Appl. 2021, 33, 4445–4455.
Hassaballah, M.; Wazery, Y.M.; Ibrahim, I.E.; Farag, A. ECG heartbeat classification using machine learning and metaheuristic optimization for smart healthcare systems. Bioengineering 2023, 10, 429.
Udawat, A.S.; Singh, P. An automated detection of atrial fibrillation from single-lead ECG using HRV features and machine learning. J. Electrocardiol. 2022, 75, 70–81.
Pandey, S.K.; Janghel, R.R.; Vani, V. Patient specific machine learning models for ECG signal classification. Procedia Comput. Sci. 2020, 167, 2181–2190.
Pham, T.H.; Sree, V.; Mapes, J.; Dua, S.; Lih, O.S.; Koh, J.E.; Ciaccio, E.J.; Acharya, U.R. A novel machine learning framework for automated detection of arrhythmias in ECG segments. J. Ambient Intell. Humaniz. Comput. 2021, 12, 10145–10162.
LaMori, J.C.; Mody, S.H.; Gross, H.J.; daCosta DiBonaventura, M.; Patel, A.A.; Schein, J.R.; Nelson, W.W. Burden of comorbidities among patients with atrial fibrillation. Ther. Adv. Cardiovasc. Dis. 2013, 7, 53–62.
Zhu, H.; Cheng, C.; Yin, H.; Li, X.; Zuo, P.; Ding, J.; Lin, F.; Wang, J.; Zhou, B.; Li, Y.; et al. Automatic multilabel electrocardiogram diagnosis of heart rhythm or conduction abnormalities with deep learning: A cohort study. Lancet Digit. Health 2020, 2, e348–e357.
Cohen-Shelly, M.; Attia, Z.I.; Friedman, P.A.; Ito, S.; Essayagh, B.A.; Ko, W.Y.; Murphree, D.H.; Michelena, H.I.; Enriquez-Sarano, M.; Carter, R.E.; et al. Electrocardiogram screening for aortic valve stenosis using artificial intelligence. Eur. Heart J. 2021, 42, 2885–2896.
Kwon, J.m.; Jo, Y.Y.; Lee, S.Y.; Kang, S.; Lim, S.Y.; Lee, M.S.; Kim, K.H. Artificial intelligence-enhanced smartwatch ECG for heart failure-reduced ejection fraction detection by generating 12-lead ECG. Diagnostics 2022, 12, 654.
Akbilgic, O.; Butler, L.; Karabayir, I.; Chang, P.P.; Kitzman, D.W.; Alonso, A.; Chen, L.Y.; Soliman, E.Z. ECG-AI: Electrocardiographic artificial intelligence model for prediction of heart failure. Eur. Heart J. Digit. Health 2021, 2, 626–634.
Al Younis, S.M.; Hadjileontiadis, L.J.; Khandoker, A.H.; Stefanini, C.; Soulaidopoulos, S.; Arsenos, P.; Doundoulakis, I.; Gatzoulis, K.A.; Tsioufis, K. Prediction of heart failure patients with distinct left ventricular ejection fraction levels using circadian ECG features and machine learning. PLoS ONE 2024, 19, e0302639.
Butler, J.; Anker, S.D.; Packer, M. Redefining heart failure with a reduced ejection fraction. JAMA 2019, 322, 1761–1762.
Adedinsewo, D.; Carter, R.E.; Attia, Z.; Johnson, P.; Kashou, A.H.; Dugan, J.L.; Albus, M.; Sheele, J.M.; Bellolio, F.; Friedman, P.A.; et al. Artificial intelligence-enabled ECG algorithm to identify patients with left ventricular systolic dysfunction presenting to the emergency department with dyspnea. Circ. Arrhythmia Electrophysiol. 2020, 13, e008437.
Kamel, H.; Okin, P.M.; Elkind, M.S.; Iadecola, C. Atrial fibrillation and mechanisms of stroke: Time for a new model. Stroke 2016, 47, 895–900.
Khurshid, S.; Friedman, S.; Reeder, C.; Di Achille, P.; Diamant, N.; Singh, P.; Harrington, L.X.; Wang, X.; Al-Alusi, M.A.; Sarma, G.; et al. ECG-based deep learning and clinical risk factors to predict atrial fibrillation. Circulation 2022, 145, 122–133.
Srinivasan, N.T.; Schilling, R.J. Sudden cardiac death and arrhythmias. Arrhythmia Electrophysiol. Rev. 2018, 7, 111.
Centeno-Bautista, M.A.; Perez-Sanchez, A.V.; Amezquita-Sanchez, J.P.; Valtierra-Rodriguez, M. Sudden cardiac death prediction based on the complete ensemble empirical mode decomposition method and a machine learning strategy by using ECG signals. Measurement 2024, 236, 115052.
Chorney, W.; Ling, S.H. Federated Learning Strategies for Atrial Fibrillation Detection. J. Exp. Theor. Anal. 2025, 3, 23.
Moody, G.B.; Mark, R.G. The impact of the MIT-BIH arrhythmia database. IEEE Eng. Med. Biol. Mag. 2001, 20, 45–50.
Clifford, G.D.; Liu, C.; Moody, B.; Li-wei, H.L.; Silva, I.; Li, Q.; Johnson, A.; Mark, R.G. AF classification from a short single lead ECG recording: The PhysioNet/computing in cardiology challenge 2017. In Proceedings of the 2017 Computing in Cardiology (CinC); IEEE: Piscataway, NJ, USA, 2017; pp. 1–4.
Wagner, P.; Strodthoff, N.; Bousseljot, R.D.; Kreiseler, D.; Lunze, F.I.; Samek, W.; Schaeffter, T. PTB-XL, a large publicly available electrocardiography dataset. Sci. Data 2020, 7, 154.
Ye, M.; Chen, Y.; Huang, W.; Cai, H.; Cui, L. Towards Generalization Fairness in Federated Learning. IEEE Trans. Mob. Comput. 2025, 25, 2546–2560.
Rijnbeek, P.R.; Kors, J.A.; Witsenburg, M. Minimum bandwidth requirements for recording of pediatric electrocardiograms. Circulation 2001, 104, 3087–3090.
Kwon, O.; Jeong, J.; Kim, H.B.; Kwon, I.H.; Park, S.Y.; Kim, J.E.; Choi, Y. Electrocardiogram sampling frequency range acceptable for heart rate variability analysis. Healthc. Inform. Res. 2018, 24, 198–206.
Creasy, S.; Alexeenko, V.; Lip, G.Y.; Tse, G.; Aston, P.J.; Jeevaratnam, K. Electrocardiogram sampling frequency for the optimal performance of complexity analysis and machine learning models: Discrimination between patients with and without paroxysmal atrial fibrillation using sinus rhythm electrocardiograms. Heart Rhythm O2 2025, 6, 48–57.
Baumert, M.; Schmidt, M.; Zaunseder, S.; Porta, A. Effects of ECG sampling rate on QT interval variability measurement. Biomed. Signal Process. Control 2016, 25, 159–164.
Cosio, F.G. Atrial flutter, typical and atypical: A review. Arrhythmia Electrophysiol. Rev. 2017, 6, 55.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2016; pp. 770–778.
Jing, E.; Zhang, H.; Li, Z.; Liu, Y.; Ji, Z.; Ganchev, I. ECG heartbeat classification based on an improved ResNet-18 model. Comput. Math. Methods Med. 2021, 2021, 6649970.
Han, C.; Shi, L. ML–ResNet: A novel network to detect and locate myocardial infarction using 12 leads ECG. Comput. Methods Programs Biomed. 2020, 185, 105138.
Ghosh, A.; Chung, J.; Yin, D.; Ramchandran, K. An efficient framework for clustered federated learning. Adv. Neural Inf. Process. Syst. 2020, 33, 19586–19597.
Chorney, W.W. Vertical Federated Learning Using Autoencoders with Applications in Electrocardiograms; Mississippi State University: Mississippi State, MS, USA, 2023.
Agrawal, S.; Sarkar, S.; Alazab, M.; Maddikunta, P.K.R.; Gadekallu, T.R.; Pham, Q.V. Genetic CFL: Hyperparameter optimization in clustered federated learning. Comput. Intell. Neurosci. 2021, 2021, 7156420.
Zhao, Z. Transforming ECG diagnosis: An in-depth review of transformer-based deeplearning models in cardiovascular disease detection. arXiv 2023, arXiv:2306.01249.
Beutel, D.J.; Topal, T.; Mathur, A.; Qiu, X.; Fernandez-Marques, J.; Gao, Y.; Sani, L.; Li, K.H.; Parcollet, T.; de Gusmão, P.P.B.; et al. Flower: A friendly federated learning research framework. arXiv 2020, arXiv:2007.14390.
Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision; IEEE: Piscataway, NJ, USA, 2017; pp. 618–626.
Adasuriya, G.; Haldar, S. Next generation ECG: The impact of artificial intelligence and machine learning. Curr. Cardiovasc. Risk Rep. 2023, 17, 143–154.
Nariman, G.S.; Hamarashid, H.K. Communication overhead reduction in federated learning: A review. Int. J. Data Sci. Anal. 2025, 19, 185–216.
Goffredo, G.; Correale, M.; Manfredi, D.; La Cecilia, G.; Ruggiero, A.; Ieva, R.; Brunetti, N.D. Acute coronary syndrome with single-lead ST segment elevation: A rare presentation of multivessel coronary artery disease. J. Electrocardiol. 2024, 82, 80–82.
Winter, E. The shapley value. Handb. Game Theory Econ. Appl. 2002, 3, 2025–2054.
Singh, A.R.; Singh, S.A. Diseases of poverty and lifestyle, well-being and human development. Mens Sana Monogr. 2008, 6, 187.
Quinn, T.P.; Jacobs, S.; Senadeera, M.; Le, V.; Coghlan, S. The three ghosts of medical AI: Can the black-box present deliver? Artif. Intell. Med. 2022, 124, 102158.
Li, T.; Sahu, A.K.; Zaheer, M.; Sanjabi, M.; Talwalkar, A.; Smith, V. Federated optimization in heterogeneous networks. Proc. Mach. Learn. Syst. 2020, 2, 429–450.
Wang, J.; Liu, Q.; Liang, H.; Joshi, G.; Poor, H.V. Tackling the objective inconsistency problem in heterogeneous federated optimization. Adv. Neural Inf. Process. Syst. 2020, 33, 7611–7623.
Karimireddy, S.P.; Kale, S.; Mohri, M.; Reddi, S.; Stich, S.; Suresh, A.T. Scaffold: Stochastic controlled averaging for federated learning. In Proceedings of the International Conference on Machine Learning; PMLR: Cambridge, MA, USA, 2020; pp. 5132–5143.
Li, Q.; He, B.; Song, D. Model-contrastive federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2021; pp. 10713–10722.
Li, X.; Huang, K.; Yang, W.; Wang, S.; Zhang, Z. On the convergence of fedavg on non-iid data. arXiv 2019, arXiv:1907.02189.
Wang, C.; Deng, C.; Wang, S. Imbalance-XGBoost: Leveraging weighted and focal losses for binary label-imbalanced classification with XGBoost. Pattern Recognit. Lett. 2020, 136, 190–197.
Li, Y.; Chang, T.H.; Chi, C.Y. Secure federated averaging algorithm with differential privacy. In Proceedings of the 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP); IEEE: Piscataway, NJ, USA, 2020; pp. 1–6.
Fazli Khojir, H.; Alhadidi, D.; Rouhani, S.; Mohammed, N. FedShare: Secure aggregation based on additive secret sharing in federated learning. In Proceedings of the 27th International Database Engineered Applications Symposium; Association for Computing Machinery: New York, NY, USA, 2023; pp. 25–33.
Tang, Y.; Wang, K. FPPFL: FedAVG-based privacy-preserving federated learning. In Proceedings of the 2023 15th International Conference on Computer Modeling and Simulation; Association for Computing Machinery: New York, NY, USA, 2023; pp. 51–56.

Figure: The contrast between commonly made assumptions in ML in healthcare versus realistic constraints.
Figure: An overview of the method used to ensure ML methods can be used across disparate client data.
Figure: A schematic representing the ResNet blocks used in the proposed model.
Figure: The SA-ResNet model. A ResNet model with four blocks, as well as an average pooling across length dimensions that efficiently differentiates different sites, is used.
Figure: Confidence intervals for (a) accuracy, (b) precision, (c) recall, and (d) F1 score for the proposed model versus other approaches.
Figure: t-SNE for the model, indicating the latent space representation of the trained model.
Figure: One-dimensional Grad-CAM for a client-side ECG.

Table: Recent studies in automated ECG diagnosis.
| Year | Objective | Method | Dataset | ECG Lead | Diseases | Accuracy | Strength | Weakness | Reference |
|---|---|---|---|---|---|---|---|---|---|

Recent studies using federated learning to train ECG classifiers.

| Year | Privacy | Multiple Datasets | Different Numbers of Leads | Different Sampling Rates | Different Classes | Reference |
|---|---|---|---|---|---|---|
| 2021 | ✓ | ✓ | ✗ | ✗ | ✗ | [19] |
| 2022 | ✓ | ✗ | ✗ | ✗ | ✗ | [20] |
| 2022 | ✓ | ✓ | ✗ | ✓ | ✗ | [21] |
| 2022 | ✓ | ✓ | ✗ | ✗ | ✗ | [22] |
| 2022 | ✓ | ✗ | ✗ | ✓ | ✗ | [23] |
| 2022 | ✓ | ✓ | ✓ | ✗ | ✗ | [24] |
| 2023 | ✓ | ✗ | ✗ | ✗ | ✗ | [25] |
| 2024 | ✓ | ✓ | ✓ | ✓ | ✓ | [26] |
| 2025 | ✓ | ✓ | ✗ | ✗ | ✓ | [49] |
| 2026 | ✓ | ✓ | ✓ | ✓ | ✓ | This Study |

Overview of the distribution of data between clients.

| Client | Train Samples | Test Samples | Leads | Pathologies | Sampling Rate (Hz) |
|---|---|---|---|---|---|
| mitbih_00000 | 530 | 221 | 2 | Bundle Branch Block, Normal | 100 |
| mitbih_00001 | 626 | 261 | 2 | Bundle Branch Block, Other | 250 |
| mitbih_00002 | 3253 | 1356 | 2 | Normal, Premature Contraction | 360 |
| cinc_00000 | 1932 | 793 | 1 | Atrial Fibrillation, Normal, Other | 250 |
| cinc_00001 | 1855 | 754 | 1 | Noisy, Normal, Other | 360 |
| ptbxl_00000 | 7742 | 3227 | 12 | Normal, Unhealthy | 100 |
| ptbxl_00001 | 2738 | 1142 | 12 | Normal, Unhealthy | 500 |

Results of the proposed method, compared with the best results obtained by other methods. Results are reported as means ± standard deviations, with the best performance across algorithms bolded.

| Algorithm | Accuracy (%) | Precision | Recall | F1 Score |
|---|---|---|---|---|
| FedAvg | 71.5 ± 0.1 | 0.581 ± 0.000 | 0.715 ± 0.001 | 0.641 ± 0.001 |
| Genetic CFL | 51.5 ± 2.4 | 0.314 ± 0.099 | 0.515 ± 0.024 | 0.390 ± 0.067 |
| FedCHT | 73.8 ± 0.1 | 0.683 ± 0.017 | 0.738 ± 0.001 | 0.709 ± 0.009 |
| SA-ResNet | **76.3 ± 0.1** | **0.695 ± 0.026** | **0.768 ± 0.001** | **0.730 ± 0.015** |

Ablation study comparing SA-ResNet to both a ResNet and a global average pooling module with a multilayer perceptron. The best performance across models is bolded.

| Model | Accuracy (%) | Precision | Recall | F1 Score |
|---|---|---|---|---|
| Site-Aware Branch | 53.1 ± 0.2 | 0.363 ± 0.044 | 0.531 ± 0.002 | 0.414 ± 0.38 |
| ResNet | 71.7 ± 0.1 | 0.613 ± 0.019 | 0.717 ± 0.001 | 0.656 ± 0.009 |
| SA-ResNet | **76.3 ± 0.1** | **0.695 ± 0.026** | **0.768 ± 0.001** | **0.730 ± 0.015** |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
