Publication | GIOVANNI CICCERI | Università degli Studi di Palermo

HERALD: A Hybrid distributEd leaRning incrementAL & feDerated solution for knowledge distillation in COVID-19 classification

Authors: Tricomi, G.; Cicceri, G.; Ficili, I.; Vitabile, S.; Merlino, G.; Puliafito, A.
Publication year: 2026
Type: Journal article
OA Link: http://hdl.handle.net/10447/686851

Abstract

The COVID-19 pandemic has posed unprecedented challenges to diagnostic accuracy and timeliness, further complicated by the emergence of viral variants that, by damaging patients’ bodies in different ways, altered diagnostic imaging patterns and induced data non-stationarity. This has made it difficult to prevent and treat viral infections that can mutate rapidly and be highly contagious. In emergency rooms and infection-detection facilities, these challenges highlight the urgency of designing machine learning approaches that can identify infections in patients while minimizing the need for direct physician intervention. Furthermore, in such environments, privacy-preserving constraints pose a significant barrier to inter-hospital coordination and collaboration, limiting the potential for centralized model training and data sharing. To overcome these barriers, this work introduces a solution called HERALD (Hybrid distributEd leaRning incrementAL & feDerated) that combines the Incremental and Federated Learning paradigms to obtain a double benefit: continual adaptation of the model to changed input conditions in the diagnostic analysis (i.e., drift of the diagnostic sensors or a virus mutation) and privacy-preserving knowledge dissemination among hospitals, which facilitates collaborative learning while maintaining data confidentiality. The proposed solution is meant to mitigate the impact of Catastrophic Forgetting, a common challenge in Continual Learning and Knowledge Distillation-based approaches. Evidence of the effectiveness of HERALD is provided by its application to chest X-ray images from COVID-19 patients and healthy individuals. Experimental evaluations were conducted with a lightweight custom CNN architecture on four public COVID-19 radiographic datasets partitioned among three simulated hospital organizations over twelve training batches.
Quantitative results show consistent improvements in model performance across the twelve training batches, with final accuracy exceeding 92% and recall exceeding 93% at several hospital sites. These results highlight the ability of HERALD to balance local adaptability with global knowledge sharing while preserving data privacy. The proposed solution demonstrates measurable benefits in terms of continual performance optimization, robustness to data variability, and mitigation of Catastrophic Forgetting in decentralized clinical settings.
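
The abstract names two building blocks, knowledge distillation and federated aggregation across hospitals, without giving their formulas. The sketch below illustrates one common form of each, purely as an assumption: a temperature-softened KL-divergence distillation loss and a sample-size-weighted parameter average in the style of FedAvg. The function names, the temperature value, and the aggregation rule are hypothetical and are not taken from the HERALD paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumed form of the distillation term; the paper may use a different loss.
    """
    p = softmax(teacher_logits, temperature)  # previous (teacher) model
    q = softmax(student_logits, temperature)  # model being updated (student)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def federated_average(site_weights, site_sizes):
    """Sample-size-weighted average of per-site parameter vectors.

    Assumed FedAvg-style rule; only parameters are shared, never patient data.
    """
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[i] * n for w, n in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical example: three simulated hospitals share only their parameters.
hospital_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
hospital_sizes = [1, 1, 2]  # number of local samples per hospital
global_weights = federated_average(hospital_weights, hospital_sizes)

# Penalty for a student whose predictions diverge from the previous model's.
penalty = distillation_loss([2.0, 0.5], [0.5, 2.0])
```

In an incremental setting, a term like `distillation_loss` would typically be added to the local classification loss on each new batch, so that the updated model stays close to the predictions of its previous version while hospitals exchange only aggregated parameters.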