MedQA-FoRA-MultiHospital: A Non-IID Multihospital Benchmark and Adaptive Federated Low-Rank Framework for Privacy-Preserving Medical Question Answering and Clinical Report Generation

Zhen Xiong1, Jun Li2
1School of Computer Science and Engineering, Beihang University, China
2School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
DOI: https://doi.org/10.71448/bcds2671-5
Published: 31/03/2026
Cite this article as: Zhen Xiong, Jun Li. MedQA-FoRA-MultiHospital: A Non-IID Multihospital Benchmark and Adaptive Federated Low-Rank Framework for Privacy-Preserving Medical Question Answering and Clinical Report Generation. Bulletin of Computer and Data Sciences, Volume 7 Issue 1. Page: 54-78.

Abstract

Large language models are increasingly attractive for healthcare question answering, longitudinal note generation, discharge communication, and specialty-specific decision support, but their direct deployment in hospitals remains constrained by privacy regulation, compute limitations, heterogeneous local data, and the risk of destructive domain adaptation. This paper introduces MedQA-FoRA-MultiHospital, a new multihospital benchmark designed for privacy-preserving adaptation of medical language models under realistic non-identically distributed client partitions. The benchmark contains 50,000 medical question–answer pairs and 15,000 clinical report-generation instances distributed across five simulated hospital clients with distinct specialty profiles: cardiology, respiratory medicine, neurology, general medicine, and pediatrics. Building on the micro–meso–macro philosophy of efficient federated low-rank fine-tuning, we formulate Adaptive FoRA, a heterogeneity-aware extension that preserves the frozen backbone, inserts structured low-rank operators across all major linear maps, and aggregates client updates through divergence-sensitive weighting. Rather than inventing unverified empirical scores, this manuscript contributes a complete benchmark specification, a mathematically explicit training framework, communication and parameter analyses, a privacy threat model, a full experimental protocol, and a reproducibility package structure suitable for subsequent leaderboard development. The paper is intentionally written as a stand-alone benchmark-and-methodology manuscript: it defines the dataset, the task suite, the optimization design, the ablation roadmap, and the evaluation criteria required for rigorous future implementation.
We argue that the proposed benchmark fills an important gap between generic medical instruction datasets and privacy-preserving federated adaptation studies by unifying question answering, structured reasoning, and clinical report generation within a single non-IID multihospital setting.

Keywords: federated learning; large language models; medical question answering; clinical report generation; parameter-efficient fine-tuning; non-IID learning; low-rank adaptation; quantization


Publication history

Copyright © 2026 Zhen Xiong, Jun Li. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.