MedQA-FoRA-MultiHospital: A Non-IID Multihospital Benchmark and Adaptive Federated Low-Rank Framework for Privacy-Preserving Medical Question Answering and Clinical Report Generation

Zhen Xiong1, Jun Li2
1School of Computer Science and Engineering, Beihang University, China
2School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
DOI: https://doi.org/10.71448/bcds2671-5
Published: 31/03/2026
Cite this article as: Zhen Xiong, Jun Li. MedQA-FoRA-MultiHospital: A Non-IID Multihospital Benchmark and Adaptive Federated Low-Rank Framework for Privacy-Preserving Medical Question Answering and Clinical Report Generation. Bulletin of Computer and Data Sciences, Volume 7 Issue 1. Page: 54-78.

Abstract

Large language models are increasingly attractive for healthcare question answering, longitudinal note generation, discharge communication, and specialty-specific decision support, but their direct deployment in hospitals remains constrained by privacy regulation, compute limitations, heterogeneous local data, and the risk of destructive domain adaptation. This paper introduces MedQA-FoRA-MultiHospital, a new multihospital benchmark designed for privacy-preserving adaptation of medical language models under realistic non-identically distributed client partitions. The benchmark contains 50,000 medical question–answer pairs and 15,000 clinical report-generation instances distributed across five simulated hospital clients with distinct specialty profiles: cardiology, respiratory medicine, neurology, general medicine, and pediatrics. Building on the micro–meso–macro philosophy of efficient federated low-rank fine-tuning, we formulate Adaptive FoRA, a heterogeneity-aware extension that preserves the frozen backbone, inserts structured low-rank operators across all major linear maps, and aggregates client updates through divergence-sensitive weighting. Rather than inventing unverified empirical scores, this manuscript contributes a complete benchmark specification, a mathematically explicit training framework, communication and parameter analyses, a privacy threat model, a full experimental protocol, and a reproducibility package structure suitable for subsequent leaderboard development. The paper is intentionally written as a stand-alone benchmark-and-methodology manuscript: it defines the dataset, the task suite, the optimization design, the ablation roadmap, and the evaluation criteria required for rigorous future implementation.
We argue that the proposed benchmark fills an important gap between generic medical instruction datasets and privacy-preserving federated adaptation studies by unifying question answering, structured reasoning, and clinical report generation within a single non-IID multihospital setting.

Keywords: federated learning; large language models; medical question answering; clinical report generation; parameter-efficient fine-tuning; non-IID learning; low-rank adaptation; quantization


Publication history

Copyright © 2026 Zhen Xiong, Jun Li. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.