TransFed: A way to epitomize Focal Modulation using Transformer-based Federated Learning

WACV 2024
¹Indian Institute of Technology Delhi, ²National Institute of Technology Delhi,
Deep Breath

Abstract

Federated learning has emerged as a promising paradigm for collaborative machine learning, enabling multiple clients to jointly train a model while preserving data privacy. Tailored federated learning takes this concept further by accommodating client heterogeneity and facilitating the learning of personalized models. While the use of transformers within federated learning has attracted significant interest, the effects of federated learning algorithms on the latest focal modulation-based transformers remain largely unexplored. In this paper, we investigate this relationship and uncover the detrimental effects of the federated averaging (FedAvg) algorithm on focal modulation, particularly in scenarios with heterogeneous data. To address this challenge, we propose TransFed, a novel transformer-based federated learning framework that not only aggregates model parameters but also learns tailored focal modulation for each client. Instead of employing a conventional customization mechanism that maintains client-specific focal modulation layers locally, we introduce a learn-to-tailor approach that fosters client collaboration, enhancing scalability and adaptation in TransFed. Our method incorporates a hypernetwork on the server that learns personalized projection matrices for the focal modulation layers, enabling the generation of client-specific queries, keys, and values. Furthermore, we provide an analysis of adaptation bounds for TransFed under the learn-to-tailor mechanism. Through extensive experiments on pneumonia classification datasets, we demonstrate that TransFed with the learn-to-tailor approach achieves superior performance in scenarios with non-IID data distributions, surpassing existing methods. Overall, TransFed paves the way for leveraging focal modulation in federated learning, advancing the capabilities of focal modulation-based transformer models in decentralized environments.

Trans-Fed Architecture

We introduce TransFed, an innovative federated learning framework built upon the focal modulation architecture. TransFed directly addresses the limitations of FedAvg when applied to focal modulation in heterogeneous data scenarios. By enabling the customization of focal modulation for individual clients, TransFed significantly improves performance in the tailored federated learning setting. Our proposal introduces a learn-to-tailor concept that better exploits client cooperation in the tailored layers, improving the scalability and adaptation capabilities of TransFed; a sketch of the tailored layer follows below.
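To make the learn-to-tailor idea concrete, the following minimal PyTorch sketch shows a focal modulation block whose input and output projection weights are passed in (for example, produced by a server-side hypernetwork) rather than stored as local layer parameters. The class name ClientFocalModulation, the hidden size, the number of focal levels, and the exact projection split are illustrative assumptions, not the paper's exact configuration.

# Minimal sketch of a tailored focal modulation block: the projection weights
# are supplied per client instead of being fixed, locally stored parameters.
# Layer sizes, focal levels, and the projection split are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClientFocalModulation(nn.Module):
    def __init__(self, dim: int, focal_levels: int = 3, kernel_size: int = 3):
        super().__init__()
        self.dim = dim
        self.focal_levels = focal_levels
        # Shared depthwise convolutions that build the hierarchical context.
        self.context_convs = nn.ModuleList([
            nn.Conv2d(dim, dim, kernel_size, padding=kernel_size // 2, groups=dim)
            for _ in range(focal_levels)
        ])
        self.act = nn.GELU()

    def forward(self, x, proj_in_w, proj_out_w, proj_in_b=None, proj_out_b=None):
        # x: (B, H*W, C); projection weights are client-specific tensors.
        B, N, C = x.shape
        H = W = int(N ** 0.5)  # assumes square feature maps for simplicity
        # Client-specific input projection -> query, context, per-level gates.
        y = F.linear(x, proj_in_w, proj_in_b)                # (B, N, 2C + L + 1)
        q, ctx, gates = torch.split(y, [C, C, self.focal_levels + 1], dim=-1)
        ctx = ctx.transpose(1, 2).reshape(B, C, H, W)
        gates = gates.transpose(1, 2).reshape(B, self.focal_levels + 1, H, W)
        # Hierarchical contextualization, gated per focal level.
        modulator = 0
        for level, conv in enumerate(self.context_convs):
            ctx = self.act(conv(ctx))
            modulator = modulator + ctx * gates[:, level:level + 1]
        modulator = modulator + ctx.mean(dim=(2, 3), keepdim=True) * gates[:, -1:]
        modulator = modulator.flatten(2).transpose(1, 2)     # (B, N, C)
        # Modulate the query, then apply the client-specific output projection.
        return F.linear(q * modulator, proj_out_w, proj_out_b)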

TransFed Algorithm
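The paper's full algorithm is not reproduced here; the sketch below illustrates one communication round under the assumptions of the block above: a server-side hypernetwork maps a learned client embedding to that client's focal-modulation projection weights, each client trains locally with its personalized projections, and the remaining parameters are aggregated FedAvg-style. ProjectionHyperNetwork, federated_round, the optimizer settings, and the server_model interface are hypothetical names introduced for illustration, not the paper's exact procedure.

# Hedged sketch of one TransFed-style communication round; the hypernetwork
# update driven by the clients' personalized-parameter changes is omitted.
import copy
import torch
import torch.nn as nn

class ProjectionHyperNetwork(nn.Module):
    def __init__(self, num_clients: int, embed_dim: int, dim: int, focal_levels: int = 3):
        super().__init__()
        self.dim = dim
        self.in_size = dim * (2 * dim + focal_levels + 1)    # flattened input projection
        self.client_embed = nn.Embedding(num_clients, embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.ReLU(),
            nn.Linear(256, self.in_size + dim * dim),
        )

    def forward(self, client_id: torch.Tensor):
        # Map a learned client embedding to that client's projection matrices.
        flat = self.mlp(self.client_embed(client_id)).squeeze(0)
        proj_in_w = flat[: self.in_size].view(-1, self.dim)   # (2C + L + 1, C)
        proj_out_w = flat[self.in_size:].view(self.dim, self.dim)
        return proj_in_w, proj_out_w

def federated_round(server_model, hypernet, clients, local_steps=1, lr=1e-3):
    """One round: personalize, train locally, then average the shared weights."""
    updated_states = []
    for cid, loader in clients.items():
        local = copy.deepcopy(server_model)
        proj_in_w, proj_out_w = hypernet(torch.tensor([cid]))
        # Detached for the local steps; the server-side hypernetwork update
        # is not shown in this sketch.
        proj_in_w, proj_out_w = proj_in_w.detach(), proj_out_w.detach()
        opt = torch.optim.SGD(local.parameters(), lr=lr)
        for _, (x, y) in zip(range(local_steps), loader):
            loss = nn.functional.cross_entropy(local(x, proj_in_w, proj_out_w), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        updated_states.append(local.state_dict())
    # FedAvg-style aggregation of the shared (non-personalized) parameters.
    avg = {k: torch.stack([s[k] for s in updated_states]).mean(0)
           for k in updated_states[0]}
    server_model.load_state_dict(avg)

Generating the projection weights on the server from client embeddings, rather than keeping client-specific layers on each device, is what lets clients benefit from one another's updates while still receiving personalized focal modulation.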

Quantitative Results

Table 2 reports the average test accuracy of TransFed alongside several transformer-based approaches under different non-IID scenarios.

Ablation Study


BibTeX

@inproceedings{ashraf2024transfed,
  title={TransFed: A way to epitomize Focal Modulation using Transformer-based Federated Learning},
  author={Ashraf, Tajamul and Bin Afzal Mir, Fuzayil and Gillani, Iqra Altaf},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={554--563},
  year={2024}
}