Dynamic Hyper-Parameter Adjustment in FedProx for Improving Performance in Multi-Task Federated Learning Systems

Kaili Wang

doi:10.54097/bje0ns57

Keywords:

FedSC-MTL Algorithm, Feature-Based Distillation, Communication Efficiency.

Abstract

This paper presents FedSC-MTL, a unified framework for multi-task federated learning that combines dynamic regularization with dual-control stabilization to address data heterogeneity, task divergence, and knowledge misalignment. Existing methods such as FedProx and SCAFFOLD target data heterogeneity and client drift, but they do not effectively handle task-level objectives or knowledge transfer across clients. To overcome this, FedSC-MTL introduces a dual-control mechanism: the control variable 'c' stabilizes parameter updates and mitigates client drift, while the feature-level distillation control 'd' enables efficient knowledge transfer. By distilling at the feature level, FedSC-MTL reduces communication costs relative to logits-level distillation methods such as FedICT and FedDyn without sacrificing task-level generalization. The approach is evaluated on multi-task federated benchmarks, including the EMNIST, CIFAR-10, and Fashion-MNIST datasets. Experimental results show that FedSC-MTL outperforms existing methods in accuracy, convergence speed, and communication efficiency. Specifically, feature-level distillation significantly improves cross-task generalization, and dual-control stabilization improves the stability and convergence of training. These findings suggest that FedSC-MTL is an effective, scalable, and efficient solution for multi-task federated learning in heterogeneous environments.
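The abstract does not state update equations, so the following is only a minimal sketch of how the named ingredients could fit together in one local client step. It assumes a FedProx-style proximal term with coefficient mu, a SCAFFOLD-style correction using local and global control variates, a mean-squared-error loss for feature-level distillation, and a simple loss-driven rule for adjusting mu. All function names, the doubling and halving factors, and the bounds on mu are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import List

Vector = List[float]


def corrected_gradient(grad: Vector, w: Vector, w_global: Vector,
                       c_local: Vector, c_global: Vector, mu: float) -> Vector:
    """Task gradient plus a proximal pull toward the global model (FedProx-style)
    and a control-variate drift correction (SCAFFOLD-style: g - c_i + c)."""
    return [
        g + mu * (wi - wg) - cl + cg
        for g, wi, wg, cl, cg in zip(grad, w, w_global, c_local, c_global)
    ]


def feature_distillation_loss(local_feats: Vector, reference_feats: Vector) -> float:
    """Mean squared error between local features and shared reference features.
    (Assumed form of the feature-level distillation signal.)"""
    n = len(local_feats)
    return sum((a - b) ** 2 for a, b in zip(local_feats, reference_feats)) / n


def adjust_mu(mu: float, prev_loss: float, curr_loss: float,
              up: float = 2.0, down: float = 0.5,
              mu_min: float = 1e-4, mu_max: float = 10.0) -> float:
    """Hypothetical dynamic rule: raise mu when the loss went up, lower it
    otherwise, and clip the result to [mu_min, mu_max]."""
    factor = up if curr_loss > prev_loss else down
    return min(mu_max, max(mu_min, mu * factor))


def sgd_step(w: Vector, grad: Vector, lr: float) -> Vector:
    """Plain gradient-descent update on the local weights."""
    return [wi - lr * g for wi, g in zip(w, grad)]
```

In this sketch a client would call `corrected_gradient` on each mini-batch gradient, apply `sgd_step`, and call `adjust_mu` once per round from the observed loss; how the distillation loss is weighted against the task loss is left open because the abstract does not specify it.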

References

[1] Li, T. et al.: 'Federated optimization in heterogeneous networks'. Proc. MLSys, 2020, vol. 2, pp. 429–450.

[2] Karimireddy, S.P. et al.: 'SCAFFOLD: Stochastic controlled averaging for federated learning'. Report, arXiv:1910.06378, 2021.

[3] Wen, J. et al.: 'FedICT: Federated multi-task distillation'. Report, arXiv:2301.00389, 2023.

[4] Jin, C., Chen, X., Gu, Y., Li, Q.: 'FedDyn: A dynamic and efficient federated distillation approach on Recommender System'. Proc. Int. Conf. Parallel and Distributed Systems (ICPADS), 2023.

[5] Gao, L. et al.: 'Federated learning with non-IID data via local drift decoupling'. Report, arXiv:2203.11751, 2022.

[6] Li, Q. et al.: 'Model-contrastive federated learning'. Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2021.

[7] He, C. et al.: 'Group knowledge transfer'. Report, arXiv:2007.14513, 2020.

[8] Li, X., Huang, K., Yang, W., Wang, S., Zhang, Z.: 'On the Convergence of FedAvg on Non-IID Data'. Report, arXiv:1907.02189, 2019.

[9] Cohen, G., Afshar, S., Tapson, J., van Schaik, A.: 'EMNIST: Extending MNIST to handwritten letters'. Proc. Int. Joint Conf. Neural Netw. (IJCNN), 2017, pp. 2921–2926.

[10] Krizhevsky, A., Hinton, G.: 'Learning multiple layers of features from tiny images'. Report, University of Toronto, 2009.

[11] Xiao, H., Rasul, K., Vollgraf, R.: 'Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms'. Report, arXiv:1708.07747, 2017.

[12] Mills, J., Hu, J., Min, G.: 'Multi-Task Federated Learning for Personalised Deep Neural Networks in Edge Computing'. IEEE Trans. Parallel Distrib. Syst., 2021.

[13] Wang, W., Li, H., Ding, Z., Nie, F., Chen, J., Dong, X.: 'Rethinking Maximum Mean Discrepancy for Visual Domain Adaptation'. IEEE Trans. Neural Netw. Learn. Syst., 2021.

[14] Xia, P., Zhang, L., Li, F.: 'Learning similarity with cosine similarity ensemble'. Information Sciences, 2015.