Collaborative Machine Learning without Centralized Training Data for Federated Learning
Abstract
Federated learning is a promising approach for collaboratively training machine learning models while keeping the training data decentralized. Rather than pooling private data, each client computes an update to the current global model from its local data, and only this update is sent to a central server for aggregation. This paradigm is appealing for privacy-sensitive applications because it avoids the risks associated with centralized data storage. However, federated learning faces several challenges that traditional centralized machine learning does not. Data, models, and objectives differ across clients, and this heterogeneity can produce conflicting updates and slow the convergence of the global model. Communication efficiency is also critical, since clients typically have unreliable and relatively slow network connections. This paper discusses recent advances and open problems in federated learning, focusing on these two challenges. Recent work has proposed strategies to improve communication efficiency, such as model compression and selective client participation. Other research addresses heterogeneity, for example by allowing clients to train customized models and share them with the federation.
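To make the training loop described in the abstract concrete, the following is a minimal sketch of one style of federated round, in the spirit of federated averaging. It is an illustration under stated assumptions, not the method of any particular system: the one-parameter linear model, the function names `local_update` and `federated_round`, the `fraction` parameter for selective client participation, and the toy client data are all hypothetical.

```python
import random


def local_update(global_w, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one client's private data.

    The model is a single weight w fitted to y ~ w * x (an assumption made
    for brevity). Only the resulting update (delta) is returned; the raw
    data never leaves this function.
    """
    w = global_w
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w - global_w


def federated_round(global_w, client_datasets, fraction=1.0, rng=random):
    """One communication round: sample clients, collect updates, aggregate.

    `fraction` controls selective client participation. Updates are averaged
    with weights proportional to each selected client's number of examples.
    """
    k = max(1, int(fraction * len(client_datasets)))
    selected = rng.sample(client_datasets, k)
    total = sum(len(d) for d in selected)
    aggregated = sum(
        (len(d) / total) * local_update(global_w, d) for d in selected
    )
    return global_w + aggregated


# Hypothetical clients with different amounts and ranges of data,
# all generated from the same underlying relation y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0)],
    [(3.0, 6.0), (1.0, 2.0), (2.0, 4.0)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
print(f"global weight after 20 rounds: {w:.6f}")
```

Passing a value such as `fraction=0.5` makes the server contact only a subset of clients per round, which reduces communication at the cost of noisier aggregation. Real deployments would replace the scalar weight with a full parameter vector and typically add compression and secure aggregation, none of which is shown here.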