Federated learning (FL) is an emerging field at the intersection of distributed machine learning, deep learning, the Internet of Things (IoT), and privacy-preserving algorithms. In this framework, many devices (also called “nodes” or “clients”) collect data from their own part of the network and independently train the same ML model architecture.
Each node then uploads its model or model update to a server, which aggregates the contributions from all participating nodes into a single global model. The raw data is never shared or uploaded to the server, which safeguards data privacy. Typically, a node is a mobile phone, a sensor that collects data, or an edge device.
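The train-locally, aggregate-centrally loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production recipe: the linear model, the synthetic client data, the learning rate, and the function names `local_update` and `federated_average` are all hypothetical, and the server step is a sample-weighted average of client weights (the approach commonly called federated averaging).

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Train a linear model on one client's private data with gradient descent."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local sample count."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Hypothetical clients: each holds its own (X, y) and never sends it anywhere.
clients = []
for n in (50, 80, 120):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    clients.append((X, y))

global_w = np.zeros(2)
for _ in range(20):  # communication rounds
    # Each client starts from the current global model and trains locally.
    local_models = [local_update(global_w, X, y) for X, y in clients]
    # Only the trained weights (not the data) reach the server.
    global_w = federated_average(local_models, [len(y) for _, y in clients])
```

Only `local_models` crosses the network in this sketch; the `(X, y)` pairs stay inside the loop that stands in for each device.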
FL is gaining ground because of two factors. The first is the increased popularity of IoT devices. The second is the growing concern over data privacy. Massive volumes of data are being aggregated from mobile and edge devices.
Using this data and learning from it can help engineers design improved AI systems with an enhanced user experience. However, users don’t want to compromise their private information, and most are reluctant to have an AI system upload any of their data to a remote server.
In an FL framework, the data stays on the individual devices, where training is carried out. There is no need to upload private data to a cloud or central server. This reduces the risk of data leakage and alleviates privacy concerns, while still drawing on data from a wide diversity of users in different geographic locations. The approach also supports continual learning, in which the shared model is repeatedly updated and improved.
FL is still in its infancy, with emerging ideas still developing into real-life use cases and applications. Some possible applications that have shown promise include the following.
A single medical site such as a clinic, lab, or hospital does not generate sufficient data samples for training AI models. But sharing such data often requires permissions along with a risk of a possible data breach.
As described in a paper called “Federated Learning of Predictive Models from Federated Electronic Health Records,” researchers used health data from smartphones and hospitals for on-device training. Researchers then aggregated the locally trained models for predicting hospitalizations for patients with heart disease. No raw data was pooled or shared at any point.
Medical images are expensive and difficult to acquire. Along with privacy concerns, sharing these image files can be time-consuming because of their size. In the FL framework, the images can instead be used to train models on-site, and the resulting local models are combined into an aggregated system that benefits from a greater diversity of data. Federated learning for brain imaging, for example, uses locally stored MRI scans to train ML models. Another example is federated learning for COVID-19 chest X-ray images, where researchers used an FL framework to detect COVID-19 more reliably.
In the FL framework, each self-driving car acts as an edge device that continuously collects data while driving. The data collected by each car includes parameters such as real-time traffic and road conditions, along with telemetry (motion and mapping) data. The idea is to conduct training locally on the car itself and later send scheduled updates to a central server. Such a model supports continual learning and improves the real-time decision making of the self-driving car system.
Google is testing an improved query-suggestion system for its Android virtual keyboard (“Gboard”). In this case, the FL algorithm uses the on-device history to improve the keyboard’s search suggestions. Using a client/server architecture, the server assigns ML tasks to clients, which independently execute them and send the results back to the server. Researchers have addressed issues including network communication, phone battery usage, and ML model updates.
A computer-vision–based safety hazard warning application for smart cities, developed on an FL platform, shows great promise for future computer-vision applications. The object detection system is built by aggregating local object detection models, each trained on proprietary images owned by a different company. This system has improved operational efficiency while reducing costs and protecting privacy.
In the manufacturing sector, FL has been used to enhance robot welding quality without exposing data from different automotive factories. Similar to other use cases, FL has great potential for the manufacturing industry, where different manufacturing units can collaborate on developing aggregated ML models without compromising the privacy of proprietary data.
While an FL framework offers many advantages in terms of privacy protection, several challenges remain that have to be addressed before it can be broadly adopted.
One of the major hurdles in FL is the communication overhead of model transmission. With mobile or edge devices, sending data over the network can be costly and time-consuming. Instead of uploading the entire model, some systems pass on only the model updates. Also, compression algorithms can be used to reduce communication overhead by up to three orders of magnitude.
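To make the idea of sending a compressed update rather than a full model concrete, here is a minimal sketch of one simple compression scheme, top-k sparsification, in which a client transmits only the largest-magnitude entries of its update. The scheme, the function names `sparsify_top_k` and `densify`, the matrix size, and the 1% budget are all assumptions chosen for illustration; the text above does not say which compression algorithms achieve the quoted savings.

```python
import numpy as np

def sparsify_top_k(update, k):
    """Keep the k largest-magnitude entries; return their indices and values."""
    flat = update.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def densify(idx, values, shape):
    """Server side: rebuild a dense update, with zeros where nothing was sent."""
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[idx] = values
    return flat.reshape(shape)

rng = np.random.default_rng(0)
old_weights = rng.normal(size=(100, 100))
new_weights = old_weights + 0.01 * rng.normal(size=(100, 100))

update = new_weights - old_weights            # send the delta, not the full model
idx, values = sparsify_top_k(update, k=100)   # 1% of the 10,000 entries
restored = densify(idx, values, update.shape)

sent = idx.size + values.size                 # numbers transmitted (indices + values)
print(f"sent {sent} numbers instead of {update.size}")
```

The sketch trades accuracy for bandwidth: every entry outside the top k is treated as zero on the server, so the choice of k is a tuning decision rather than a fixed rule.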
In the FL framework, only the model or model update is shared with a central server. However, there is always a risk that the model could be reverse engineered to reconstruct the training data, which could then be linked to a specific individual or entity.
One solution is to use homomorphic encryption of transmitted model updates. With homomorphic encryption, the server can build the aggregated model directly from the encrypted updates without decrypting them, safeguarding the data behind the model. However, homomorphic encryption is computationally expensive, because encrypting every model parameter requires a large number of operations. More efficient methods are needed, and work on them is already under way.
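The property that matters here, adding values while they remain encrypted, can be shown with a toy version of the Paillier cryptosystem, an additively homomorphic scheme. This is a sketch under loud assumptions: the choice of Paillier, the tiny fixed primes, the fixed-point `SCALE`, the three hypothetical client updates, and the idea that a key holder separate from the server performs decryption are all illustrative. The key size is far too small to be secure.

```python
import math
import random

SCALE = 10**6  # fixed-point precision for encoding floats as integers

def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier keys from two known primes; far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, value):
    """Encrypt one float with the public modulus n."""
    m = round(value * SCALE) % n  # negatives wrap around modulo n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) * pow(r, n, n * n) % (n * n)

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)

def decrypt(n, private_key, c):
    """Recover the float behind a ciphertext using the private key."""
    lam, mu = private_key
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    if m > n // 2:  # undo the wrap-around for negatives
        m -= n
    return m / SCALE

n, private_key = keygen()

# Hypothetical updates from three clients (two parameters each).
updates = [[0.5, -1.25], [1.5, 0.25], [-0.5, 0.1]]
encrypted = [[encrypt(n, v) for v in u] for u in updates]

# Server: combine ciphertexts parameter by parameter without the private key.
summed = encrypted[0]
for enc in encrypted[1:]:
    summed = [add_encrypted(n, a, b) for a, b in zip(summed, enc)]

# Key holder: decrypt only the sum, then divide by the number of clients.
average = [decrypt(n, private_key, c) / len(updates) for c in summed]
```

Each parameter costs several big-integer modular exponentiations in this sketch, which gives a feel for why the text calls the approach expensive for models with millions of parameters.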
Heterogeneity of data refers to data diversity. The FL framework supports all types of devices, including mobile phones and IoT devices. The data collected from different sources has many variations within it. There are structural differences; for example, many features or attributes in data accumulated from one device may be missing in the data from another source.
Also, because of location differences, the local samples are bound to be unbalanced, skewed, and statistically different. The independent and identically distributed (i.i.d.) assumption that many ML algorithms rely on may not hold in these cases.
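One way to see what skewed local samples look like is to simulate them. The sketch below splits a labeled dataset across clients so that each class is divided according to a Dirichlet draw, a partitioning scheme often used to emulate label skew in experiments. The scheme itself, the function name `partition_by_label_skew`, the concentration parameter `alpha`, and the synthetic 3-class labels are assumptions for illustration only.

```python
import numpy as np

def partition_by_label_skew(labels, num_clients, alpha, rng):
    """Split sample indices across clients with Dirichlet-distributed class shares."""
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        shares = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn the shares into cut points within this class's samples.
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(ix, dtype=int) for ix in client_indices]

rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=600)  # hypothetical 3-class dataset
parts = partition_by_label_skew(labels, num_clients=4, alpha=0.3, rng=rng)

for client, idx in enumerate(parts):
    counts = np.bincount(labels[idx], minlength=3)
    print(f"client {client}: class counts {counts.tolist()}")
```

Smaller values of `alpha` concentrate each class on fewer clients, so the per-client class counts drift further from the global distribution; large values approach an even split.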
Edge devices present many training constraints. For example, local model training can cause mobile phone batteries to drain more quickly, and IoT devices may not have enough computing power to build or update an ML model. Hence, implementing a real-time FL system that continuously updates itself from smaller, resource-constrained devices is a big challenge in itself.
Client security is another major hurdle in implementing FL in a real-life scenario. The nodes can be subject to malicious attack, where an intruder can change the parameters of the local ML model. Such an attack can influence the final aggregated model being constructed at the server end.
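A small numerical sketch shows how much leverage a single tampered client can have when the server simply averages updates. The honest parameter values, the poisoned vector, and the client count are invented for illustration. The coordinate-wise median is included only as a point of contrast; it is not a defense proposed in the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Nine hypothetical honest clients whose local models roughly agree.
honest = [np.array([1.0, -2.0]) + 0.05 * rng.normal(size=2) for _ in range(9)]

# One compromised client whose parameters were altered by an intruder.
poisoned = np.array([100.0, 100.0])

clean_mean = np.mean(honest, axis=0)
attacked_mean = np.mean(honest + [poisoned], axis=0)

# For contrast only: a coordinate-wise median over the same ten updates.
attacked_median = np.median(honest + [poisoned], axis=0)

print("mean without attacker:", clean_mean)
print("mean with attacker:   ", attacked_mean)
print("median with attacker: ", attacked_median)
```

With plain averaging, one client out of ten is enough to pull both parameters far from the honest consensus, which is the influence on the aggregated model that the paragraph describes.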
Given the privacy concerns and the need to safeguard users’ data, FL is a promising way forward. To get there, FL frameworks will need more efficient communication algorithms that can reliably transmit the same amount of information over lower-bandwidth channels.
Developers must also build security mechanisms into every node of the network to protect the devices that collect data and train models from malicious attacks. Beyond network communication, encryption, and security algorithms, FL also needs specialized ML algorithms that can adapt to the wide diversity of data needed to build improved aggregated systems.
FL is still in the research stage, with corresponding challenges and limitations. It also requires ML engineers to move away from the traditional regime of centralized learning and adopt new practices of distributed learning that work without direct access to data and labels. Because FL frameworks can help safeguard users and their devices from data leakage, FL is likely to make its way into more everyday products and applications.