DESCRIPTION:
The position is part of a new Marie Curie Training Network called FINALITY, in which Inria joins forces with top universities and industries, including IMDEA, KTH, TU Delft, the University of Avignon (Project Leader), the Cyprus Institute, Nokia, Telefonica, Ericsson, Orange, and others. The PhD students will have opportunities for internships with other academic and industry partners and will be able to participate in thematic summer schools and workshops organized by the project.
Only candidates who have spent less than one year in France during the last three years are eligible.
The candidate will receive a monthly living allowance of about €2,735, a mobility allowance of €414, and, if applicable, a family allowance of €458 (gross amounts).
Assigned mission
Federated Learning (FL) empowers a multitude of IoT devices, including mobile phones and sensors, to collaboratively train a global machine learning model while retaining their data locally [1,2]. A prominent example of FL in action is Google's Gboard, which uses an FL-trained model to predict subsequent user inputs on smartphones [3].
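For concreteness, the following is a minimal sketch of the federated averaging idea of [1], assuming model parameters are represented as NumPy arrays and that a generic `local_train` routine (hypothetical here) performs the on-device training; the server only ever aggregates model parameters, never raw data.

```python
import numpy as np

def federated_averaging(global_model, clients, local_train, num_rounds=10):
    """Minimal FedAvg loop: each client trains locally on its own data and
    the server aggregates only model parameters, never raw data."""
    for _ in range(num_rounds):
        local_models, sizes = [], []
        for client in clients:
            # local_train returns updated parameters; the data stays on the device
            local_models.append(local_train(np.copy(global_model), client["data"]))
            sizes.append(len(client["data"]))
        # weighted average of the local models (weights = local dataset sizes)
        weights = np.asarray(sizes, dtype=float) / np.sum(sizes)
        global_model = np.sum([w * m for w, m in zip(weights, local_models)], axis=0)
    return global_model
```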
Two primary challenges arise during the training phase of FL [4]:
Data Privacy: Ensuring user data remains confidential. Even though the data is kept locally by the devices, it has been shown that an honest-but-curious server can still reconstruct data samples [5,6], sensitive attributes [7,8], and the local model [9] of a targeted device. In addition, the server can conduct membership inference attacks [10] to identify whether a given data sample was involved in training, or source inference attacks to determine which device stores a given data sample [11].
Security Against Malicious Participants: Ensuring the learning process is not derailed by harmful actors. Recent research has demonstrated that, in the absence of protective measures, a malicious agent can degrade model performance by simply flipping the labels [12] and/or the sign of the gradient [13], and can even inject backdoors into the model [14] (backdoors are hidden vulnerabilities that can be exploited under conditions predefined by the attacker, such as specific inputs); a minimal sketch of such an attack is given after this list.
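To make the threat model concrete, the hypothetical sketch below illustrates a sign-flipping attacker of the kind mentioned above: honest devices report their true gradients, while a Byzantine device reports a scaled, sign-flipped gradient that plain averaging cannot filter out (the vectors and the scale factor are purely illustrative).

```python
import numpy as np

def honest_update(gradient):
    # an honest device reports its true local gradient
    return gradient

def sign_flipping_update(gradient, scale=3.0):
    # a Byzantine device reports its gradient with the sign flipped and amplified,
    # pushing the averaged update away from a descent direction
    return -scale * gradient

# With plain averaging and no defense, a single attacker can dominate the round.
grads = [np.array([0.5, -0.2]),
         np.array([0.4, -0.1]),
         sign_flipping_update(np.array([0.45, -0.15]))]
print(np.mean(grads, axis=0))  # the aggregate now points in the wrong direction
```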
Differentially private algorithms [15] have been proposed to tackle the challenge of protecting user privacy. These algorithms clip the gradients and add noise to them before transmission, ensuring that minor alterations in a user's training dataset are not discernible to potential adversaries [16,17,18,19,20]. By leveraging differentially private mechanisms, [19] shows that adversaries are unable to deduce the exact local information of vehicles in applications such as Uber. Furthermore, [20] demonstrates that the quality of data reconstruction attacks is significantly reduced when training a convolutional neural network on the CIFAR-10 dataset.
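A minimal sketch of the clip-and-noise step underlying such mechanisms is given below (a generic Gaussian-mechanism-style update, not the exact algorithms of the cited works; the clipping threshold and noise multiplier are illustrative):

```python
import numpy as np

def privatize_gradient(gradient, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip the local gradient to a maximum L2 norm, then add Gaussian noise
    calibrated to that norm before the gradient leaves the device."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(gradient)
    clipped = gradient * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=gradient.shape)
    return clipped + noise
```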
To enhance system security against adversarial threats, Byzantine-resilient mechanisms are implemented on the server side. These algorithms are designed to identify and mitigate potentially detrimental actions or inputs from users, ensuring that even if some participants act maliciously or erratically, the overall system remains functional and secure [21,22,23,24]. Experiments in [21] reveal that integrating such Byzantine-resilient mechanisms sustains neural network accuracy at 90.7% even when 10% of the agents maliciously flip labels on the MNIST dataset, whereas without such protection the accuracy drops significantly, to 77.3%.
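As an illustration of such server-side defenses, the sketch below implements two standard Byzantine-resilient aggregation rules, the coordinate-wise median and the trimmed mean (illustrative choices, not necessarily the exact rules studied in the cited works):

```python
import numpy as np

def coordinate_wise_median(updates):
    # a minority of malicious updates cannot move the per-coordinate median
    # arbitrarily far, unlike a plain average
    return np.median(np.stack(updates), axis=0)

def trimmed_mean(updates, trim_ratio=0.1):
    # discard the largest and smallest values in each coordinate, then average
    stacked = np.sort(np.stack(updates), axis=0)
    k = int(trim_ratio * stacked.shape[0])
    kept = stacked[k:stacked.shape[0] - k] if k > 0 else stacked
    return np.mean(kept, axis=0)
```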
Integrating differential privacy with Byzantine resilience presents a notable challenge. Recent research suggests that when these two security measures are combined in their current forms, the effectiveness of the resulting algorithm depends unfavorably on the number of parameters d of the machine learning model [25]. In particular, it requires either the batch size to grow linearly with the square root of d, or the proportion of malicious agents in the system to decrease at a rate inversely proportional to the square root of d. For a realistic model such as ResNet-50 (around 25 million parameters), the batch size would have to exceed 5000, which is clearly impractical. Novel Byzantine-resilient algorithms have recently been proposed to tackle this problem [26,27], but they incur a computational complexity of at least d^3 per communication round. Hence, there is a pressing need for innovative methods that can seamlessly integrate differential privacy and Byzantine resilience with low computational complexity in order to train practical neural networks.
Project objective
The goal of this PhD is to propose novel FL algorithms that effectively tackle these two intertwined challenges. In particular, we want to explore the potential of compression for FL training, as such techniques can greatly reduce the model dimension d and may thus provide a path towards a computation-efficient, private, and secure FL system.
Compression techniques were initially introduced to alleviate communication costs in distributed training, where only a fraction of the model parameters is sent from the device to the server in each communication round [28,29,30]. The primary objective of compression design is to ensure a communication-efficient machine learning/FL system by providing parameter-selection rules at the device side that optimize the trained model's performance under a given communication budget. [31,32] combined Byzantine-resilient methods with compression to obtain a communication-efficient and secure FL system. However, in these studies, even though devices transmit compressed models to the server, the Byzantine-resilient methods still operate on the full models of dimension d. Consequently, adopting these solutions to build a private and secure FL system still incurs a high computational load.
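As an illustration of such parameter-selection rules, the following sketch implements top-k sparsification, one widely used compression scheme (illustrative, not the specific schemes of the cited works): each device keeps only the k coordinates of its update with the largest magnitude and transmits their indices and values.

```python
import numpy as np

def top_k_compress(update, k):
    """Keep only the k entries of largest magnitude; the device transmits
    (indices, values) instead of the full d-dimensional update."""
    idx = np.argpartition(np.abs(update), -k)[-k:]
    return idx, update[idx]

def top_k_decompress(indices, values, d):
    # the server reconstructs a sparse d-dimensional update
    full = np.zeros(d)
    full[indices] = values
    return full
```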
The goal of this PhD is to investigate the impact of compression strategies on the trade-offs among privacy, robustness, computational efficiency, and model performance, with the aim of designing novel compression techniques for a computationally efficient, private, and secure federated learning system.
References
[1] McMahan et al., Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS 2017.