Imperial College London

Dr Stefan Vlaski

Faculty of Engineering, Department of Electrical and Electronic Engineering

Lecturer
Contact

s.vlaski

Location

810, Electrical Engineering, South Kensington Campus

Summary

Publications

63 results found

Hu P, Bordignon V, Vlaski S, Sayed AH et al., 2023, Optimal Aggregation Strategies for Social Learning Over Graphs, IEEE Transactions on Information Theory, Vol: 69, Pages: 6048-6070, ISSN: 0018-9448

Adaptive social learning is a useful tool for studying distributed decision-making problems over graphs. This paper investigates the effect of combination policies on the performance of adaptive social learning strategies. Using large-deviation analysis, it first derives a bound on the steady-state error probability and characterizes the optimal selection for the Perron eigenvectors of the combination policies. It subsequently studies the effect of the combination policy on the transient behavior of the learning strategy by estimating the adaptation time in the low signal-to-noise ratio regime. In the process, it is discovered that, interestingly, the influence of the combination policy on the transient behavior is insignificant, and thus it is more critical to employ policies that enhance the steady-state performance. The theoretical conclusions are illustrated by means of computer simulations.
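For intuition, the sketch below implements the adaptive social learning recursion that this analysis builds on: each agent performs a step-size-weighted Bayesian update on its private observation and then combines log-beliefs with its neighbors through a left-stochastic combination matrix A (whose Perron eigenvector the paper optimizes). The Gaussian likelihood model, uniform matrix A, and step-size delta are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Illustrative sketch of adaptive social learning (ASL) over a graph.
# Likelihood model, combination matrix and step-size are assumptions.

rng = np.random.default_rng(0)
N, H = 5, 3            # number of agents and hypotheses
delta = 0.1            # adaptation step-size
true_theta = 0         # index of the true hypothesis

# Left-stochastic combination matrix: A[l, k] is the weight agent k
# assigns to neighbor l; every column sums to one.
A = np.full((N, N), 1.0 / N)

means = np.array([0.0, 1.0, 2.0])   # hypothesis-dependent observation means

log_mu = np.zeros((N, H))           # per-agent log-beliefs
for _ in range(500):
    x = means[true_theta] + rng.normal(size=N)           # private observations
    log_lik = -0.5 * (x[:, None] - means[None, :]) ** 2  # Gaussian log-likelihoods
    log_psi = (1 - delta) * log_mu + delta * log_lik     # adaptive Bayesian update
    log_mu = A.T @ log_psi                               # geometric combination
    log_mu -= log_mu.max(axis=1, keepdims=True)          # renormalize for stability

beliefs = np.exp(log_mu)
beliefs /= beliefs.sum(axis=1, keepdims=True)
print("belief in the true hypothesis per agent:", beliefs[:, true_theta])
```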

Journal article

Rizk E, Vlaski S, Sayed AH, 2023, Privatized graph federated learning, EURASIP Journal on Advances in Signal Processing, Vol: 2023, ISSN: 1687-6180

Journal article

Ntemos K, Bordignon V, Vlaski S, Sayed AH et al., 2023, Self-Aware Social Learning Over Graphs, IEEE Transactions on Information Theory, Vol: 69, Pages: 5299-5317, ISSN: 0018-9448

Journal article

Vlaski S, Kar S, Sayed AH, Moura JMF et al., 2023, Networked Signal and Information Processing: Learning by multiagent systems, IEEE Signal Processing Magazine, Vol: 40, Pages: 92-105, ISSN: 1053-5888

Journal article

Bordignon V, Vlaski S, Matta V, Sayed AH et al., 2023, Learning from heterogeneous data based on social interactions over graphs, IEEE Transactions on Information Theory, Vol: 69, Pages: 3347-3371, ISSN: 0018-9448

This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions and arising from possibly different distributions. In the context of social learning, several useful strategies have been developed, which solve decision making problems through local cooperation across distributed agents and allow them to learn from streaming data. However, traditional social learning strategies rely on the fundamental assumption that each agent has significant prior knowledge of the underlying distribution of the observations. In this work we overcome this issue by introducing a machine learning framework that exploits social interactions over a graph, leading to a fully data-driven solution to the distributed classification problem. In the proposed social machine learning (SML) strategy, two phases are present: in the training phase, classifiers are independently trained to generate a belief over a set of hypotheses using a finite number of training samples; in the prediction phase, classifiers evaluate streaming unlabeled observations and share their instantaneous beliefs with neighboring classifiers. We show that the SML strategy enables the agents to learn consistently under this highly-heterogeneous setting and allows the network to continue learning even during the prediction phase when it is deciding on unlabeled samples. The prediction decisions are used to continually improve performance thereafter in a manner that is markedly different from most existing static classification schemes where, following training, the decisions on unlabeled data are not re-used to improve future performance.
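As a toy illustration of the two phases described above, the sketch below independently trains one scalar logistic classifier per agent on that agent's own feature (training phase), then fuses the agents' instantaneous log-belief ratios on a streaming unlabeled sample (prediction phase). The data model, classifier form, and uniform combination weights are assumptions chosen only for illustration.

```python
import numpy as np

# Toy two-phase sketch of the social machine learning (SML) idea.
# Data model, classifiers and combination weights are illustrative.

rng = np.random.default_rng(1)
N = 3                                   # agents, each sees one feature dimension

def train_logistic(x, y, steps=500, lr=0.5):
    """Fit a 1-D logistic classifier by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b

# Training phase: labelled samples, independent training per agent.
y = rng.integers(0, 2, size=400)
X = y[:, None] + rng.normal(scale=1.5, size=(400, N))
models = [train_logistic(X[:, k], y) for k in range(N)]

# Prediction phase: a streaming unlabeled sample; each agent forms a
# log-belief ratio and the network fuses them (uniform weights here).
x_new = 1.0 + rng.normal(scale=1.5, size=N)          # drawn from class 1
logits = np.array([w * x_new[k] + b for k, (w, b) in enumerate(models)])
print("network decision:", int(logits.mean() > 0))   # fused belief
```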

Journal article

Vlaski S, Kar S, Sayed AH, Moura JMF et al., 2023, Networked signal and information processing, IEEE Signal Processing Magazine, ISSN: 1053-5888

The article reviews significant advances in networked signal and information processing, which over the last 25 years have extended decision making and inference, optimization, control, and learning to the increasingly ubiquitous environments of distributed agents. As these interacting agents cooperate, new collective behaviors emerge from local decisions and actions. Moreover, and significantly, theory and applications show that networked agents, through cooperation and sharing, are able to match the performance of cloud or federated solutions, while offering the potential for improved privacy, increased resilience, and resource savings.

Journal article

Schroth CA, Vlaski S, Zoubir AM, 2023, Attacks on Robust Distributed Learning Schemes via Sensitivity Curve Maximization

Distributed learning paradigms, such as federated or decentralized learning, allow a collection of agents to solve global learning and optimization problems through limited local interactions. Most such strategies rely on a mixture of local adaptation and aggregation steps, either among peers or at a central fusion center. Classically, aggregation in distributed learning is based on averaging, which is statistically efficient, but susceptible to attacks by even a small number of malicious agents. This observation has motivated a number of recent works, which develop robust aggregation schemes by employing robust variations of the mean. We present a new attack based on sensitivity curve maximization (SCM), and demonstrate that it is able to disrupt existing robust aggregation schemes by injecting small, but effective perturbations.
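The core computation behind such an attack can be illustrated in a few lines: evaluate the sensitivity curve of a robust aggregator (the median here) over a grid of candidate injections and pick the most disruptive one. The scalar setting, grid search, and choice of aggregator are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

# Illustrative sensitivity-curve computation for a robust aggregator,
# in the spirit of sensitivity curve maximization (SCM) attacks.

rng = np.random.default_rng(2)
benign = rng.normal(size=20)              # benign agents' scalar updates

def sensitivity_curve(aggregate, samples, x):
    # Standardized effect of appending one contaminating sample x.
    n = len(samples)
    return (n + 1) * (aggregate(np.append(samples, x)) - aggregate(samples))

# A malicious agent scans candidate injections and picks the one that
# perturbs the aggregate the most.
candidates = np.linspace(-5.0, 5.0, 201)
scores = [sensitivity_curve(np.median, benign, c) for c in candidates]
x_attack = candidates[np.argmax(np.abs(scores))]
print("most disruptive injection:", x_attack)
```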

Conference paper

Rizk E, Vlaski S, Sayed AH, 2023, Local Graph-Homomorphic Processing for Privatized Distributed Systems, ISSN: 1520-6149

We study the generation of dependent random numbers in a distributed fashion in order to enable privatized distributed learning by networked agents. We propose a method that we refer to as local graph-homomorphic processing; it relies on the construction of particular noises over the edges to ensure a certain level of differential privacy. We show that the added noise does not affect the performance of the learned model. This is a significant improvement over previous works on differential privacy for distributed algorithms, where the noise was added in a less structured manner without respecting the graph topology, often leading to performance deterioration. We illustrate the theoretical results by considering a linear regression problem over a network of agents.
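The following sketch conveys the flavor of the edge-noise construction: pairwise antisymmetric noises perturb each exchanged message, yet cancel exactly in the network-wide average. This antisymmetric construction is a simplified stand-in for the paper's scheme, chosen only to illustrate why structured edge noise need not bias the learned model.

```python
import numpy as np

# Simplified illustration of dependent edge noise that cancels on aggregate.
# The antisymmetric construction is an assumption for demonstration.

rng = np.random.default_rng(3)
N = 4
w = rng.normal(size=N)                 # each agent's local model (scalar)

# Antisymmetric edge noises: n[k, l] = -n[l, k], so the total noise is zero.
upper = np.triu(rng.normal(size=(N, N)), k=1)
n = upper - upper.T

# Agent l averages the privatized messages w_k + n[k, l] from all agents.
messages = w[:, None] + n              # messages[k, l]: what k sends to l
averages = messages.mean(axis=0)

print("true average:       ", w.mean())
print("average of averages:", averages.mean())  # noise cancels network-wide
```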

Conference paper

Wang C, Vlaski S, 2023, Robust Network Topologies for Distributed Learning, ISSN: 1520-6149

The robustness of networks against malicious agents is a critical issue for their reliability in distributed learning. While a significant number of works in recent years have investigated the development of robust algorithms for distributed learning, few have examined the influence and design of the underlying network topology on robustness. Robust schemes for distributed learning typically require certain conditions on the arrangement of malicious agents in the network. In particular, the majority of neighbors of any benign agent must be benign, and the subgraph of benign agents must be connected. In this work, we propose a scheme for the design of such topologies based on prior information of the risk profile of participating agents. We show that the resulting topology is asymptotically almost surely connected and benign agents have majority benign neighborhoods. At the same time, the proposed design asymptotically tolerates a fraction of malicious agents arbitrarily close to one, while risk-agnostic designs, such as complete graphs, break down as soon as the majority of agents is malicious.

Conference paper

Wadehra S, Nassif R, Vlaski S, 2023, Exact Subspace Diffusion for Decentralized Multitask Learning, Pages: 6172-6179, ISSN: 0743-1546

Classical paradigms for distributed learning, such as federated or decentralized gradient descent, employ consensus mechanisms to enforce homogeneity among agents. While these strategies have proven effective in i.i.d. scenarios, they can result in significant performance degradation when agents follow heterogeneous objectives or data. Distributed strategies for multitask learning, on the other hand, induce relationships between agents in a more nuanced manner, and encourage collaboration without enforcing consensus. We develop a generalization of the exact diffusion algorithm for subspace constrained multitask learning over networks, and derive an accurate expression for its mean-squared deviation when utilizing noisy gradient approximations. We verify numerically the accuracy of the predicted performance expressions, as well as the improved performance of the proposed approach over alternatives based on approximate projections.
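For intuition, the sketch below implements the simpler projection-based variant that the paper improves upon: each agent takes a noisy local gradient step and the network vector is then projected onto a low-dimensional subspace encoding the multitask couplings. The quadratic costs and the two-dimensional subspace are illustrative assumptions, and the exact-diffusion correction studied in the paper is omitted.

```python
import numpy as np

# Subspace-constrained diffusion, approximate-projection variant.
# Costs and subspace are illustrative; no exact-diffusion correction.

rng = np.random.default_rng(4)
N, mu = 4, 0.05
targets = rng.normal(size=N)           # each agent's local minimizer

# Subspace basis U: consensus direction plus a "trend" across agents.
trend = np.arange(N) - np.arange(N).mean()
U = np.stack([np.ones(N) / np.sqrt(N), trend / np.linalg.norm(trend)], axis=1)
P = U @ U.T                            # orthogonal projector onto span(U)

w = np.zeros(N)
for _ in range(300):
    grad = (w - targets) + 0.1 * rng.normal(size=N)  # noisy local gradients
    w = P @ (w - mu * grad)            # adapt, then project onto the subspace

print("limit point:          ", w)
print("projection of targets:", P @ targets)  # what the network approaches
```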

Conference paper

Schroth CA, Vlaski S, Zoubir AM, 2023, Robust M-Estimation Based Distributed Expectation Maximization Algorithm with Robust Aggregation, ISSN: 1520-6149

Distributed networks are widely used in industrial and consumer applications. As the communication capabilities of such networks are usually limited, it is important to develop algorithms which are capable of processing the vast amounts of data locally and communicating only an aggregated value. Additionally, these algorithms have to be robust against outliers in the data, as well as faulty or malicious nodes. Thus, we propose a robust distributed expectation maximization (EM) algorithm based on Real Elliptically Symmetric (RES) distributions, which is highly adaptive to outliers and moreover is combined with a robust data aggregation step which provides robustness against malicious nodes. In the simulations, the proposed algorithm shows its effectiveness over non-robust methods.

Conference paper

Cao Y, Rizk E, Vlaski S, Sayed AH et al., 2023, Multi-Agent Adversarial Training Using Diffusion Learning, ISSN: 1520-6149

This work focuses on adversarial learning over graphs. We propose a general adversarial training framework for multi-agent systems using diffusion learning. We analyze the convergence properties of the proposed scheme for convex optimization problems, and illustrate its enhanced robustness to adversarial attacks.

Conference paper

Rizk E, Vlaski S, Sayed AH, 2023, Enforcing Privacy in Distributed Learning With Performance Guarantees, IEEE Transactions on Signal Processing, Vol: 71, Pages: 3385-3398, ISSN: 1053-587X

Journal article

Nassif R, Vlaski S, Carpentiero M, Matta V, Antonini M, Sayed AH et al., 2023, Quantization for Decentralized Learning Under Subspace Constraints, IEEE Transactions on Signal Processing, Vol: 71, Pages: 2320-2335, ISSN: 1053-587X

Journal article

Jegnell S, Vlaski S, 2022, Distributed relatively smooth optimization, 2022 IEEE 61st Conference on Decision and Control (CDC), Publisher: IEEE, Pages: 6511-6517

Smoothness conditions, either on the cost itself or its gradients, are ubiquitous in the development and study of gradient-based algorithms for optimization and learning. In the context of distributed optimization and multi-agent systems, smoothness conditions and gradient bounds are additionally central to controlling the effect of local heterogeneity. We deviate from this paradigm and study distributed learning problems in relatively smooth environments, where cost functions may grow faster than a quadratic, and gradients need not be bounded. We generalize gradient noise conditions to cover this setting, and present convergence guarantees in relatively smooth and relatively convex environments. Numerical results corroborate the findings.
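To make the setting concrete, here is a single-agent sketch of a Bregman (mirror-descent) step for a cost f(w) = w^4 that grows faster than a quadratic and so has no global gradient Lipschitz constant, yet is smooth relative to the reference function h(w) = w^4/4 + w^2/2. The scalar cost, reference function, and step-size are illustrative assumptions; the paper treats the distributed, stochastic version.

```python
import numpy as np

# Mirror-descent step under relative smoothness: solve
#   grad_h(w_next) = grad_h(w) - mu * grad_f(w)
# for f(w) = w**4 (relatively smooth w.r.t. h(w) = w**4/4 + w**2/2).

def grad_f(w):
    return 4.0 * w**3

def grad_h(w):
    return w**3 + w

def mirror_step(w, mu):
    target = grad_h(w) - mu * grad_f(w)
    # w**3 + w - target = 0 has exactly one real root (h is strictly convex).
    roots = np.roots([1.0, 0.0, 1.0, -target])
    return roots[np.argmin(np.abs(roots.imag))].real

w, mu = 3.0, 0.2
for _ in range(50):
    w = mirror_step(w, mu)
print("iterate after mirror descent:", w)   # approaches the minimizer 0
```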

Conference paper

Vlaski S, Sayed AH, 2022, Second-order guarantees of stochastic gradient descent in non-convex optimization, IEEE Transactions on Automatic Control, Vol: 67, Pages: 6489-6504, ISSN: 0018-9286

Recent years have seen increased interest in performance guarantees of gradient descent algorithms for non-convex optimization. A number of works have uncovered that gradient noise plays a critical role in the ability of gradient descent recursions to efficiently escape saddle-points and reach second-order stationary points. Most available works limit the gradient noise component to be bounded with probability one or sub-Gaussian and leverage concentration inequalities to arrive at high-probability results. We present an alternate approach, relying primarily on mean-square arguments and show that a more relaxed relative bound on the gradient noise variance is sufficient to ensure efficient escape from saddle-points without the need to inject additional noise, employ alternating step-sizes or rely on a global dispersive noise assumption, as long as a gradient noise component is present in a descent direction for every saddle-point.

Journal article

Vlaski S, Vandenberghe L, Sayed AH, 2022, Regularized diffusion adaptation via conjugate smoothing, IEEE Transactions on Automatic Control, Vol: 67, Pages: 2343-2358, ISSN: 0018-9286

The purpose of this work is to develop and study a decentralized strategy for Pareto optimization of an aggregate cost consisting of regularized risks. Each risk is modeled as the expectation of some loss function with unknown probability distribution while the regularizers are assumed deterministic, but are not required to be differentiable or even continuous. The individual, regularized, cost functions are distributed across a strongly-connected network of agents and the Pareto optimal solution is sought by appealing to a multi-agent diffusion strategy. To this end, the regularizers are smoothed by means of infimal convolution and it is shown that the Pareto solution of the approximate, smooth problem can be made arbitrarily close to the solution of the original, non-smooth problem. Performance bounds are established under conditions that are weaker than assumed before in the literature, and hence applicable to a broader class of adaptation and learning problems.
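As a one-dimensional illustration of the smoothing step, the infimal convolution of the regularizer R(w) = |w| with a scaled quadratic yields its Moreau envelope (the Huber function), which is differentiable everywhere; gradient descent on the smoothed cost then approaches the solution of the original non-smooth problem as the smoothing parameter shrinks. The specific cost, smoothing parameter, and step-size below are illustrative assumptions.

```python
import numpy as np

# Smoothing |w| by infimal convolution with a quadratic (Moreau envelope),
# then running plain gradient descent on the smoothed regularized cost.

def huber_grad(w, eps):
    # Gradient of the Moreau envelope of |w| with smoothing parameter eps.
    return np.clip(w / eps, -1.0, 1.0)

# Solve min_w 0.5*(w - 2)**2 + rho*|w| via the smoothed surrogate.
rho, eps, mu = 0.5, 0.01, 0.02
w = 0.0
for _ in range(1000):
    grad = (w - 2.0) + rho * huber_grad(w, eps)   # smoothed cost gradient
    w -= mu * grad

print("smoothed solution:", w)   # close to the exact soft-threshold value 1.5
```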

Journal article

Vlaski S, Schroth C, Muma M, Zoubir AM et al., 2022, Robust and efficient aggregation for distributed learning, 30th European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 817-821, ISSN: 2076-1465

Distributed learning paradigms, such as federated and decentralized learning, allow for the coordination of models across a collection of agents, without the need to exchange raw data. Instead, agents compute model updates locally based on their available data, and subsequently share the updated model with a parameter server or their peers. This is followed by an aggregation step, which traditionally takes the form of a (weighted) average. Distributed learning schemes based on averaging are known to be susceptible to outliers. A single malicious agent is able to drive an averaging-based distributed learning algorithm to an arbitrarily poor model. This has motivated the development of robust aggregation schemes, which are based on variations of the median and trimmed mean. While such procedures ensure robustness to outliers and malicious behavior, they come at the cost of significantly reduced sample efficiency. This means that current robust aggregation schemes require significantly higher agent participation rates to achieve a given level of performance than their mean-based counterparts in non-contaminated settings. In this work we remedy this drawback by developing statistically efficient and robust aggregation schemes for distributed learning.
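The trade-off is easy to demonstrate: the sketch below compares mean, median, and trimmed-mean aggregation of scalar agent updates when a single participant is malicious. The contamination model and trimming level are illustrative baselines; the schemes proposed in the paper are more refined than those shown here.

```python
import numpy as np

# Baseline comparison of aggregation rules under a single malicious agent.
# Contamination model and trimming level are illustrative.

rng = np.random.default_rng(5)
updates = rng.normal(loc=1.0, scale=0.1, size=10)  # benign updates near 1.0
updates[0] = 1e6                                   # one malicious agent

def trimmed_mean(x, trim=0.2):
    x = np.sort(x)
    k = int(trim * len(x))
    return x[k:len(x) - k].mean()

print("mean:        ", updates.mean())         # destroyed by one outlier
print("median:      ", np.median(updates))     # robust, less sample-efficient
print("trimmed mean:", trimmed_mean(updates))  # a robust compromise
```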

Conference paper

Kayaalp M, Vlaski S, Sayed A, 2022, Dif-MAML: Decentralized multi-agent meta-learning, IEEE Open Journal of Signal Processing, Vol: 3, Pages: 71-93, ISSN: 2644-1322

The objective of meta-learning is to exploit knowledge obtained from observed tasks to improve adaptation to unseen tasks. Meta-learners are able to generalize better when they are trained with a larger number of observed tasks and with a larger amount of data per task. Given the amount of resources that are needed, it is generally difficult to expect the tasks, their respective data, and the necessary computational capacity to be available at a single central location. It is more natural to encounter situations where these resources are spread across several agents connected by some graph topology. The formalism of meta-learning is actually well-suited for this decentralized setting, where the learner benefits from information and computational power spread across the agents. Motivated by this observation, we propose a cooperative fully-decentralized multi-agent meta-learning algorithm, referred to as Diffusion-based MAML or Dif-MAML. Decentralized optimization algorithms are superior to centralized implementations in terms of scalability, robustness, avoidance of communication bottlenecks, and privacy guarantees. The work provides a detailed theoretical analysis to show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective even in non-convex environments. Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
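A stripped-down sketch of the algorithmic structure follows: each agent performs an inner adaptation step on a sampled task, takes a local meta-gradient step (a first-order MAML approximation here), and then combines its meta-parameter with neighbors through a doubly-stochastic matrix, which constitutes the diffusion step. Scalar quadratic tasks, uniform combination weights, and the first-order approximation are illustrative simplifications of Dif-MAML.

```python
import numpy as np

# Simplified Dif-MAML-style loop: local adapt (inner step + meta step),
# then diffusion combine. Tasks and weights are illustrative.

rng = np.random.default_rng(6)
N, alpha, beta = 4, 0.1, 0.05
task_means = rng.normal(size=N)          # one task family per agent

A = np.full((N, N), 1.0 / N)             # doubly-stochastic combination matrix
w = np.zeros(N)                          # per-agent meta-parameter (scalar)

for _ in range(200):
    psi = np.empty(N)
    for k in range(N):
        target = task_means[k] + rng.normal(scale=0.1)   # sample a task
        adapted = w[k] - alpha * (w[k] - target)         # inner adaptation step
        meta_grad = adapted - target                     # first-order MAML gradient
        psi[k] = w[k] - beta * meta_grad                 # local meta update
    w = A @ psi                                          # diffusion combine step

print("meta-parameters:", w)   # agents agree near the average of the task means
```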

Journal article

Rizk E, Vlaski S, Sayed AH, 2022, Federated Learning Under Importance Sampling, IEEE TRANSACTIONS ON SIGNAL PROCESSING, Vol: 70, Pages: 5381-5396, ISSN: 1053-587X

Journal article

Nassif R, Bordignon V, Vlaski S, Sayed AH et al., 2022, Decentralized Learning in the Presence of Low-Rank Noise, 47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 5667-5671, ISSN: 1520-6149

Conference paper

Hu P, Bordignon V, Vlaski S, Sayed AH et al., 2022, Optimal Combination Policies for Adaptive Social Learning, 47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 5842-5846, ISSN: 1520-6149

Conference paper

Kayaalp M, Bordignon V, Vlaski S, Sayed AH et al., 2022, Hidden Markov Modeling over Graphs, IEEE Data Science and Learning Workshop (DSLW), Publisher: IEEE

Conference paper

Ntemos K, Bordignon V, Vlaski S, Sayed AH et al., 2022, Social Learning with Disparate Hypotheses, 30th European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 2171-2175, ISSN: 2076-1465

Conference paper

Nassif R, Vlaski S, Antonini M, Carpentiero M, Matta V, Sayed AH et al., 2022, Finite Bit Quantization for Decentralized Learning under Subspace Constraints, 30th European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 1851-1855, ISSN: 2076-1465

Conference paper

Shumovskaia V, Ntemos K, Vlaski S, Sayed AH et al., 2022, Explainability and Graph Learning From Social Interactions, IEEE Transactions on Signal and Information Processing over Networks, Vol: 8, Pages: 946-959, ISSN: 2373-776X

Journal article

Sayed AH, Vlaski S, 2021, Competing adaptive networks, IEEE Statistical Signal Processing Workshop (SSP) 2021, Publisher: IEEE, Pages: 71-75

Adaptive networks have the capability to pursue solutions of global stochastic optimization problems by relying only on local interactions within neighborhoods. The diffusion of information through repeated interactions allows for globally optimal behavior, without the need for central coordination. Most existing strategies are developed for cooperative learning settings, where the objective of the network is common to all agents. We consider in this work a team setting, where a subset of the agents form a team with a common goal, while competing with the remainder of the network. We develop an algorithm for decentralized competition among teams of adaptive agents, analyze its dynamics, and present an application in the decentralized training of generative adversarial neural networks.

Conference paper

Erginbas YE, Vlaski S, Sayed AH, 2021, Gramian-based adaptive combination policies for diffusion learning over networks, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 5215-5219

This paper presents an adaptive combination strategy for distributed learning over diffusion networks. Since learning relies on the collaborative processing of the stochastic information at the dispersed agents, the overall performance can be improved by designing combination policies that adjust the weights according to the quality of the data. Such policies are important because they would add a new degree of freedom and endow multi-agent systems with the ability to control the flow of information over their edges for enhanced performance. Most adaptive and static policies available in the literature optimize certain performance metrics related to steady-state behavior, to the detriment of transient behavior. In contrast, we develop an adaptive combination rule that aims at optimizing the transient learning performance, while maintaining the enhanced steady-state performance obtained using policies previously developed in the literature.

Conference paper

Rizk E, Vlaski S, Sayed AH, 2021, Optimal importance sampling for federated learning, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 3095-3099

Federated learning involves a mixture of centralized and decentralized processing tasks, where a server regularly selects a sample of the agents and these in turn sample their local data to compute stochastic gradients for their learning updates. The sampling of both agents and data is generally uniform; however, in this work we consider non-uniform sampling. We derive optimal importance sampling strategies for both agent and data selection and show that under convexity and Lipschitz assumptions, non-uniform sampling without replacement improves the performance of the original FedAvg algorithm. We run experiments on a regression and classification problem to illustrate the theoretical results.
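The sketch below illustrates the basic mechanism of importance sampling in the aggregation step: agents are drawn with non-uniform probabilities p_k and their contributions reweighted by 1/(N*K*p_k), which keeps the aggregate an unbiased estimate of the uniform average. Sampling here is with replacement for simplicity, whereas the paper analyzes sampling without replacement; the gradient-norm probabilities are an illustrative choice.

```python
import numpy as np

# Importance sampling of agents in one federated aggregation round.
# With-replacement sampling and gradient-norm probabilities are
# simplifying assumptions; the paper studies sampling without replacement.

rng = np.random.default_rng(7)
N, K = 10, 3                              # total agents, agents sampled per round
grads = rng.normal(size=(N, 2))           # local update vectors

p = np.linalg.norm(grads, axis=1)
p /= p.sum()                              # non-uniform sampling probabilities

chosen = rng.choice(N, size=K, replace=True, p=p)
estimate = sum(grads[k] / (N * K * p[k]) for k in chosen)  # unbiased reweighting

print("uniform average:       ", grads.mean(axis=0))
print("reweighted IS estimate:", estimate)
```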

Conference paper

Bordignon V, Vlaski S, Matta V, Sayed AH et al., 2021, Network classifiers based on social learning, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 5185-5189

This work proposes a new way of combining independently trained classifiers over space and time. Combination over space means that the outputs of spatially distributed classifiers are aggregated. Combination over time means that the classifiers respond to streaming data during testing and continue to improve their performance even during this phase. By doing so, the proposed architecture is able to improve prediction performance over time with unlabeled data. Inspired by social learning algorithms, which require prior knowledge of the distribution of the observations, we propose a Social Machine Learning (SML) paradigm that is able to exploit the imperfect models generated during the learning phase. We show that this strategy results in consistent learning with high probability, and it yields a robust structure against poorly trained classifiers. Simulations with an ensemble of feedforward neural networks are provided to illustrate the theoretical results.

Conference paper

