Historical Embedding-Guided Efficient Large-Scale Federated Graph Learning

Abstract

Graph convolutional networks (GCNs) are promising models for graph learning tasks. When the underlying graph data are distributed across multiple owners and must remain private, federated learning (FL)-based GCN (FedGCN) training is required. An important open challenge for FedGCN is scaling to large graphs, which typically incurs 1) high computation overhead from the explosive growth in the number of neighbors, and 2) high communication overhead from training GCNs across multiple FL clients. Neighbor sampling has therefore been studied to enhance the scalability of FedGCNs. However, existing FedGCN training techniques with neighbor sampling often still incur substantial communication and computation overhead and produce inaccurate node embeddings, leading to poor model performance. To bridge this gap, we propose the Federated Adaptive Attention-based Sampling (FedAAS) approach. It achieves substantial cost savings by efficiently leveraging historical embedding estimators and focusing the limited communication resources on transmitting the most influential neighbor node embeddings across FL clients. We further design an adaptive embedding synchronization scheme to optimize the efficiency and accuracy of FedAAS on large-scale datasets. Theoretical analysis shows that the approximation error induced by the staleness of historical embeddings is upper bounded, and that the model is guaranteed to converge efficiently. Extensive experimental evaluation against four state-of-the-art baselines on six real-world graph datasets shows that FedAAS achieves up to 5.12% higher test accuracy, while saving communication and computation costs by 95.11% and 94.76%, respectively.
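To make the core idea concrete, the sketch below (NumPy only) illustrates how the mechanisms described in the abstract could fit together: each client caches stale historical embeddings of cross-client neighbors, scores remote neighbors with a simple dot-product attention, and spends its communication budget refreshing only the top-k most influential embeddings each round. All names (HistoricalEmbeddingCache, fetch_fresh, k, etc.) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of historical-embedding-guided, attention-based selective
# synchronization. Hypothetical names; not the paper's reference code.
import numpy as np

class HistoricalEmbeddingCache:
    """Stale copies of remote-neighbor embeddings, refreshed selectively."""
    def __init__(self, dim):
        self.store = {}  # node_id -> last received embedding
        self.dim = dim

    def get(self, node_id):
        # Fall back to a zero vector before the first synchronization.
        return self.store.get(node_id, np.zeros(self.dim))

    def update(self, fresh):
        self.store.update(fresh)

def attention_scores(query, cache, neighbor_ids):
    """Dot-product attention between a local node and its remote neighbors."""
    keys = np.stack([cache.get(i) for i in neighbor_ids])
    logits = keys @ query
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

def aggregate(query, cache, neighbor_ids, fetch_fresh, k):
    """Mean-aggregate remote neighbors, fetching fresh embeddings only for
    the k most influential ones; the rest reuse stale cached copies."""
    scores = attention_scores(query, cache, neighbor_ids)
    top_k = [neighbor_ids[i] for i in np.argsort(scores)[-k:]]
    cache.update(fetch_fresh(top_k))  # the only cross-client communication
    msgs = np.stack([cache.get(i) for i in neighbor_ids])
    return msgs.mean(axis=0)

# Toy usage: 4 remote neighbors, refresh only the 2 highest-attention ones.
rng = np.random.default_rng(0)
dim, remote = 8, [10, 11, 12, 13]
true_emb = {i: rng.normal(size=dim) for i in remote}  # held by other clients
cache = HistoricalEmbeddingCache(dim)
h = aggregate(rng.normal(size=dim), cache, remote,
              lambda ids: {i: true_emb[i] for i in ids}, k=2)
print(h.shape)  # (8,)
```

An adaptive synchronization scheme, as the abstract describes, would additionally vary how often (or for how many neighbors) fetch_fresh is invoked, trading embedding staleness against communication cost.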

Publication
Association for Computing Machinery

Link: https://dl.acm.org/doi/10.1145/3654947

Luu Anh Tuan
Assistant Professor

My research interests lie in the intersection of Artificial Intelligence and Natural Language Processing.