Impact of Graph Structure on Membership-Inference Risk for Graph Neural Networks (arxiv.org)

arXiv:2601.17130v2
Abstract: Graph neural networks (GNNs) are widely used for tasks such as node classification and link prediction, but their use in sensitive settings raises concerns about training-data leakage. Prior work on privacy leakage in GNNs largely borrows assumptions from non-graph domains, overlooking the role of graph structure. We argue for a graph-specific analysis of privacy risk and study how graph structure affects node-level membership inference. We formalize membership inference (MI) over node-neighborhood tuples and investigate two important dimensions: (i) training-graph construction and (ii) inference-time edge access.
We compare snowball sampling, a structure-aware procedure, with uniform random node sampling for constructing training graphs.
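The two training-graph construction procedures named above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the breadth-first expansion rule, the restart behavior, and the toy ring graph are all assumptions made for this example.

```python
import random
from collections import deque


def uniform_node_sample(adj, k, seed=0):
    """Pick k nodes uniformly at random, ignoring edges entirely."""
    if k > len(adj):
        raise ValueError("k exceeds the number of nodes")
    rng = random.Random(seed)
    return set(rng.sample(sorted(adj), k))


def snowball_sample(adj, k, num_seeds=1, seed=0):
    """Grow a sample of k nodes by breadth-first expansion from random seeds.

    Assumption: one plausible variant of snowball sampling; the paper's
    exact procedure may differ (e.g. in how many neighbors are expanded).
    """
    if k > len(adj) or num_seeds > k:
        raise ValueError("need num_seeds <= k <= number of nodes")
    rng = random.Random(seed)
    nodes = sorted(adj)
    start = rng.sample(nodes, num_seeds)
    sampled = set(start)
    frontier = deque(start)
    while len(sampled) < k:
        if not frontier:
            # Reachable nodes are exhausted: restart from a fresh node.
            fresh = rng.choice([n for n in nodes if n not in sampled])
            sampled.add(fresh)
            frontier.append(fresh)
            continue
        node = frontier.popleft()
        neighbors = sorted(adj[node])
        rng.shuffle(neighbors)
        for nb in neighbors:
            if nb not in sampled:
                sampled.add(nb)
                frontier.append(nb)
                if len(sampled) == k:
                    break
    return sampled


# Hypothetical toy graph: a ring of 12 nodes stored as an adjacency dict.
adj = {i: {(i - 1) % 12, (i + 1) % 12} for i in range(12)}
train_random = uniform_node_sample(adj, 4)
train_snowball = snowball_sample(adj, 4)
```

On this toy ring, a snowball sample grown from a single seed is always one contiguous arc, whereas the uniform sample may be scattered around the ring. A sample that covers only one region of the graph is one way to picture the coverage bias attributed to snowball sampling.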
Our experiments show that snowball sampling often hurts generalization relative to random sampling due to its coverage bias. In contrast, allowing access to inter-train-test edges at inference improves test accuracy and reduces the train-test gap, while also having a strong, setting-dependent effect on membership advantage. These results show that graph structure directly shapes privacy risk. We further show that the generalization gap, measured as the performance difference between training and test nodes, is an incomplete proxy for membership inference risk: membership advantage can rise or fall independently of changes in this gap, with inference-time edge access often playing a crucial role. Theoretically, we show that for node-level tasks, standard privacy-auditing results based on membership inference do not directly carry over to inductive graph settings, because training and test nodes are structurally dependent rather than interchangeable. We release the code and data at https://github.com/PriXAI/GraphStructurePrivacyAnalysis-public.
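To make the distinction between the generalization gap and membership advantage concrete, here is a small sketch. It assumes the commonly used definition of membership advantage as the attacker's true-positive rate minus its false-positive rate; the paper's formalization over node-neighborhood tuples may differ. The threshold attack, the function names, and all numbers are hypothetical and chosen only for illustration.

```python
def generalization_gap(train_correct, test_correct):
    """Training accuracy minus test accuracy, from per-node 0/1 correctness."""
    train_acc = sum(train_correct) / len(train_correct)
    test_acc = sum(test_correct) / len(test_correct)
    return train_acc - test_acc


def membership_advantage(is_member, guessed_member):
    """True-positive rate minus false-positive rate of a membership attack.

    Assumption: the standard TPR - FPR definition, used here for illustration.
    """
    pairs = list(zip(is_member, guessed_member))
    n_members = sum(1 for m, _ in pairs if m)
    n_non_members = len(pairs) - n_members
    tpr = sum(1 for m, g in pairs if m and g) / n_members
    fpr = sum(1 for m, g in pairs if not m and g) / n_non_members
    return tpr - fpr


# Hypothetical toy values, not results from the paper.
train_conf = [0.9, 0.9, 0.8, 0.6]    # model confidence on training nodes
test_conf = [0.7, 0.6, 0.95, 0.5]    # model confidence on test nodes
train_correct = [1, 1, 1, 0]         # per-node correctness, training nodes
test_correct = [1, 1, 1, 0]          # per-node correctness, test nodes

# A simple threshold attack: guess "member" when confidence exceeds 0.75.
threshold = 0.75
is_member = [True] * len(train_conf) + [False] * len(test_conf)
guessed = [c > threshold for c in train_conf + test_conf]

gap = generalization_gap(train_correct, test_correct)
advantage = membership_advantage(is_member, guessed)
```

In this constructed example the accuracy gap is zero while the confidence-threshold attack still achieves a positive advantage, which shows how the two quantities can move independently.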