GAB is a benchmarking framework for evaluating adversarial attacks and defenses on Graph Neural Networks (GNNs) under standardized and rigorous experimental settings.
Our goal is to eliminate inconsistencies in prior evaluations by providing a unified framework that enables fair comparison across methods. We systematically re-evaluate widely used attacks and defenses under both poisoning and evasion scenarios across multiple graph datasets, revealing critical factors that significantly impact reported performance.
- 🧪 Unified benchmark for adversarial GNN evaluation across diverse settings
- 📊 Comprehensive study covering attacks, defenses, and multiple datasets
- 🔍 Reveals hidden factors (e.g., node selection, training procedures) affecting robustness
- ⚖️ Standardized protocols for fair and reproducible comparisons
- 🚀 Scalable experimental pipeline (400K+ runs) for robust analysis
To set up the environment, see `setup.md`.
To evaluate adversarial attacks and defenses, run a command such as:
python experiments/adversarial_attack_evaluation.py --model=GCN --adversarial=l1d_rnd_attack
- `num_run`: number of runs, each with a different seed.
- `num_split`: number of random splits per dataset used for evaluation.
- `purification`: purification method, selected from the list of purifications.
- `config_setting`: either `best_config` or `default`. `best_config` uses the best victim-model configuration found during model selection.
- `model`: name of the victim model, selected from the list of models.
- `adversarial`: name of the adversarial attack, selected from the list of adversarial attacks.
- `evasion` or `no-evasion`: whether to evaluate in the evasion setting (`True` or `evasion` by default).
- `poison` or `no-poison`: whether to evaluate in the poisoning setting (`True` or `poison` by default).
- `use-node-degree` or `no-use-node-degree`: whether to use node degree as an extra criterion when selecting target nodes (`True` or `use-node-degree` by default).
- `dataset`: dataset to run experiments on, selected from the list of supported datasets.
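The paired switches above (`evasion`/`no-evasion` and so on) read like on/off flags that default to on. The following is a minimal sketch of how such an interface could be declared with Python's `argparse`, assuming standard `--flag`/`--no-flag` pairs. The `build_parser` helper is hypothetical and is not taken from GAB's source, and the choice of `default` as the fallback for `config_setting` is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser only; NOT GAB's actual argument definitions."""
    parser = argparse.ArgumentParser(description="Sketch of the evaluation CLI")
    parser.add_argument("--model", required=True, help="victim model name")
    parser.add_argument("--adversarial", required=True, help="attack name")
    # The README names two settings; which one is the fallback is an assumption.
    parser.add_argument(
        "--config_setting", choices=["best_config", "default"], default="default"
    )
    # BooleanOptionalAction (Python 3.9+) registers both --<flag> and --no-<flag>.
    for flag in ("evasion", "poison", "use-node-degree"):
        parser.add_argument(
            f"--{flag}", action=argparse.BooleanOptionalAction, default=True
        )
    return parser


args = build_parser().parse_args(
    ["--model=GCN", "--adversarial=l1d_rnd_attack", "--no-poison"]
)
# Evasion and node-degree selection stay on; poisoning is switched off.
print(args.evasion, args.poison, args.use_node_degree)  # True False True
```

With this layout, omitting a switch keeps the setting enabled, and passing the `no-` form disables it, which matches the defaults described in the option list.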
For model selection, use `hyper_tuning_all_split.py` to run a hyperparameter search on the GNN backbone. For example:
python experiments/hyper_tuning_all_split.py --model=GCN --dataset=cora
- `dataset`: dataset to run experiments on.
- `model`: name of the victim model, selected from the list of models.
To run model selection for a purification method instead:
python experiments/hyper_tuning_filter_all_split.py --model=GCN --dataset=cora --purification=GARNET
- `dataset`: dataset to run experiments on.
- `model`: name of the victim model, selected from the list of models.
- `purification`: purification method, selected from the list of purifications.
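When tuning several datasets or purification methods, it can help to generate the commands rather than type them. The sketch below builds the two command forms shown above from their documented flags. The `tuning_command` helper is hypothetical, and whether every dataset and purification value is accepted by the scripts is an assumption; the names used here are taken from the tables in this README.

```python
import shlex
from itertools import product
from typing import List, Optional


def tuning_command(
    model: str, dataset: str, purification: Optional[str] = None
) -> List[str]:
    """Build the argv for a model-selection run (hypothetical helper)."""
    # Without a purification method, tune the GNN backbone alone;
    # otherwise use the purification-aware tuning script.
    script = (
        "experiments/hyper_tuning_all_split.py"
        if purification is None
        else "experiments/hyper_tuning_filter_all_split.py"
    )
    cmd = ["python", script, f"--model={model}", f"--dataset={dataset}"]
    if purification is not None:
        cmd.append(f"--purification={purification}")
    return cmd


# Print one command per (dataset, purification) combination.
for dataset, purification in product(["cora", "citeseer"], [None, "GARNET"]):
    print(shlex.join(tuning_command("GCN", dataset, purification)))
    # To actually launch a run: subprocess.run(cmd, check=True)
```

Returning the command as a list keeps it safe to pass to `subprocess.run` without shell quoting issues.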
| Method | Paper |
|---|---|
| GARNET | GARNET: reduced-rank topology learning for robust and scalable graph neural networks |
| Jaccard | Adversarial Examples on Graph Data: Deep Insights into Attack and Defense |
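To give a sense of what a purification method does, here is a minimal sketch of the idea behind Jaccard purification: edges between nodes whose binary feature sets have low Jaccard similarity are removed before training. This is a simplified illustration in plain Python, not the implementation used in GAB; the function names, the toy graph, and the threshold value are all assumptions.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity of two sets of active feature indices."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def purify_edges(
    edges: List[Edge], features: Dict[int, Set[int]], threshold: float
) -> List[Edge]:
    """Keep only edges whose endpoints are at least `threshold` similar."""
    return [
        (u, v) for u, v in edges
        if jaccard(features[u], features[v]) >= threshold
    ]


# Toy graph: each node maps to the indices of its non-zero binary features.
features = {0: {1, 2, 3}, 1: {2, 3, 4}, 2: {7, 8}}
edges = [(0, 1), (1, 2), (0, 2)]

# Nodes 0 and 1 share 2 of 4 features (similarity 0.5); node 2 shares none.
print(purify_edges(edges, features, threshold=0.1))  # [(0, 1)]
```

The intuition is that adversarially inserted edges tend to connect dissimilar nodes, so filtering on feature similarity removes many of them at the cost of dropping some genuine edges.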
| Dataset Name |
|---|
| citeseer |
| citeseer_full |
| cora |
| cora_ml |
| cora_full |
| amazon_cs |
| amazon_photo |
| coauthor_cs |
| coauthor_phy |
| polblogs |
| karate_club |
| pubmed |
| flickr |
| blogcatalog |
| dblp |
| acm |
| uai |
| pdn |
| Roman-empire |
| Amazon-ratings |
| Minesweeper |
| Tolokers |
| Questions |
| chameleon |
| crocodile |
| squirrel |
OGB datasets are also supported.
GOttack requires orbit discovery to be precomputed on the dataset. PyORCA is available in `utility`; see the PyORCA GitHub repository for setup instructions.
Our adversarial benchmark is built upon GreatX and DeepRobust. We appreciate their contributions to graph adversarial learning.