KGATE is a Knowledge Graph Autoencoder Training Environment designed to make knowledge graph embedding and model prototyping easy.
This repository is a benchmark that compares the performance and running time of KGATE against other libraries, namely PyTorch Geometric, TorchKGE and PyKEEN. At the time this benchmark was made, PyTorch Geometric and PyKEEN were the only known libraries still maintained, but TorchKGE is included as well since KGATE is mostly built upon it.
To run the benchmark, clone the repository and install the dependencies in a virtual environment.
git clone https://github.com/BAUDOTlab/KGATE_benchmark
cd KGATE_benchmark
python -m venv benchmark_venv
source benchmark_venv/bin/activate
pip install -r requirement.txt
Call the script benchmark.py with the following possible arguments:
- --dataset, -d DATASET: Choose a dataset between FB15k-237 (default) and WN18RR.
- --model, -m MODEL: Choose a decoder model between TransE (default), DistMult and ComplEx.
- --preload, -p: Use saved knowledge graph object for KGATE and TorchKGE, significantly reducing data loading time. No effect on PyTorch Geometric and PyKEEN.
- --no-clean, -nc: Don't run the data cleaning procedure of KGATE.
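As an illustration of how the options above fit together, here is a minimal argparse sketch. It is an assumption about the shape of the command-line interface, not the actual parser in benchmark.py: the function name build_parser and the help strings are hypothetical, and only the flags, choices and defaults are taken from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options of benchmark.py."""
    parser = argparse.ArgumentParser(description="KGATE benchmark (illustrative sketch)")
    parser.add_argument(
        "--dataset", "-d",
        choices=["FB15k-237", "WN18RR"],
        default="FB15k-237",
        help="Dataset to benchmark on.",
    )
    parser.add_argument(
        "--model", "-m",
        choices=["TransE", "DistMult", "ComplEx"],
        default="TransE",
        help="Decoder model to train.",
    )
    parser.add_argument(
        "--preload", "-p",
        action="store_true",
        help="Reuse a saved knowledge graph object (KGATE and TorchKGE only).",
    )
    parser.add_argument(
        "--no-clean", "-nc",
        action="store_true",
        help="Skip the data cleaning procedure of KGATE.",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-d", "WN18RR", "-m", "ComplEx"])
print(args.dataset, args.model, args.preload, args.no_clean)
```

With no arguments, this sketch falls back to FB15k-237 and TransE, matching the documented defaults.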
Other hyperparameters can be changed in the constructor of the interface.Benchmark class. Their default values are:
- Embedding dimensions: 256
- Margin: 0.5
- Negative sampler: Bernoulli
- Number of negative samples per positive: 5
- Training batch size: 4096
- Epochs: 1000
- Learning rate: 0.001
- Evaluation interval: 50 epochs
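To make the defaults above concrete, the following sketch gathers them in a small dataclass and works out how many evaluation passes they imply. The class name BenchmarkDefaults and all field names are hypothetical and do not reflect the real signature of interface.Benchmark. The sketch also assumes that evaluation runs at the end of every 50th epoch, starting with epoch 50.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BenchmarkDefaults:
    """Hypothetical container for the documented default hyperparameters."""
    emb_dim: int = 256
    margin: float = 0.5
    negative_sampler: str = "Bernoulli"
    n_negatives: int = 5          # negative samples per positive triple
    batch_size: int = 4096
    epochs: int = 1000
    learning_rate: float = 0.001
    eval_interval: int = 50       # epochs between two evaluations

    def evaluation_epochs(self) -> List[int]:
        """Epochs after which an evaluation would run (assumed schedule)."""
        return list(range(self.eval_interval, self.epochs + 1, self.eval_interval))


defaults = BenchmarkDefaults()
# 1000 epochs with one evaluation every 50 epochs gives 20 evaluation passes.
print(len(defaults.evaluation_epochs()))

# A shorter, hypothetical run for quick experiments.
quick = BenchmarkDefaults(epochs=100, eval_interval=25)
print(quick.evaluation_epochs())
```

Under these assumptions, a default run evaluates 20 times, so evaluation cost can be a noticeable part of the measured duration.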
Examples to run the benchmark:
# Dataset: FB15k-237, decoder: TransE
python benchmark.py
# Dataset: WN18RR, decoder: ComplEx
python benchmark.py -d WN18RR -m ComplEx
This work is licensed under the MIT License.
This repository is a reproducible benchmark comparing KGATE to other currently maintained KGE libraries and does not expect any new contributions. It is not exhaustive, however, and additional libraries may be added for a fuller comparison.
Coming soon.