zheinz/AAMARL---AdvancedTopicsFinal


Adversarial Attacks Against Multi-Agent Reinforcement Learning - AdvancedTopicsFinal

Code and artifacts for my Advanced Topics in Computer Science final project. Includes training scripts, model checkpoints, rendering, and result generation.

What’s Here

  • trainModel.py – trains the models used in the report
  • renderGame.py – loads a checkpoint and renders a game; supports applying permutation/noise “attacks”
  • results.py – recreates the graphs for the paper (update directories to match your machine)
  • Models/ – checkpoints for Models 1, 2, and 3
  • Results/ – results for baselines, noise, and permutation experiments

Usage

Train a model

python3 trainModel.py

Play a game

renderGame.py plays a game with a trained model. Before running it, set the checkpoint path in the source code to one of the checkpoints in Models/. To apply the permutation and/or noise attacks, uncomment the corresponding lines in the script.

python3 renderGame.py

Produce result graphs

results.py recreates the graphs used in the final paper. The directories in the script must be updated to match your machine and the location where the results are saved.

python3 results.py

Explanation of Results

The Models directory contains the checkpoints for Models 1, 2, and 3, which were used to produce the results in the report.

The results directory contains three subdirectories:

  • baselines
  • noise
  • permutation

Baselines contains the baselines for all three models. Noise and permutation contain the results of the noise and permutation attacks. File names follow the format (model, attack, noise amount/permutation probability): M1N05 is Model 1, noise amount 5%, and M2P90 is Model 2, permutation probability 90%.

Each file holds the results of 100 games for one configuration, alternating between reward and number of cycles: line 1 is the reward for game 1, line 2 is the number of cycles for game 1, line 3 is the reward for game 2, line 4 is the number of cycles for game 2, and so forth. The averages in the report were calculated from these values.
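The naming format and file layout described above can be read with a few lines of Python. This is a minimal sketch, not the code in results.py: the helper names `parse_config_name` and `summarise_results` are hypothetical, and it assumes each line of a results file holds a single number and that names carry a single-digit model number.

```python
import re
from statistics import mean


def parse_config_name(name):
    """Decode a name such as 'M1N05' into (model, attack, amount in percent).

    Assumption: names match M<model digit><N or P><amount>, as in the
    examples M1N05 and M2P90. Strip any file extension before calling.
    """
    match = re.fullmatch(r"M(\d)([NP])(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised result name: {name!r}")
    model, attack, amount = match.groups()
    attack_name = {"N": "noise", "P": "permutation"}[attack]
    return int(model), attack_name, int(amount)


def summarise_results(lines):
    """Return (mean reward, mean cycles) from alternating reward/cycle lines.

    Assumption: each non-empty line holds exactly one number.
    """
    values = [float(line) for line in lines if line.strip()]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values (reward, cycles)")
    rewards = values[0::2]  # lines 1, 3, 5, ... are rewards
    cycles = values[1::2]   # lines 2, 4, 6, ... are cycle counts
    return mean(rewards), mean(cycles)
```

To use it on a real file, pass the lines of the file, for example `summarise_results(open(path).read().splitlines())`, where `path` points to one of the files under Results/ on your machine.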
