SCIMAI-Gym

Authors Information

TITLE: SCIMAI Gym
AUTHORS: Francesco Stranieri
INSTITUTION: University of Milano-Bicocca/Polytechnic University of Turin
EMAIL: francesco.stranieri@unimib.it

Requirements

To install (and import) the necessary libraries, run the section:

Environment Setup

The code was tested on:

Python 3.7
Gym 0.19.0
Ray 1.5.2
Ax 0.2.1
Matplotlib 3.4.3
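
To check whether a local setup matches the versions listed above, something like the following sketch may help. The PyPI distribution names (in particular `ax-platform` for Ax) and the helper `compare_versions` are assumptions made for this illustration; they are not part of the notebook.

```python
# Sketch: compare installed package versions with the tested ones.
try:  # Python 3.8 and later
    from importlib.metadata import PackageNotFoundError, version as installed_version
except ImportError:  # Python 3.7 fallback via setuptools
    import pkg_resources

    PackageNotFoundError = pkg_resources.DistributionNotFound

    def installed_version(name):
        return pkg_resources.get_distribution(name).version


# Tested versions from the list above; the keys are assumed PyPI names.
TESTED = {
    "gym": "0.19.0",
    "ray": "1.5.2",
    "ax-platform": "0.2.1",
    "matplotlib": "3.4.3",
}


def compare_versions(tested):
    """Map each package to (tested_version, installed_version_or_None)."""
    result = {}
    for name, wanted in tested.items():
        try:
            found = installed_version(name)
        except PackageNotFoundError:
            found = None  # package is not installed
        result[name] = (wanted, found)
    return result


if __name__ == "__main__":
    for name, (wanted, found) in compare_versions(TESTED).items():
        status = "ok" if found == wanted else "differs"
        print(f"{name}: tested {wanted}, installed {found} ({status})")
```

A mismatch does not necessarily mean the notebook will fail; it only flags where the local setup departs from the tested one.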

Supply Chain Environment

To set up the Supply Chain Environment, run the section:

Reinforcement Learning Classes

📋 To change the configuration of the Supply Chain Environment (such as the number of product types, the number of distribution warehouses, costs, or capacities), edit the sub-section:

Supply Chain Environment Class

📋 To change the global parameters (such as the seed for reproducibility, the number of episodes for the simulations, or the directory in which plots are saved), edit (and run) the section:

Global Parameters

Then, to actually initialize the Supply Chain Environment, run the section:

Supply Chain Environment Initialization

❗️ The output of this section will have the following format (verify that the values are the same as the ones you defined):

--- SupplyChainEnvironment --- __init__
product_types_num is 1
distr_warehouses_num is 1
T is 25
d_max is [10]
d_var is [2]
sale_prices is [15]
production_costs is [5]
storage_capacities is [[ 5]
 [10]]
storage_costs is [[2]
 [1]]
transportation_costs is [[0.25]]
penalty_costs is [22.5]
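
One way to verify the values is to check that the array shapes agree with the number of product types and warehouses. The sketch below mirrors the output above in a plain dictionary. The shape rules are an assumption inferred from that output (storage arrays have one row for the factory plus one per distribution warehouse, transportation costs have one row per warehouse), and `check_shapes` is a hypothetical helper, not a method of the environment class.

```python
# Sketch: sanity-check a configuration shaped like the printed output.
config = {
    "product_types_num": 1,
    "distr_warehouses_num": 1,
    "T": 25,
    "d_max": [10],
    "d_var": [2],
    "sale_prices": [15],
    "production_costs": [5],
    "storage_capacities": [[5], [10]],
    "storage_costs": [[2], [1]],
    "transportation_costs": [[0.25]],
    "penalty_costs": [22.5],
}


def check_shapes(cfg):
    """Return a list of shape problems (an empty list means consistent)."""
    products = cfg["product_types_num"]
    warehouses = cfg["distr_warehouses_num"]
    problems = []

    # Per-product vectors: one entry per product type.
    for key in ("d_max", "d_var", "sale_prices", "production_costs", "penalty_costs"):
        if len(cfg[key]) != products:
            problems.append(f"{key}: expected {products} entries")

    # Assumed layout: factory row first, then one row per warehouse.
    for key in ("storage_capacities", "storage_costs"):
        rows = cfg[key]
        if len(rows) != warehouses + 1 or any(len(r) != products for r in rows):
            problems.append(f"{key}: expected {warehouses + 1} x {products}")

    # Assumed layout: one row per warehouse, one column per product.
    rows = cfg["transportation_costs"]
    if len(rows) != warehouses or any(len(r) != products for r in rows):
        problems.append(f"transportation_costs: expected {warehouses} x {products}")

    return problems


if __name__ == "__main__":
    print(check_shapes(config) or "configuration shapes look consistent")
```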

Finally, to define some fundamental methods (such as the operational simulator or the plotting methods), run the section:

Methods

Baselines

To assess the DRL algorithms' performance, we established two different baselines. To initialize the Oracle and the (s, Q)-policy, run the sections:

Oracle
(s, Q)-Policy Class
(s, Q)-Policy Config [Ax]
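
For readers unfamiliar with the (s, Q)-policy, the sketch below illustrates the general idea: whenever the stock level falls below the reorder point s, a fixed quantity Q is ordered. It is a simplified single-product, single-location illustration with zero lead time and hypothetical function names. The class in the notebook may differ (for example, in how it handles capacities or several warehouses), and in the notebook the values of s and Q are tuned with Ax rather than fixed by hand.

```python
# Sketch: a minimal (s, Q) reorder rule, not the notebook's class.
def sq_order(stock_level, s, Q):
    """Order Q units when stock is below the reorder point s, else nothing.

    Conventions vary on whether the trigger is "below" or "at or below" s;
    "below" is used here.
    """
    return Q if stock_level < s else 0


def simulate(demands, s, Q, initial_stock=0):
    """Apply the rule period by period; negative stock represents backlog."""
    stock = initial_stock
    history = []
    for demand in demands:
        order = sq_order(stock, s, Q)
        stock = stock + order - demand  # order arrives at once (assumption)
        history.append((order, stock))
    return history


if __name__ == "__main__":
    # Hypothetical demand sequence and parameters.
    for period, (order, stock) in enumerate(simulate([3, 4, 2, 5], s=4, Q=6, initial_stock=5)):
        print(f"t={period}: ordered {order}, end-of-period stock {stock}")
```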

📋 To change the (s, Q)-policy parameters (such as the total trials for the optimization or the number of episodes for each trial), edit the sub-section:

Parameters [Ax]

Finally, to define some fundamental methods (such as the methods for the Bayesian Optimization (BO) training or the plotting methods), run the section:

(s, Q)-Policy Methods [Ax]

Train BO Agents

To train the BO agents, run the section:

(s, Q)-Policy Optimize [Ax]

DRL Config

To change the DRL algorithms' parameters (such as the training episodes or the grace period for the ASHA scheduler), edit (and run) the sub-section:

Parameters [Tune]
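
To give an intuition for the grace period, the sketch below shows the basic idea behind ASHA-style early stopping: no trial is stopped before the grace period, and after that trials are compared at geometrically spaced checkpoints ("rungs"), where only the top fraction continues. This is a simplified illustration with hypothetical names and values; Ray Tune's scheduler differs in its details.

```python
# Sketch: rungs and the keep/stop decision in successive halving.
def asha_rungs(grace_period, max_t, reduction_factor):
    """Iterations at which trials are compared: grace_period * factor**k, below max_t."""
    rungs = []
    t = grace_period
    while t < max_t:
        rungs.append(t)
        t *= reduction_factor
    return rungs


def keep_trial(score, scores_at_rung, reduction_factor):
    """True if score is within the top 1/reduction_factor at this rung.

    Higher scores are better; reduction_factor is assumed to be an integer.
    """
    ranked = sorted(scores_at_rung + [score], reverse=True)
    keep = max(1, len(ranked) // reduction_factor)
    return score >= ranked[keep - 1]


if __name__ == "__main__":
    # Hypothetical settings: no trial is stopped before iteration 10.
    print(asha_rungs(grace_period=10, max_t=100, reduction_factor=3))
    print(keep_trial(5.0, [1.0, 2.0, 3.0], reduction_factor=2))
```

A longer grace period gives slow-starting configurations more time before they can be stopped, at the price of more compute per trial.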

📋 To change the DRL algorithms' hyperparameters (such as the neural network structure, the learning rate, or the batch size), edit (and run) the sub-sections:

Algorithms [Tune]
A3C Config [Tune]
PG Config [Tune]
PPO Config [Tune]

Finally, to define some fundamental methods (such as the methods for the DRL agents' training or the plotting methods), run the section:

Reinforcement Learning Methods [Tune]

Train DRL Agents

To train the DRL agents, run the section:

Reinforcement Learning Train Agents [Tune]

❗️ We provide the checkpoints of the best training instance for each approach and experiment, which can be used as pre-trained models. For example, the checkpoint for Exp 1 of the 1P3W scenario with the A3C algorithm is available at /Paper_Results_1P3W/1P3W/Exp_1/1P3W_2021-09-22_15-55-24/ray_results/A3C_2021-09-22_19-56-24/A3C_SupplyChain_2a2cf_00024_24_grad_clip=20.0,lr=0.001,fcnet_hiddens=[64, 64],rollout_fragment_length=100,train_batch_size=2000_2021-09-22_22-34-50/checkpoint_000286/checkpoint-286.
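
Because the checkpoint paths are long, it can be convenient to locate them programmatically after extracting a results archive. The sketch below assumes the layout shown in the example path (a `checkpoint_<n>` folder containing a `checkpoint-<n>` file); `find_checkpoints` is a hypothetical helper, not part of the notebook.

```python
# Sketch: list checkpoint files under an extracted results directory.
from pathlib import Path


def find_checkpoints(results_root):
    """Return checkpoint files below results_root, sorted by iteration."""
    root = Path(results_root)
    found = [
        path
        for path in root.rglob("checkpoint-*")
        if path.is_file()
        and path.suffix == ""  # skip companions such as *.tune_metadata
        and path.parent.name.startswith("checkpoint_")
    ]
    return sorted(found, key=lambda path: int(path.name.split("-")[1]))


if __name__ == "__main__":
    # Hypothetical location of the extracted archive.
    for checkpoint in find_checkpoints("Paper_Results_1P3W"):
        print(checkpoint)
```

In Ray 1.x, a path found this way can typically be passed to a trainer's `restore` method, provided the trainer is built with the same environment and configuration used for training.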

Results

To output (and save) the performance (in terms of cumulative profit) and the training time (in minutes) of the DRL algorithms, run the section:

Final Results

❗️ We save the plots of the best training instance for each approach and experiment. For example, the plots for Exp 1 of the 1P3W scenario are available at /Paper_Results_1P3W/1P3W/Exp_1/1P3W_2021-09-22_15-55-24/plots.

The results you obtain should be comparable with those reported in the paper. For example, for the 1P1W scenario, we achieve the following performance (cumulative profit):

Experiment  A3C      PPO      VPG      BO       Oracle
Exp 1       870±67   1213±68  885±66   1226±71  1474±45
Exp 2       1066±94  1163±66  1100±77  1224±60  1289±68
Exp 3       −36±74   195±43   12±61    101±50   345±18
Exp 4       1317±60  1600±62  883±95   1633±39  2046±37
Exp 5       736±45   838±58   789±51   870±67   966±55
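
To compare your own results against these numbers, it can help to express each approach as a fraction of the Oracle's profit. The sketch below uses only the central values from the table above (the ± terms are left out) and hypothetical helper names.

```python
# Sketch: relate each approach to the Oracle using the 1P1W table values.
RESULTS = {
    "Exp 1": {"A3C": 870, "PPO": 1213, "VPG": 885, "BO": 1226, "Oracle": 1474},
    "Exp 2": {"A3C": 1066, "PPO": 1163, "VPG": 1100, "BO": 1224, "Oracle": 1289},
    "Exp 3": {"A3C": -36, "PPO": 195, "VPG": 12, "BO": 101, "Oracle": 345},
    "Exp 4": {"A3C": 1317, "PPO": 1600, "VPG": 883, "BO": 1633, "Oracle": 2046},
    "Exp 5": {"A3C": 736, "PPO": 838, "VPG": 789, "BO": 870, "Oracle": 966},
}


def share_of_oracle(row):
    """Profit of each approach divided by the Oracle's profit."""
    oracle = row["Oracle"]
    return {name: value / oracle for name, value in row.items() if name != "Oracle"}


def best_approach(row):
    """Name of the non-Oracle approach with the highest profit."""
    candidates = {name: value for name, value in row.items() if name != "Oracle"}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    for experiment, row in RESULTS.items():
        shares = ", ".join(f"{n} {s:.0%}" for n, s in share_of_oracle(row).items())
        print(f"{experiment}: best is {best_approach(row)} ({shares})")
```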
