This Git repository contains the initial steps towards a refactored framework for claims reserving using GRUs and FFNNs. It builds upon the prior work of MLR WP members but seeks to add improved structure and SHAP explainability features to the codebase.
It is a step in that direction, but by no means the finished article. The codebase is likely to contain errors and inefficiencies, and we welcome contributions from the community to enhance it further.
There are two main scripts in this repository, located in the root of the `02_code` folder:

- `GRU_framework` - 3 files built upon the GRU code created by Sarah MacDonnell
- `NN_vs_GBM` - 3 files built upon the code created by Jacky Poon and Davide Ruffini
For both scripts, the structure is similar; there is:

- a close-to-original version of the code (e.g. `GRU_framework_orig.ipynb`)
- 2 variants of a refactored version with improved structure and SHAP explainability (e.g. `GRU_framework_NJC.py`, `GRU_framework_NJC.ipynb`)
There is no intentional difference between the 2 variants; they simply provide both a script and a notebook version for users' convenience.
- `01_data/` - folder used to store data downloaded and processed by the scripts
- `02_code/` - folder containing the main scripts and supporting modules
- `02_code/configs` - folder containing YAML configuration files that store parameters used by the main scripts
- `02_code/logs` - folder storing Excel log files generated by the scripts to record artifacts from the modelling process
- `02_code/runs` - folder storing TensorBoard log files generated by the scripts to record artifacts from the modelling process
- `02_code/utils` - supporting code library of re-usable functions imported into the main scripts
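As an illustration, a configuration file in `02_code/configs` might look like the sketch below. The keys shown here are hypothetical, not the repository's actual parameter names; consult the YAML files in the folder for the real schema.

```yaml
# Hypothetical example of a run configuration - actual keys may differ
model:
  hidden_size: 64        # GRU hidden state dimension
  num_layers: 2
training:
  epochs: 100
  learning_rate: 0.001
  device: cpu            # GPU training is possible but not comprehensively tested
logging:
  shap_log_every: 10     # how often (in epochs) to log SHAP values to TensorBoard
```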
- Training-time explanations: SHAP values can be logged to TensorBoard during training at configurable intervals
- Post-training analysis: Comprehensive SHAP explanations after model training
- Multiple visualization types: Feature importance, summary, partial dependence and waterfall plots
- Modular structure: Separated concerns into focused modules
- Configuration management: Centralized parameter management
- Improved readability: Steps made towards better structure and documentation
- SHAP visualizations: Feature importance and explanation plots
- Organized logging: Better structured metrics and plots
- Configurable frequency: Control how often SHAP explanations are generated
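The configurable-frequency behaviour amounts to a simple gating check inside the training loop, so the (relatively expensive) SHAP computation only runs at chosen intervals. A minimal sketch in plain Python follows; the function name and config key are hypothetical, not the repository's actual API:

```python
def should_log_shap(epoch: int, shap_log_every: int) -> bool:
    """Return True on epochs where SHAP explanations should be generated.

    `shap_log_every` is a hypothetical config value giving the logging
    interval in epochs; 0 disables training-time SHAP logging entirely.
    """
    return shap_log_every > 0 and epoch % shap_log_every == 0


# Inside a training loop, the check gates the SHAP computation:
for epoch in range(1, 101):
    # ... train for one epoch ...
    if should_log_shap(epoch, shap_log_every=25):
        # compute SHAP values here and write them to TensorBoard,
        # e.g. via shap.DeepExplainer and SummaryWriter (omitted)
        pass
```

Keeping the gate in a small standalone function makes the interval easy to drive from the YAML config and simple to unit-test.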
The code has been developed and tested on a Linux machine using Python 3.13.11, with PyTorch for model training and scikit-learn modelling pipelines. The dependencies are listed in the `requirements.txt` file located in the root of the `02_code` folder.
For NN training I've used CPU only; the config file can be amended to use a GPU, but I have been unable to comprehensively test GPU training functionality.
There are complex Python library dependencies, so I would recommend setting up a new virtual environment and installing the dependencies as set out in `requirements.txt`. Alternatively, if using `uv`, then `uv sync` is set up to install the Python version and dependencies.
To view the TensorBoard logs generated by the scripts, from the root directory run the following command in your terminal:
```
tensorboard --logdir=./02_code/runs/
```
Then open your web browser and navigate to http://localhost:6006/ to access the TensorBoard interface.