arthurpmotta02/README.md

Arthur Pontes Motta

Actuarial Science & Statistics — UFRJ

LinkedIn GitHub CNPq

B.Sc. Actuarial Science & Statistics · UFRJ · 2021–2026

Research: Stochastic Processes with Renewal · IM/UFRJ/CNPq · 2024–2026

Actuarial pricing · Credibility theory · Survival analysis · Time series & state space models · Bayesian inference · Reinsurance analytics · Pension fund valuation · Loss reserving · Extreme Value Theory · ALM


Stack

Actuarial & Statistical Modelling

R Python Stan StMoMo lifecontingencies insurancerating ChainLadder survival flexsurv forecast dlm KFAS tidyverse ggplot2

Machine Learning & Data Science

pandas numpy scikit-learn XGBoost SHAP GLM EVT Cox PH SARIMA DLM

Deploy & Visualization

Streamlit Plotly Dash Quarto Power BI Docker ggplot2

Infrastructure

Git GitHub Jupyter VSCode PostgreSQL


Projects

Mortality forecasting for a Brazilian EFPC using Bühlmann-Straub credibility and full Bayesian inference. Individual age-level data (116 ages, 417k person-years, 3,658 deaths, 2012–2014). PELT-Poisson changepoint detection for objective risk segmentation (6 breakpoints, largest regime shift 7.06×). Credibility factors ω̂ᵢ > 0.999 across all groups. BS outperforms AT-2000, BR-EMS and Pub-2010 in 3 of 4 age bands. Poisson-Gamma Stan model (4 chains × 30k iter, R̂ < 1.001): full predictive distribution with 95% CI [1,144; 1,322] for 2014 vs. 1,335 observed. Interactive Quarto report published on GitHub Pages.

Report PDF

R Stan Quarto Bühlmann-Straub PREVIC
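
As a rough illustration of the Bühlmann-Straub step described above, the following pure-Python sketch computes credibility factors and credibility-weighted rates from grouped rates and exposures. It is not the project's R/Stan code: the function name `buhlmann_straub` and the toy figures are assumptions made for this example, and the estimators are the standard nonparametric ones.

```python
def buhlmann_straub(rates, weights):
    """rates[i][j], weights[i][j]: observed rate and exposure for group i, period j."""
    n_groups = len(rates)
    w_i = [sum(ws) for ws in weights]
    # Exposure-weighted mean rate per group and overall
    xbar_i = [sum(w * x for w, x in zip(ws, xs)) / wi
              for xs, ws, wi in zip(rates, weights, w_i)]
    w_tot = sum(w_i)
    xbar = sum(wi * xi for wi, xi in zip(w_i, xbar_i)) / w_tot

    # Expected process variance (within-group variation)
    dof = sum(len(xs) - 1 for xs in rates)
    s2 = sum(w * (x - xi) ** 2
             for xs, ws, xi in zip(rates, weights, xbar_i)
             for x, w in zip(xs, ws)) / dof

    # Variance of hypothetical means (between-group variation), floored at zero
    c = w_tot - sum(wi ** 2 for wi in w_i) / w_tot
    between = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w_i, xbar_i))
    a = max((between - (n_groups - 1) * s2) / c, 0.0)

    if a == 0.0:
        # No detectable heterogeneity: every group gets the collective rate
        z = [0.0] * n_groups
        mu = xbar
    else:
        k = s2 / a
        z = [wi / (wi + k) for wi in w_i]
        mu = sum(zi * xi for zi, xi in zip(z, xbar_i)) / sum(z)

    estimates = [zi * xi + (1 - zi) * mu for zi, xi in zip(z, xbar_i)]
    return z, estimates


# Toy data (hypothetical, not the EFPC data): 2 age groups, 3 years each
toy_rates = [[0.010, 0.012, 0.011], [0.030, 0.028, 0.031]]
toy_exposure = [[1000, 1200, 1100], [500, 600, 550]]
z, estimates = buhlmann_straub(toy_rates, toy_exposure)
```

With large exposures and clearly separated groups, the factors approach 1, which is the same qualitative pattern as the ω̂ᵢ > 0.999 reported above.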


Complete survival analysis of the flchain cohort (7,871 individuals, 2,166 deaths, 14.3 yr follow-up) investigating the association between serum free light chain (FLC) and all-cause mortality. Kaplan-Meier and Nelson-Aalen estimators by FLC group, sex and age band. Cox proportional hazards model selected via the Collett 4-step procedure, stepAIC and sequential LRT: HR 2.04 (95% CI 1.73–2.40) for high vs. low FLC after adjustment for age, sex, creatinine and MGUS; Harrell's C = 0.788. Parametric AFT comparison across 6 distributions: generalized gamma selected (ΔAIC > 50 over all alternatives, Q̂ ≈ 1.57, 95% CI 1.38–1.75); under this model, expected survival time is ~42% shorter in the high FLC group. Five Cox diagnostics (Schoenfeld, Martingale, dfbetas, Deviance, C-statistic). Interactive Quarto report published on GitHub Pages.

Report

R survival flexsurv Quarto Cox AFT
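
To make the Kaplan-Meier step concrete, here is a minimal pure-Python sketch of the product-limit estimator. The project itself uses the R `survival` package; the function name `kaplan_meier` and the five-subject toy sample are assumptions made for this example only.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    events[i] is 1 for an observed death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect everyone whose follow-up ends exactly at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Toy sample (hypothetical): follow-up times in years, 1 = death, 0 = censored
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored subjects leave the risk set without triggering a drop in the curve, which is why the estimate only changes at death times.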


Full SARIMA analysis of the Keeling Curve (468 monthly observations, 1959–1997). STL decomposition, ADF/KPSS stationarity tests, ACF/PACF identification. Seven candidate models compared by AIC, AICc and BIC — SARIMA(1,1,1)(0,1,1)₁₂ selected by parsimony (ΔAICc < 0.3 vs. nearest competitor). All diagnostics passed: Ljung-Box p = 0.41 (h = 48), Shapiro-Wilk p = 0.53, Jarque-Bera p = 0.38. 24-month forecast for 1998–1999 with 95% CI width growing from ±0.5 to ±2.0 ppm (<0.6% relative error). Regression + ARMA(1,1) alternative benchmarked (ΔAIC = 104). Extended in Part 2 with Dynamic Linear Models. Interactive Quarto report published on GitHub Pages.

Report

R forecast Quarto SARIMA STL
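
The d = 1, D = 1 part of SARIMA(1,1,1)(0,1,1)₁₂ amounts to applying (1 − B)(1 − B¹²) to the series, and the model ranking relies on the small-sample correction in AICc. The sketch below shows both in plain Python; the synthetic series is hypothetical (it is not the Mauna Loa data) and the helper names are assumptions for this example.

```python
import math


def difference(series, lag=1):
    """Apply (1 - B^lag) to a list of observations."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def aicc(aic, n_params, n_obs):
    """AIC with the standard small-sample correction."""
    return aic + 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)


# Hypothetical monthly series: linear trend plus a fixed 12-month cycle
series = [300 + 0.1 * t + 3 * math.sin(2 * math.pi * t / 12) for t in range(48)]

# Seasonal difference removes the cycle, regular difference removes the trend
stationary = difference(difference(series, lag=12), lag=1)
```

On this noise-free toy series the double difference is numerically zero; on real data the remainder is what the ARMA terms are fitted to.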


Bayesian state-space analysis of the Keeling Curve using Dynamic Linear Models (same dataset as Part 1). Three DLM formulations compared (dlmModSeas, dlmModTrig, KFAS) — Model B (dlmModTrig, J = 6 Fourier harmonics, 13 states) selected by log-likelihood (205.42) and as the only model satisfying both white-noise (Ljung-Box p = 0.21, h = 12) and normality (Shapiro-Wilk p = 0.56) assumptions on innovations. Kalman filter and backward smoother recover the latent level μₜ and growth rate β̂ₜ, revealing acceleration from ~0.8 to ~1.5 ppm/yr (1960–1997) — inaccessible to SARIMA. Discount factor approach (δ_T = 0.95, δ_S = 0.98) implemented from scratch for unknown V. 24-month forecasts align with SARIMA within 0.5 ppm across all horizons. Interactive Quarto report published on GitHub Pages.

Report

R dlm KFAS Quarto DLM Discount
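
The discount-factor idea can be sketched on the simplest possible DLM, a local level model, where the evolution variance is never specified directly and the prior variance is instead inflated by 1/δ at each step. This is a deliberately reduced illustration: it assumes a known observation variance (the project handles unknown V), a single scalar state rather than 13, and a function name chosen for this example.

```python
def discount_local_level(ys, delta=0.95, obs_var=1.0, m0=0.0, c0=1e6):
    """Kalman filter for y_t = mu_t + v_t with a discount factor on the level.

    Assumes obs_var (V) is known; returns the filtered means of mu_t.
    """
    m, c = m0, c0
    means = []
    for y in ys:
        r = c / delta        # prior variance, inflated by the discount factor
        q = r + obs_var      # one-step-ahead forecast variance
        gain = r / q         # adaptive coefficient (Kalman gain)
        m = m + gain * (y - m)
        c = gain * obs_var   # posterior variance, equal to r - gain**2 * q
        means.append(m)
    return means


# Hypothetical constant signal: the filtered level should settle on it
filtered = discount_local_level([5.0] * 50)
```

A smaller δ discounts past information faster, so the filtered level tracks recent observations more closely at the cost of a noisier estimate.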


Actuarial data quality pipeline for Brazilian EFPC pension funds (PREVIC Resolution 7/2022 and CPA 017/2019 IBA). Automates 19 regulatory validations across 3 participant populations (active, beneficiaries, deferred), classifying issues as CRITICAL or ALERT. Processes 930-participant base in ~2 seconds vs hours of manual Excel work. Outputs: formatted actuarial Excel report, Plotly Dash dashboard (4 pages, dark theme, Docker deploy), and a full Power BI PBIP/PBIR project generated entirely by code (47 JSON files via TMDL).

Python Plotly Dash Power BI Docker PREVIC
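
The rule-based classification into CRITICAL and ALERT can be pictured with a small sketch. The three rules, the `Participant` fields and the function name `validate` are all hypothetical and chosen for illustration; they are not the 19 regulatory validations the pipeline actually implements.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Participant:
    id: str
    birth: date
    enrolment: date
    monthly_salary: float


def validate(p, today):
    """Return a list of (severity, message) pairs for one participant record."""
    issues = []
    # Hypothetical rule 1: impossible chronology blocks the valuation
    if p.enrolment < p.birth:
        issues.append(("CRITICAL", "enrolment date precedes birth date"))
    # Hypothetical rule 2: a non-positive salary cannot be projected
    if p.monthly_salary <= 0:
        issues.append(("CRITICAL", "non-positive salary"))
    # Hypothetical rule 3: implausible but possible values are only flagged
    age = (today - p.birth).days / 365.25
    if age > 100:
        issues.append(("ALERT", "age above 100, confirm record"))
    return issues


ref_date = date(2024, 12, 31)
clean = Participant("A1", date(1980, 5, 1), date(2005, 3, 1), 8500.0)
broken = Participant("A2", date(1990, 1, 1), date(1985, 1, 1), 0.0)
clean_issues = validate(clean, ref_date)
broken_issues = validate(broken, ref_date)
```

Keeping each rule as an independent check makes it straightforward to count issues by severity per population, which is the kind of summary a report or dashboard would display.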


End-to-end pricing pipeline for auto insurance using real Brazilian market data (SUSEP AUTOSEG 2019–2021). Collision and theft coverages modelled separately with GLM Poisson (frequency) and GLM Gamma (severity), benchmarked against XGBoost Tweedie (Gini = 0.241 collision, 0.402 theft). SHAP explainability. Interactive Streamlit deploy.

Python GLM XGBoost SHAP SUSEP Streamlit
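
In a frequency-severity setup with log links, the pure premium for a risk profile is the product of the expected claim count and the expected claim size. The sketch below shows that arithmetic only; the coefficients and rating factors are hypothetical and are not fitted to the SUSEP AUTOSEG data.

```python
import math

# Hypothetical log-link coefficients (not estimated from any real data)
FREQ_COEF = {"intercept": -2.5, "young_driver": 0.40, "urban": 0.25}
SEV_COEF = {"intercept": 8.0, "young_driver": 0.10, "urban": -0.05}


def linear_predictor(coef, features):
    """Intercept plus the sum of coefficient * feature value."""
    return coef["intercept"] + sum(coef[name] * value for name, value in features.items())


def pure_premium(features, exposure=1.0):
    """Expected loss cost: Poisson frequency times Gamma severity, both with log link."""
    frequency = exposure * math.exp(linear_predictor(FREQ_COEF, features))
    severity = math.exp(linear_predictor(SEV_COEF, features))
    return frequency * severity


base = pure_premium({})
young = pure_premium({"young_driver": 1})
```

Because both links are logarithmic, each rating factor acts as a multiplicative relativity: here the hypothetical young-driver relativity is exp(0.40 + 0.10).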


Reinsurance analytics pipeline on French Motor TPL data (freMTPL2, 678k policies). Extreme Value Theory (GPD, Hill estimator) for tail modelling; treaty structuring across Quota Share, XL and Aggregate Stop Loss; differential evolution optimization achieving 20.1% capital relief on VaR 99.5% annual aggregate. Streamlit dashboard with 5 interactive pages.

Python EVT GPD Reinsurance Streamlit
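
Two building blocks named above lend themselves to a short sketch: the Hill estimator of the tail index and the payoff functions of the three treaty types. The function names and the toy losses are assumptions for this example; the optimization and capital calculations of the project are not reproduced here.

```python
import math


def hill_estimator(losses, k):
    """Hill estimate of the tail index from the k largest losses."""
    xs = sorted(losses, reverse=True)
    if not 0 < k < len(xs):
        raise ValueError("k must satisfy 0 < k < number of losses")
    threshold = xs[k]  # the (k+1)-th largest loss
    return sum(math.log(x / threshold) for x in xs[:k]) / k


def quota_share(loss, share):
    """Reinsurer pays a fixed proportion of every loss."""
    return share * loss


def excess_of_loss(loss, retention, limit):
    """Reinsurer pays the part of a single loss above the retention, capped at the limit."""
    return min(max(loss - retention, 0.0), limit)


def stop_loss(aggregate_loss, attachment, limit):
    """Same layer logic as XL, applied to the annual aggregate instead of a single loss."""
    return min(max(aggregate_loss - attachment, 0.0), limit)


# Toy losses (hypothetical) spaced so that the log-excesses are 3, 2 and 1
toy_losses = [1.0, math.exp(1), math.exp(2), math.exp(3)]
tail_index = hill_estimator(toy_losses, k=3)
```

The choice of k trades bias against variance, so in practice the estimate is usually inspected across a range of k values rather than read off at a single one.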


Full actuarial valuation of a Brazilian Defined Benefit plan. Lee-Carter mortality projection to 2065 via StMoMo; Projected Unit Credit method via lifecontingencies: PMBaC R$16.7M, PMBC R$353.8M. Longevity sensitivity: +1 yr of life expectancy = +0.7% liability. ALM: liability duration 18.5 yr vs. NTN-B portfolio 9.3 yr. Interest rate stress of ±200 bp. Streamlit dashboard.

R Python StMoMo lifecontingencies ALM Streamlit
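
The duration gap in the ALM step can be illustrated with a small sketch of Macaulay duration and a first-order rate shock. The cash flows and the 5% discount rate are hypothetical, and the sketch assumes the quoted durations are Macaulay durations, which the text above does not state. The linear approximation ignores convexity, so it is only indicative for a shock as large as 200 bp.

```python
def macaulay_duration(cash_flows, rate):
    """Present-value-weighted average time of (time_in_years, amount) cash flows."""
    present_values = [(t, amount / (1 + rate) ** t) for t, amount in cash_flows]
    total = sum(pv for _, pv in present_values)
    return sum(t * pv for t, pv in present_values) / total


def first_order_change(duration, rate, shock):
    """Approximate relative change in present value for a parallel rate shock."""
    modified = duration / (1 + rate)
    return -modified * shock


# Hypothetical check: a single payment in 10 years has duration 10
single_payment = macaulay_duration([(10, 100.0)], rate=0.05)

# Hypothetical 5% rate with the durations quoted above and a +200 bp shock
liability_move = first_order_change(18.5, rate=0.05, shock=0.02)
asset_move = first_order_change(9.3, rate=0.05, shock=0.02)
```

Under these assumptions the liabilities react roughly twice as strongly as the assets to the same shock, which is the mismatch the duration comparison is meant to expose.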


Pinned

  1. credibilidade-mortalidade-efpc (Public)

    Mortality forecasting for an EFPC via Bühlmann-Straub and Poisson-Gamma inference (Stan). Credibility Theory, UFRJ 2026/1

    HTML

  2. pension-fund-actuarial-analysis (Public)

    Actuarial valuation of a Brazilian BD pension plan: Lee-Carter (StMoMo), lifecontingencies, ALM and Streamlit dashboard

    Jupyter Notebook

  3. co2-mauna-loa-dlm (Public)

    CO₂ atmospheric concentration at Mauna Loa (1959–1997) modelled with Dynamic Linear Models: Kalman filter, backward smoothing, discount factors and 24-month forecasts. Comparison with SARIMA from …

    HTML

  4. co2-mauna-loa-sarima (Public)

    SARIMA analysis of the Mauna Loa atmospheric CO₂ series (1959–1997): identification, fitting, residual diagnostics and forecasting for 1998–1999.

    HTML

  5. cadastral-actuarial-pipeline (Public)

    Python pipeline that automates the registration-data review of pension funds (EFPC) in line with PREVIC Res. 7/2022 and CPA 017/2019 IBA. 19 regulatory checks, actuarial Excel report, dashboard Plotl…

    Python

  6. market-risk-dashboard (Public)

    Python