Add estimate_dist() with primarycensored support#1317
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughAdds a new exported function Changes
Possibly related issues
Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
52612b5 to
73be744
Compare
Co-Authored-By: sbfnk_bot Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Co-Authored-By: sbfnk_bot Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
- Move primarycensored Stan functions to inst/stan/primarycensored/ - Update estimate_dist.stan to include vendored functions - Add estimate_dist to pre-compiled stanmodels - Simplify R code to use pre-compiled model instead of runtime compilation - Fix tests to check dist_spec structure correctly Co-Authored-By: sbfnk_bot Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Co-Authored-By: sbfnk_bot Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
73be744 to
87d3308
Compare
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
65835d3 to
e6d1979
Compare
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
…ored-019U6v1DGjxbeXVpt1eCe7U2' into claude/estimate-dist-primarycensored-019U6v1DGjxbeXVpt1eCe7U2
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
…ored-019U6v1DGjxbeXVpt1eCe7U2' into claude/estimate-dist-primarycensored-019U6v1DGjxbeXVpt1eCe7U2
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
…ored-019U6v1DGjxbeXVpt1eCe7U2' into claude/estimate-dist-primarycensored-019U6v1DGjxbeXVpt1eCe7U2
Co-authored-by: sbfnk-bot <242615673+sbfnk-bot@users.noreply.github.com>
Shape and rate parameters in the gamma distribution are correlated and poorly identified individually. The mean (shape/rate) is recoverable, so the test now checks that instead.
Co-Authored-By: Sam Abbott <contact@samabbott.co.uk>
Co-Authored-By: Sam Abbott <contact@samabbott.co.uk>
Co-Authored-By: Sam Abbott <contact@samabbott.co.uk>
Use primarycensored::pcd_stan_dist_id for distribution ID lookup instead of hard-coded switch. Add default priors for normal and exp. Suppress noisy Stan warnings in tests. Co-Authored-By: Sam Abbott <contact@samabbott.co.uk>
|
Main branch estimate_dist benchmark failed (likely due to interface changes). |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
tests/testthat/test-estimate_dist.R (1)
264-274: Add an error pattern for consistency.This test lacks an error message pattern, unlike the other validation tests. Adding a pattern ensures the test catches the correct error rather than any error.
♻️ Proposed fix
expect_error( - estimate_dist(linelist, dist = "weibull") + estimate_dist(linelist, dist = "weibull"), + "Unsupported distribution" )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/testthat/test-estimate_dist.R` around lines 264 - 274, The test calls estimate_dist(linelist, dist = "weibull") but uses expect_error without a pattern; update the assertion to include a regexp that matches the specific error message thrown by estimate_dist for unsupported distributions (e.g. match the phrase used in estimate_dist like "unsupported distribution" or the exact error text it emits) so the test asserts the correct error is raised rather than any error.inst/dev/benchmark-estimate-dist.R (1)
21-23: Consider making the benchmark fully reproducible.The seed list at line 23 uses
sample()without a priorset.seed(), so re-running the script produces different seeds each time. This may be intentional for stochastic benchmarking, but if reproducibility across runs is desired, move line 26'sset.seed(12345)before line 23.♻️ Proposed fix for full reproducibility
## Configuration +set.seed(12345) n_iter <- 5 seeds <- sample(.Machine$integer.max, n_iter) ## Simulate censored delay data -set.seed(12345) n_obs <- 400🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@inst/dev/benchmark-estimate-dist.R` around lines 21 - 23, The script generates stochastic seeds with seeds <- sample(.Machine$integer.max, n_iter) but does not set a RNG seed first; to make the benchmark fully reproducible, call set.seed(12345) (or another chosen constant) before computing seeds (i.e., move the existing set.seed(12345) so it runs prior to the seeds <- sample(...) line) so n_iter and seeds are deterministically generated across runs.inst/stan/estimate_dist.stan (1)
29-30: Update the dist_id comment to reflect all supported distributions.The comment states "1=lognormal, 2=gamma" but the R code also supports normal and exponential distributions via
primarycensored::pcd_stan_dist_id(). Consider updating the comment to match the actual supported set or simply reference the primarycensored documentation.📝 Proposed comment update
- // Distribution ID: 1=lognormal, 2=gamma + // Distribution ID: see primarycensored::pcd_stan_dist_id() + // Supported: lognormal, gamma, normal, exp int<lower=1> dist_id;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@inst/stan/estimate_dist.stan` around lines 29 - 30, The comment for the dist_id variable declaration in the Stan file does not document all supported distributions. Update the comment above the int<lower=1> dist_id parameter to include all four supported distribution types (lognormal, gamma, normal, and exponential) with their corresponding ID values, or alternatively reference the primarycensored::pcd_stan_dist_id() documentation to keep the comment concise and maintainable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/render-estimate-dist.yaml:
- Around line 10-12: Concurrency group currently uses github.head_ref which is
empty for non-pull_request events; update the concurrency.group expression to
fallback to github.ref when github.head_ref is empty so non-PR triggers have
distinct groups (use the GitHub Actions expression fallback pattern with
github.head_ref || github.ref), keeping cancel-in-progress behavior unchanged.
In `@R/estimate_dist.R`:
- Around line 269-280: The switch handling for init_params in estimate_dist.R
only sets "lognormal" and "gamma", leaving init_params NULL for "normal" and
"exp" and causing pmax/pmin to fail; update the switch(dist, ...) to include MoM
initialisations: for "normal" set init_params to c(mean = wmean, sd =
sqrt(wvar)), and for "exp" set init_params to the exponential parameter using
MoM (rate = 1 / wmean or the parameterization your model uses) so init_params
matches the parameter ordering expected by stan_data; keep the existing clamping
to stan_data$params_lower and stan_data$params_upper and return init_fn <-
function() list(params = init_params) as before.
---
Nitpick comments:
In `@inst/dev/benchmark-estimate-dist.R`:
- Around line 21-23: The script generates stochastic seeds with seeds <-
sample(.Machine$integer.max, n_iter) but does not set a RNG seed first; to make
the benchmark fully reproducible, call set.seed(12345) (or another chosen
constant) before computing seeds (i.e., move the existing set.seed(12345) so it
runs prior to the seeds <- sample(...) line) so n_iter and seeds are
deterministically generated across runs.
In `@inst/stan/estimate_dist.stan`:
- Around line 29-30: The comment for the dist_id variable declaration in the
Stan file does not document all supported distributions. Update the comment
above the int<lower=1> dist_id parameter to include all four supported
distribution types (lognormal, gamma, normal, and exponential) with their
corresponding ID values, or alternatively reference the
primarycensored::pcd_stan_dist_id() documentation to keep the comment concise
and maintainable.
In `@tests/testthat/test-estimate_dist.R`:
- Around line 264-274: The test calls estimate_dist(linelist, dist = "weibull")
but uses expect_error without a pattern; update the assertion to include a
regexp that matches the specific error message thrown by estimate_dist for
unsupported distributions (e.g. match the phrase used in estimate_dist like
"unsupported distribution" or the exact error text it emits) so the test asserts
the correct error is raised rather than any error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 00c3eff2-a91f-40cc-9765-00cc3f09c910
⛔ Files ignored due to path filters (4)
vignettes/estimate-dist-plot-delays-1.pngis excluded by!**/*.pngvignettes/estimate-dist-prior-predictive-1.pngis excluded by!**/*.pngvignettes/estimate-dist-recovery-density-1.pngis excluded by!**/*.pngvignettes/estimate-dist-recovery-params-1.pngis excluded by!**/*.png
📒 Files selected for processing (25)
.github/workflows/check-primarycensored.yaml.github/workflows/render-estimate-dist.yaml.github/workflows/stan-model-benchmark.yamlNAMESPACENEWS.mdR/dist_spec.RR/estimate_delay.RR/estimate_dist.RR/get.RR/summarise.R_pkgdown.ymlinst/dev/benchmark-estimate-dist.Rinst/dev/vendor-primarycensored.Rinst/stan/estimate_dist.stanman/dot-get_dist_id.Rdman/dot-get_param_names.Rdman/dot-prepare_linelist_data.Rdman/estimate_delay.Rdman/estimate_dist.Rdman/get_parameters.Rdman/summary.estimate_dist.Rdtests/testthat/test-estimate_dist.Rvignettes/estimate-dist.Rmdvignettes/estimate-dist.Rmd.origvignettes/library.bib
✅ Files skipped from review due to trivial changes (10)
- man/dot-get_dist_id.Rd
- man/get_parameters.Rd
- man/dot-get_param_names.Rd
- vignettes/library.bib
- man/summary.estimate_dist.Rd
- man/dot-prepare_linelist_data.Rd
- man/estimate_delay.Rd
- vignettes/estimate-dist.Rmd.orig
- vignettes/estimate-dist.Rmd
- man/estimate_dist.Rd
🚧 Files skipped from review as they are similar to previous changes (3)
- R/estimate_delay.R
- NAMESPACE
- _pkgdown.yml
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 80a53a3 is merged into main:
|
Co-authored-by: Sam Abbott <contact@samabbott.co.uk> Co-authored-by: epiforecasts-workflows[bot] <epiforecasts-workflows[bot]@users.noreply.github.com>
estimate_infectionsTable: Benchmarking results (mean time in seconds). |operation | branch| main| % change| range| trend| |:--------------------|-------:|-------:|--------:|----------:|---------:| |infections | 4.3| 4.4| -1| (-29, 38)| no change| |reports | 4.2| 4.3| -1| (-27, 36)| no change| |delays | 11| 3.5| 207| (122, 353)| slowdown| |report lp | 2.9| 2.9| 0| (-27, 39)| no change| |R0 | 1.3| 1.3| 0| (-28, 36)| no change| |update gp | 1| 1| 0| (-26, 41)| no change| |gt | 0.46| 0.46| 1| (-25, 37)| no change| |day of the week | 0.43| 0.43| 2| (-26, 41)| no change| |truncation | 0.29| 0.29| 1| (-28, 38)| no change| |param lp | 0.24| 0.25| -2| (-28, 37)| no change| |truncate | 0.21| 0.21| 2| (-23, 41)| no change| |delays lp | 0.1| 0.11| -2| (-26, 39)| no change| |rt lp | 0.098| 0.099| 0| (-29, 38)| no change| |gp lp | 0.094| 0.095| 0| (-31, 37)| no change| |generated quantities | 0.0082| 0.0078| 5| (-4, 19)| no change| |assign max | 5.4e-07| 5.8e-07| -6| (-43, 66)| no change| Main branch estimate_dist benchmark failed (likely due to interface changes). |
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 4b82fcf is merged into main:
|
Co-authored-by: github-merge-queue <118344674+github-merge-queue@users.noreply.github.com> Co-authored-by: Sam Abbott <contact@samabbott.co.uk> Co-authored-by: sbfnk <sebastian.funk@lshtm.ac.uk> Co-authored-by: epiforecasts-workflows[bot] <243357408+epiforecasts-workflows[bot]@users.noreply.github.com> Co-authored-by: Sebastian Funk - robot edition <242615673+sbfnk-bot@users.noreply.github.com>
estimate_infectionsTable: Benchmarking results (mean time in seconds). |operation | branch| main| % change| range| trend| |:--------------------|------:|-------:|--------:|----------:|---------:| |infections | 4.1| 4| 7| (-38, 94)| no change| |reports | 4| 3.9| 6| (-38, 87)| no change| |delays | 10| 3.2| 223| (89, 479)| slowdown| |report lp | 3| 2.8| 8| (-37, 93)| no change| |R0 | 1.2| 1.2| 5| (-38, 88)| no change| |update gp | 1| 0.97| 8| (-36, 87)| no change| |day of the week | 0.45| 0.44| 5| (-38, 89)| no change| |gt | 0.46| 0.43| 11| (-37, 106)| no change| |truncation | 0.29| 0.28| 8| (-38, 95)| no change| |param lp | 0.27| 0.25| 12| (-36, 101)| no change| |truncate | 0.21| 0.2| 8| (-39, 88)| no change| |delays lp | 0.13| 0.12| 16| (-32, 115)| no change| |rt lp | 0.1| 0.097| 11| (-39, 90)| no change| |gp lp | 0.088| 0.087| 4| (-42, 85)| no change| |generated quantities | 0.0073| 0.0072| 2| (-7, 9)| no change| |assign max | 6e-07| 6.2e-07| -2| (-45, 45)| no change| Main branch estimate_dist benchmark failed (likely due to interface changes). |
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 300857b is merged into main:
|
Co-authored-by: epiforecasts-workflows[bot] <epiforecasts-workflows[bot]@users.noreply.github.com>
estimate_infectionsTable: Benchmarking results (mean time in seconds). |operation | branch| main| % change| range| trend| |:--------------------|-------:|-------:|--------:|----------:|---------:| |infections | 4.4| 4.1| 8| (-21, 41)| no change| |reports | 4.3| 4.1| 6| (-21, 36)| no change| |delays | 11| 3.4| 230| (143, 322)| slowdown| |report lp | 3.3| 3| 11| (-19, 58)| no change| |R0 | 1.4| 1.3| 8| (-18, 40)| no change| |update gp | 1.2| 1| 14| (-14, 51)| no change| |gt | 0.51| 0.47| 12| (-16, 47)| no change| |day of the week | 0.47| 0.43| 11| (-17, 61)| no change| |truncation | 0.31| 0.29| 8| (-20, 39)| no change| |param lp | 0.29| 0.27| 11| (-17, 43)| no change| |truncate | 0.24| 0.21| 18| (-8, 54)| no change| |delays lp | 0.12| 0.11| 8| (-19, 44)| no change| |rt lp | 0.12| 0.11| 13| (-20, 67)| no change| |gp lp | 0.11| 0.1| 5| (-26, 39)| no change| |generated quantities | 0.0076| 0.007| 9| (-8, 23)| no change| |assign max | 5.2e-07| 4.9e-07| 10| (-46, 103)| no change| Main branch estimate_dist benchmark failed (likely due to interface changes). |
Update since last review passBringing reviewers up to speed on what changed when #1356 merged in (plus subsequent lint cleanup on this branch). Distribution constructors &
Verbose gating
Vignette / docs
Tests
Workflows / dev scripts
NEWS
Lint
CI as of This is an automated comment summarising bot-merged work; please ping @seabbs for any questions. |
estimate_infectionsTable: Benchmarking results (mean time in seconds). |operation | branch| main| % change| range| trend| |:--------------------|-------:|-------:|--------:|---------:|---------:| |infections | 4.1| 4.3| -2| (-53, 34)| no change| |reports | 4| 4.2| -1| (-52, 36)| no change| |delays | 10| 3.5| 205| (49, 321)| slowdown| |report lp | 3.1| 3.1| 3| (-50, 42)| no change| |R0 | 1.3| 1.4| -1| (-50, 40)| no change| |update gp | 1.1| 1.1| 3| (-50, 41)| no change| |gt | 0.47| 0.48| 2| (-50, 42)| no change| |day of the week | 0.44| 0.45| 1| (-50, 41)| no change| |truncation | 0.29| 0.3| -1| (-52, 37)| no change| |param lp | 0.26| 0.28| -2| (-52, 35)| no change| |truncate | 0.21| 0.22| 0| (-53, 37)| no change| |delays lp | 0.12| 0.13| -1| (-53, 40)| no change| |rt lp | 0.12| 0.12| 4| (-52, 50)| no change| |gp lp | 0.099| 0.11| -2| (-51, 55)| no change| |generated quantities | 0.007| 0.0067| 5| (0, 10)| no change| |assign max | 4.7e-07| 4.6e-07| 5| (-44, 78)| no change| Main branch estimate_dist benchmark failed (likely due to interface changes). |
…DGjxbeXVpt1eCe7U2
|
Main branch estimate_dist benchmark failed (likely due to interface changes). |
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 37ade58 is merged into main:
|
Description
This PR closes #350.
Adds
estimate_dist()for fitting delay distributions using primarycensored's interval-censored likelihood. Replaces the deprecatedestimate_delay().Changes:
estimate_dist()function supporting lognormal, gamma, and weibull distributionsinst/stan/primarycensored/for pre-compilationestimate_delay()with a warning pointing to the new functionRelated: #1268 (vendored Stan files enable future
discretised_pmfwork, not included here).Initial submission checklist
devtools::test()anddevtools::check()).devtools::document()).lintr::lint_package()).After the initial Pull Request
Summary by CodeRabbit
Release Notes
New Features
estimate_dist()function for fitting epidemiological delay distributions using MCMC-based Bayesian inference with proper handling of interval censoring and truncation.Deprecations
estimate_delay()in favour ofestimate_dist(), which provides improved support for censoring and truncation scenarios.Documentation