When parallelizing the bootstrap with futurize() by doing boot::boot(...) |> futurize(), how should I properly set the seed for reproducibility?
My past experience with the futureverse is with future.apply::future_lapply(), which has a future.seed argument that controls reproducibility. I'm not sure what the best practice is for setting the seed with futurize.
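For comparison, here is a minimal sketch of the future.apply pattern I have in mind. The helper `draw()` and the choice of two workers are only for illustration; the point is that passing the same integer to future.seed should give identical parallel draws across calls.

```r
library(future)
library(future.apply)

plan(multisession, workers = 2)

# Illustrative helper: draws one random number per element
draw <- function(i) rnorm(1)

# Passing the same integer to future.seed seeds the parallel RNG streams
# identically in both calls
a <- future_lapply(1:4, draw, future.seed = 123)
b <- future_lapply(1:4, draw, future.seed = 123)

# Compare the two sets of draws
identical(a, b)

# Shut down the background workers
plan(sequential)
```

This is the behaviour I'd like to reproduce with boot::boot(...) |> futurize().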
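Neither of the two approaches I've tried below gives reproducible results; in each case, running the same call twice yields different bias and standard error estimates: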
futurize_options(seed = 123)
suppressPackageStartupMessages({
  library(future)
  library(futurize)
  library(boot)
})
future::plan(future::multisession, workers = 5)
futurize_options(seed = 123)
#> $seed
#> [1] 123
#>
#> $globals
#> [1] TRUE
#>
#> $packages
#> NULL
#>
#> $stdout
#> [1] TRUE
#>
#> $conditions
#> [1] "condition"
#>
#> $scheduling
#> [1] 1
#>
#> $chunk_size
#> NULL
#>
#> attr(,"specified")
#> [1] "seed"
ratio <- function(d, w) sum(d$x * w) / sum(d$u * w)
boot(city, ratio, R = 100, stype = "w") |> futurize()
#>
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#>
#>
#> Call:
#> boot(data = city, statistic = ratio, R = 100, stype = "w", parallel = "snow",
#> ncpus = 2, cl = cl)
#>
#>
#> Bootstrap Statistics :
#> original bias std. error
#> t1* 1.520312 0.02174888 0.2072925
boot(city, ratio, R = 100, stype = "w") |> futurize()
#>
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#>
#>
#> Call:
#> boot(data = city, statistic = ratio, R = 100, stype = "w", parallel = "snow",
#> ncpus = 2, cl = cl)
#>
#>
#> Bootstrap Statistics :
#> original bias std. error
#> t1* 1.520312 0.05856933 0.2216266
futurize(seed = 123)
suppressPackageStartupMessages({
  library(future)
  library(futurize)
  library(boot)
})
future::plan(future::multisession, workers = 5)
ratio <- function(d, w) sum(d$x * w) / sum(d$u * w)
boot(city, ratio, R = 100, stype = "w") |> futurize(seed = 123)
#>
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#>
#>
#> Call:
#> boot(data = city, statistic = ratio, R = 100, stype = "w", parallel = "snow",
#> ncpus = 2, cl = cl)
#>
#>
#> Bootstrap Statistics :
#> original bias std. error
#> t1* 1.520312 -0.002385279 0.1700565
boot(city, ratio, R = 100, stype = "w") |> futurize(seed = 123)
#>
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#>
#>
#> Call:
#> boot(data = city, statistic = ratio, R = 100, stype = "w", parallel = "snow",
#> ncpus = 2, cl = cl)
#>
#>
#> Bootstrap Statistics :
#> original bias std. error
#> t1* 1.520312 0.03643624 0.1942623