## Problem
Within a single WSG, `frs_break_find` (gradient mode) calls `fwa_slopealonginterval` per `blue_line_key`. These are independent computations with no cross-stream dependencies, but they currently run as one big SQL query.
## Proposed Solution
Partition the break-finding by `blue_line_key` ranges and run the partitions in parallel. Each worker computes breaks for its own subset of BLKs and appends them to the shared breaks table.
Two approaches:

- **PostgreSQL-side:** tune `max_parallel_workers_per_gather` so Postgres parallelizes the scan internally. This may already be happening; profile with `EXPLAIN ANALYZE` first.
- **R-side:** split the `blue_line_key`s into N groups and run `frs_break_find` on each group in parallel via furrr. Requires a partition parameter on `frs_break_find`, or a wrapper.
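The R-side approach could be sketched roughly as below. This assumes `frs_break_find` (or a thin wrapper around it) gains a parameter that restricts it to a subset of keys; the parameter name `blue_line_keys` and the connection-argument shape are hypothetical, not the current signature.

```r
# Sketch only: `blue_line_keys = ` is a hypothetical partition parameter,
# not part of the current frs_break_find() signature.
library(furrr)

n_workers <- 4
plan(multisession, workers = n_workers)

# Split the full set of BLKs for the WSG into roughly equal groups,
# one group per worker.
blk_groups <- split(
  blue_line_keys,
  cut(seq_along(blue_line_keys), n_workers, labels = FALSE)
)

# Each worker computes breaks for its subset; results are row-bound.
# The append to the shared breaks table would then be a single
# sequential INSERT, matching the current behaviour.
breaks <- future_map_dfr(
  blk_groups,
  function(blks) frs_break_find(blue_line_keys = blks)
)
```

Because the computations share no state, ordinary `future_map_dfr` chunking should be sufficient; no locking is needed until the final write.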
## Risk
Low: blue line keys are independent, with no shared state during computation. The final append to the breaks table is a sequential INSERT, but it is fast.
## Priority
Lower than WSG-level Phase 1 parallelism (#64), which is simpler and has a bigger impact for multi-WSG runs. This optimization matters most for single large WSGs.
Depends on #63 (local Docker) for profiling PG parallelism settings.
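For the PostgreSQL-side profiling, a minimal starting point might look like the following. It assumes a DBI connection to the local Docker Postgres from #63; the host/dbname values are placeholders, and the `SELECT 1` stands in for the actual slope query issued by `frs_break_find`, which should be substituted in.

```r
# Profiling sketch: connection details are placeholders for the
# local Docker instance from #63.
library(DBI)
library(RPostgres)

con <- dbConnect(RPostgres::Postgres(), host = "localhost", dbname = "postgres")

# Is Postgres allowed to parallelize at all?
dbGetQuery(con, "SHOW max_parallel_workers_per_gather")

# Substitute the real gradient query here, then look for
# "Gather" / "Parallel Seq Scan" nodes in the plan output.
plan_rows <- dbGetQuery(con, "EXPLAIN ANALYZE SELECT 1")
cat(plan_rows[[1]], sep = "\n")

dbDisconnect(con)
```

If the plan already shows `Gather` nodes saturating the available workers, the R-side partitioning in the other approach is unlikely to add much.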