Partition break-finding by blue_line_key for within-WSG parallelism #65

@NewGraphEnvironment

Problem

Within a single WSG, frs_break_find (gradient mode) calls fwa_slopealonginterval once per blue_line_key. These calls are independent, with no cross-stream dependencies, but they currently run as one large SQL query.

Proposed Solution

Partition the break-finding by blue_line_key (BLK) ranges and run the partitions in parallel. Each worker computes breaks for its own subset of BLKs and appends the results to the shared breaks table.

Two approaches

  1. PostgreSQL-side: tune max_parallel_workers_per_gather so PG parallelizes the scan internally. PG may already be doing this, so profile with EXPLAIN ANALYZE first.

  2. R-side: split blue_line_keys into N groups, run frs_break_find on each group in parallel via furrr. Requires a partition parameter on frs_break_find or a wrapper.
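
A minimal sketch of the R-side approach, to make the partitioning step concrete. `partition_blks` and `run_partitioned` are hypothetical helpers, not existing package functions, and `find_breaks` is a placeholder for whatever call ends up running frs_break_find on a subset of blue_line_keys (the partition parameter or wrapper mentioned above). The key values are made up for illustration.

```r
# Hypothetical helper: split blue_line_keys into at most n contiguous,
# roughly equal-sized groups (one group per worker).
partition_blks <- function(blks, n) {
  stopifnot(n >= 1)
  blks <- sort(unique(blks))
  # Label each key with a group id in 1..n; sorting the ids keeps each
  # group a contiguous range of the sorted keys.
  group_id <- sort(rep_len(seq_len(n), length(blks)))
  unname(split(blks, group_id))
}

# Hypothetical wrapper: run a break-finding function on each group in parallel.
# `find_breaks` is an assumption: a function that takes a vector of
# blue_line_keys, opens its own database connection, and runs the
# gradient break-finding for just those keys.
run_partitioned <- function(blks, n_workers, find_breaks) {
  groups <- partition_blks(blks, n_workers)
  furrr::future_map(groups, find_breaks)
}

# Example: 9 distinct (made-up) keys split across 3 workers
groups <- partition_blks(c(5, 3, 3, 1, 9, 7, 2, 8, 4, 6), 3)
lengths(groups)
```

To actually run in parallel, a plan would need to be set first, for example `future::plan(future::multisession, workers = 3)`. Database connections generally cannot be shared across worker processes, so this sketch assumes each `find_breaks` call opens and closes its own connection.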

Risk

Low. Blue line keys are independent, with no shared state during computation. The append to the breaks table is sequential (INSERT) but fast.

Priority

Lower than WSG-level Phase 1 parallelism (#64), which is simpler and has a bigger impact for multi-WSG runs. This optimization matters most for single large WSGs.

Depends on #63 (local Docker) for profiling PG parallelism settings.
