Skip to content

Fix errors in edge cases for wang method#19

Open
wenh06 wants to merge 6 commits intomasterfrom
fix/fix-errors-in-edge-cases-for-wang-method
Open

Fix errors in edge cases for wang method#19
wenh06 wants to merge 6 commits intomasterfrom
fix/fix-errors-in-edge-cases-for-wang-method

Conversation

@wenh06
Copy link
Contributor

@wenh06 wenh06 commented Jan 22, 2026

Fix errors in edge cases (mainly when n_total equals ref_total) in the computation of the difference of two proportions confidence intervals using wang method.

pre-commit-ci bot and others added 4 commits December 1, 2025 16:33
updates:
- https://github.com/psf/blackhttps://github.com/psf/black-pre-commit-mirror
- [github.com/psf/black-pre-commit-mirror: 25.1.0 → 25.11.0](psf/black-pre-commit-mirror@25.1.0...25.11.0)
- [github.com/pycqa/isort: 6.0.1 → 7.0.0](PyCQA/isort@6.0.1...7.0.0)
…e computation of difference of two proportions confidence intervals using wang method
Copilot AI review requested due to automatic review settings January 22, 2026 15:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes edge case errors in the Wang method for computing confidence intervals of the difference between two proportions, particularly when n_total equals ref_total. The fix includes corrections to the step size calculation in grid refinement, changes to plateau detection logic, and updates to tests to accommodate minor numerical differences.

Changes:

  • Fixed step size calculation in _prob2step and _prob2steplmin functions to use correct denominator (grid_one - 1) instead of grid_one
  • Simplified plateau detection from tolerance-based to exact equality comparison
  • Updated tests with a two-tier tolerance system (strict 1e-4, loose 1e-2) to handle edge cases

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 11 comments.

File Description
diff_binom_confint/_specials/_wang.py Core implementation fixes: step size calculation, plateau detection, verbose parameter threading, stable sorting, and code cleanup
test/test_specials.py Added fallback test logic with dual tolerance levels, warning messages for accuracy tracking, and imports for warnings module
CHANGELOG.rst Documented the bug fix for edge cases in wang method
.devcontainer/devcontainer.json Added post-create command for dev environment setup

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +504 to +507
# mansum = sumofprob.min()
# atol = 1e-14 * (abs(mansum) if mansum != 0 else 1.0)
# plateau_idx = np.where(np.isclose(sumofprob, mansum, rtol=0.0, atol=atol))[0]
plateau_idx = np.where(sumofprob == sumofprob.min())[0]
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using exact equality (==) for floating-point comparison could miss values that are numerically very close to the minimum due to floating-point precision issues. The commented-out code used np.isclose with a small tolerance to handle this. Consider whether exact equality is appropriate here, or if the tolerance-based comparison should be retained to handle numerical precision edge cases robustly.

Suggested change
# mansum = sumofprob.min()
# atol = 1e-14 * (abs(mansum) if mansum != 0 else 1.0)
# plateau_idx = np.where(np.isclose(sumofprob, mansum, rtol=0.0, atol=atol))[0]
plateau_idx = np.where(sumofprob == sumofprob.min())[0]
mansum = sumofprob.min()
atol = 1e-14 * (abs(mansum) if mansum != 0 else 1.0)
plateau_idx = np.where(np.isclose(sumofprob, mansum, rtol=0.0, atol=atol))[0]

Copilot uses AI. Check for mistakes.
kk1 = kk

output = [val.item() if isinstance(val, np.generic) else val for val in output]
# output = [val.item() if isinstance(val, np.generic) else val for val in output]
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented-out code should be removed rather than left in the codebase. This appears to be the old implementation that was replaced. Leaving commented code in production reduces readability.

Suggested change
# output = [val.item() if isinstance(val, np.generic) else val for val in output]

Copilot uses AI. Check for mistakes.
if not np.isclose((r_lb, r_ub), (lb, ub), atol=ERR_LIMIT_STRICT).all():
warnings.warn(
f"Strict test failed for {n_positive = }, {n_total = }, {ref_positive = }, {ref_total = },"
f"R result: {r_lb, r_ub}, Python result: {lb, ub}. falling back to loose test.",
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing space after comma in the warning message. The message should read "...{ref_total = }, R result:" instead of "...{ref_total = },R result:" for better readability.

Suggested change
f"R result: {r_lb, r_ub}, Python result: {lb, ub}. falling back to loose test.",
f" R result: {r_lb, r_ub}, Python result: {lb, ub}. falling back to loose test.",

Copilot uses AI. Check for mistakes.
Comment on lines 458 to 459
f"Strict test failed for one-sided lower {n_positive = }, {n_total = }, {ref_positive = }, {ref_total = },"
f"R result: {r_lb, r_ub}, Python result: {lb, ub}. falling back to loose test.",
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing space after comma in the warning message. The message should read "...{ref_total = }, R result:" instead of "...{ref_total = },R result:" for better readability.

Copilot uses AI. Check for mistakes.
Comment on lines 513 to 515
lowerb = max(p0[0], p0[rightmost] - stepv) + delta
upperb = min(p0[-1], p0[leftmost] + stepv) - delta

Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a potential issue where lowerb could be greater than upperb after the calculations on lines 513-514. This happens because both lowerb and upperb are computed independently with + delta and - delta adjustments. When p0[rightmost] - stepv + delta > p0[leftmost] + stepv - delta, the second call to np.linspace(lowerb, upperb, grid_two) on line 516 would create a reversed array or fail if bounds checking is strict. Consider adding a check to ensure lowerb <= upperb, or swap them if necessary.

Suggested change
lowerb = max(p0[0], p0[rightmost] - stepv) + delta
upperb = min(p0[-1], p0[leftmost] + stepv) - delta
raw_lowerb = max(p0[0], p0[rightmost] - stepv) + delta
raw_upperb = min(p0[-1], p0[leftmost] + stepv) - delta
if raw_lowerb <= raw_upperb:
lowerb, upperb = raw_lowerb, raw_upperb
else:
# Ensure bounds are ordered for linspace; swap if necessary
lowerb, upperb = raw_upperb, raw_lowerb

Copilot uses AI. Check for mistakes.
Comment on lines +414 to +415
ERR_LIMIT_STRICT = 1e-4
ERR_LIMIT_LOOSE = 1e-2
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback to a loose tolerance of 1e-2 (ERR_LIMIT_LOOSE) represents a 100-fold increase from the strict tolerance of 1e-4 (ERR_LIMIT_STRICT). This could mask significant accuracy regressions. Consider whether this tolerance is appropriate, or if a smaller intermediate tolerance (e.g., 1e-3) would be more suitable. Additionally, consider whether tests that fall back to the loose tolerance should be logged or tracked separately to monitor accuracy degradation.

Copilot uses AI. Check for mistakes.
Comment on lines +473 to +475
# mansum = sumofprob.max()
# atol = 1e-14 * (mansum if mansum > 0 else 1.0)
# plateau_idx = np.where(np.isclose(sumofprob, mansum, rtol=0.0, atol=atol))[0]
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented-out code should be removed rather than left in the codebase. These lines appear to be the old implementation that was replaced with the simpler plateau detection logic on line 476. Leaving commented code in production reduces readability and can cause confusion.

Suggested change
# mansum = sumofprob.max()
# atol = 1e-14 * (mansum if mansum > 0 else 1.0)
# plateau_idx = np.where(np.isclose(sumofprob, mansum, rtol=0.0, atol=atol))[0]

Copilot uses AI. Check for mistakes.
Comment on lines +504 to +506
# mansum = sumofprob.min()
# atol = 1e-14 * (abs(mansum) if mansum != 0 else 1.0)
# plateau_idx = np.where(np.isclose(sumofprob, mansum, rtol=0.0, atol=atol))[0]
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commented-out code should be removed rather than left in the codebase. These lines appear to be the old implementation that was replaced with the simpler plateau detection logic on line 507. Leaving commented code in production reduces readability and can cause confusion.

Suggested change
# mansum = sumofprob.min()
# atol = 1e-14 * (abs(mansum) if mansum != 0 else 1.0)
# plateau_idx = np.where(np.isclose(sumofprob, mansum, rtol=0.0, atol=atol))[0]

Copilot uses AI. Check for mistakes.
Comment on lines +475 to +476
f"Strict test failed for one-sided upper {n_positive = }, {n_total = }, {ref_positive = }, {ref_total = },"
f"R result: {r_lb, r_ub}, Python result: {lb, ub}. falling back to loose test.",
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing space after comma in the warning message. The message should read "...{ref_total = }, R result:" instead of "...{ref_total = },R result:" for better readability.

Copilot uses AI. Check for mistakes.
denom = (grid_one - 1) if grid_one > 1 else 1
stepv = (p0[-1] - p0[0]) / denom
lowerb = max(p0[0], p0[rightmost] - stepv) + delta
upperb = min(p0[-1], p0[leftmost] + stepv) - delta
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a potential issue where lowerb could be greater than upperb after the calculations on lines 482-483. This happens because both lowerb and upperb are computed independently with + delta and - delta adjustments. When p0[rightmost] - stepv + delta > p0[leftmost] + stepv - delta, the second call to np.linspace(lowerb, upperb, grid_two) on line 485 would create a reversed array or fail if bounds checking is strict. Consider adding a check to ensure lowerb <= upperb, or swap them if necessary.

Suggested change
upperb = min(p0[-1], p0[leftmost] + stepv) - delta
upperb = min(p0[-1], p0[leftmost] + stepv) - delta
if lowerb > upperb:
lowerb, upperb = upperb, lowerb

Copilot uses AI. Check for mistakes.
@DeepPSP DeepPSP deleted a comment from Copilot AI Jan 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant