
⚡️ Speed up function _parse_modifiers by 41% #169

Open
codeflash-ai[bot] wants to merge 1 commit into branch-3.9 from codeflash/optimize-_parse_modifiers-mhx0ngch


@codeflash-ai codeflash-ai Bot commented Nov 13, 2025

📄 41% (0.41x) speedup for _parse_modifiers in src/bokeh/models/tools.py

⏱️ Runtime: 597 microseconds → 423 microseconds (best of 250 runs)

📝 Explanation and details

The optimized code achieves a 41% speedup by dropping the list comprehension that stripped every key up front and calling strip() inside the loop instead.

Key optimization:

  • Removed list comprehension: Changed keys = [key.strip() for key in value.split("+")] to keys = value.split("+") with key = key.strip() inside the loop
  • Memory efficiency: Avoids creating an intermediate list of stripped strings, reducing memory allocation overhead
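
The change in the two bullets above can be sketched as follows. This is a minimal illustration, not the Bokeh source: the function names, the `VALID_KEYS` tuple, and the dictionary return value are assumptions made for the example. Only the two `keys = ...` forms and the wording of the error message are taken from this PR.

```python
# Illustrative sketch only: the names, VALID_KEYS and the dict return value are
# assumptions, not the actual body of bokeh.models.tools._parse_modifiers.
VALID_KEYS = ("alt", "ctrl", "shift")


def parse_modifiers_before(value):
    # Original shape: strip every key up front, which builds a second list.
    keys = [key.strip() for key in value.split("+")]
    modifiers = {}
    for key in keys:
        if key not in VALID_KEYS:
            raise ValueError(f"can't parse '{value}' key modifiers; unknown '{key}' key")
        modifiers[key] = True
    return modifiers


def parse_modifiers_after(value):
    # Optimized shape: keep only the list from split() and strip each key
    # at the moment it is validated.
    keys = value.split("+")
    modifiers = {}
    for key in keys:
        key = key.strip()
        if key not in VALID_KEYS:
            raise ValueError(f"can't parse '{value}' key modifiers; unknown '{key}' key")
        modifiers[key] = True
    return modifiers
```

Both variants accept and reject the same inputs; the only difference is when `strip()` runs.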

Why this is faster:

  1. Reduced memory allocations: The original creates two lists (split result + comprehension result), while the optimized version only creates one
  2. Lazy processing: Keys are stripped only as they're processed, rather than all upfront
  3. Locality: Each key is stripped and validated in the same loop iteration instead of in two separate passes over the keys

Performance characteristics from tests:

  • Single modifiers: 45-60% faster (the best case among valid inputs)
  • Multiple modifiers: 26-43% faster
  • Invalid inputs: Up to 304% faster, because the loop raises on the first unknown key before the remaining keys are stripped
  • Large inputs: 13-33% faster when every key is valid; large inputs that fail on an early key gain the most (223-304% faster)
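
The gap between the invalid-input cases can be made concrete by counting `strip()` calls. The sketch below is an illustration under assumptions: the helper names and the `VALID_KEYS` tuple are hypothetical, and the helpers stop at the first unknown key where the real function would raise `ValueError`.

```python
VALID_KEYS = ("alt", "ctrl", "shift")  # assumed set of accepted modifiers


def strip_calls_eager(value):
    """Count strip() calls when every key is stripped before validation."""
    calls = 0
    keys = []
    for raw in value.split("+"):
        keys.append(raw.strip())
        calls += 1
    for key in keys:
        if key not in VALID_KEYS:
            break  # the real function would raise ValueError here
    return calls


def strip_calls_lazy(value):
    """Count strip() calls when each key is stripped just before validation."""
    calls = 0
    for raw in value.split("+"):
        key = raw.strip()
        calls += 1
        if key not in VALID_KEYS:
            break  # the real function would raise ValueError here
    return calls


all_invalid = "+".join(["foo"] * 999)
invalid_last = "+".join(["alt", "ctrl", "shift"] * 300 + ["meta"])

print(strip_calls_eager(all_invalid), strip_calls_lazy(all_invalid))    # 999 1
print(strip_calls_eager(invalid_last), strip_calls_lazy(invalid_last))  # 901 901
```

This matches the pattern in the generated tests: an input whose first key is invalid skips almost all stripping, while an input whose only invalid key comes last does the same amount of stripping in both versions and gains much less.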

The optimization is particularly effective for error cases and small inputs (the common use case for key modifier parsing), where the overhead of list comprehension is most pronounced relative to the actual work being done.
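
One way to check the small-input claim locally is a best-of-N timing with the standard `timeit` module. This is a sketch under assumptions, not Codeflash's measurement harness: `eager`, `lazy`, `best_of`, and `VALID_KEYS` are hypothetical stand-ins, and absolute numbers will differ between machines and Python versions.

```python
import timeit

VALID_KEYS = ("alt", "ctrl", "shift")  # assumed set of accepted modifiers


def eager(value):
    # Stand-in for the original shape: strip all keys, then validate.
    keys = [key.strip() for key in value.split("+")]
    for key in keys:
        if key not in VALID_KEYS:
            raise ValueError(key)
    return keys


def lazy(value):
    # Stand-in for the optimized shape: strip each key as it is validated.
    result = []
    for key in value.split("+"):
        key = key.strip()
        if key not in VALID_KEYS:
            raise ValueError(key)
        result.append(key)
    return result


def best_of(func, arg, repeat=5, number=1000):
    """Return the best per-call time in seconds over `repeat` timing rounds."""
    rounds = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(rounds) / number


if __name__ == "__main__":
    for sample in ("alt", "alt+ctrl+shift"):
        print(sample, best_of(eager, sample), best_of(lazy, sample))
```

Taking the minimum over several rounds reduces the influence of scheduling noise, which is the usual reason for reporting a "best of N runs" figure.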

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | ✅ 20 Passed |
| 🌀 Generated Regression Tests | ✅ 50 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 3 Passed |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| unit/bokeh/models/test_tools.py::test__parse_modifiers | 11.2μs | 8.90μs | 26.0% ✅ |
🌀 Generated Regression Tests and Runtime
from enum import Enum

# imports
import pytest
from bokeh.models.tools import _parse_modifiers


# Simulate bokeh.core.enums.KeyModifierType for testing
class KeyModifierType(str, Enum):
    alt = "alt"
    ctrl = "ctrl"
    shift = "shift"

# unit tests

# -------------------------
# Basic Test Cases
# -------------------------

def test_single_modifier_alt():
    # Test parsing a single valid modifier 'alt'
    codeflash_output = _parse_modifiers("alt"); result = codeflash_output # 1.37μs -> 942ns (45.3% faster)

def test_single_modifier_ctrl():
    # Test parsing a single valid modifier 'ctrl'
    codeflash_output = _parse_modifiers("ctrl"); result = codeflash_output # 1.25μs -> 857ns (46.0% faster)

def test_single_modifier_shift():
    # Test parsing a single valid modifier 'shift'
    codeflash_output = _parse_modifiers("shift"); result = codeflash_output # 1.34μs -> 900ns (49.0% faster)

def test_multiple_modifiers_ordered():
    # Test parsing multiple modifiers in order
    codeflash_output = _parse_modifiers("alt+ctrl+shift"); result = codeflash_output # 1.96μs -> 1.45μs (35.6% faster)

def test_multiple_modifiers_unordered():
    # Test parsing multiple modifiers in different order
    codeflash_output = _parse_modifiers("shift+alt+ctrl"); result = codeflash_output # 1.89μs -> 1.33μs (42.2% faster)

def test_multiple_modifiers_with_spaces():
    # Test parsing modifiers with extra spaces
    codeflash_output = _parse_modifiers(" alt + ctrl + shift "); result = codeflash_output # 2.02μs -> 1.57μs (28.6% faster)

def test_duplicate_modifiers():
    # Test parsing duplicate modifiers (should not error, should be idempotent)
    codeflash_output = _parse_modifiers("alt+alt+ctrl"); result = codeflash_output # 1.80μs -> 1.33μs (35.1% faster)

# -------------------------
# Edge Test Cases
# -------------------------

def test_empty_string():
    # Test parsing an empty string (should raise ValueError)
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("") # 1.97μs -> 1.46μs (35.3% faster)

def test_unknown_modifier():
    # Test parsing an unknown modifier (should raise ValueError)
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("meta") # 1.90μs -> 1.43μs (32.7% faster)

def test_mixed_known_and_unknown_modifiers():
    # Test parsing a mix of known and unknown modifiers (should raise ValueError)
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("ctrl+foo+shift") # 2.26μs -> 1.69μs (33.9% faster)

def test_leading_trailing_plus():
    # Test parsing with leading and trailing pluses (should raise ValueError for empty key)
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("+alt+") # 2.06μs -> 1.45μs (42.4% faster)

def test_only_plus_signs():
    # Test parsing only plus signs (should raise ValueError for empty key)
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("++") # 1.99μs -> 1.45μs (36.7% faster)

def test_spaces_only():
    # Test parsing a string with only spaces (should raise ValueError)
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("   ") # 1.91μs -> 1.46μs (31.3% faster)

def test_case_sensitivity():
    # Test that the function is case-sensitive and does not accept 'Alt' or 'CTRL'
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("Alt") # 1.88μs -> 1.40μs (34.4% faster)

    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("CTRL") # 1.14μs -> 905ns (25.5% faster)

def test_modifier_with_internal_spaces():
    # Test that 'c trl' is not accepted
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers("c trl") # 1.73μs -> 1.27μs (35.5% faster)

def test_modifier_with_tab_characters():
    # Test that tab characters are stripped and handled
    codeflash_output = _parse_modifiers("\talt\t+\tctrl\t"); result = codeflash_output # 1.85μs -> 1.33μs (38.7% faster)

def test_modifier_with_newline_characters():
    # Test that newline characters are stripped and handled
    codeflash_output = _parse_modifiers("\nalt\n+\nctrl\n"); result = codeflash_output # 1.72μs -> 1.28μs (34.0% faster)

# -------------------------
# Large Scale Test Cases
# -------------------------

def test_large_number_of_modifiers():
    # Test a large input with 333 'alt', 333 'ctrl', 334 'shift'
    modifiers_list = (["alt"] * 333) + (["ctrl"] * 333) + (["shift"] * 334)
    input_str = "+".join(modifiers_list)
    codeflash_output = _parse_modifiers(input_str); result = codeflash_output # 77.6μs -> 68.7μs (13.1% faster)

def test_large_input_with_spaces_and_duplicates():
    # Test a large input with spaces and repeated modifiers
    modifiers_list = ([" alt "] * 500) + ([" ctrl "] * 250) + ([" shift "] * 250)
    input_str = "+".join(modifiers_list)
    codeflash_output = _parse_modifiers(input_str); result = codeflash_output # 86.0μs -> 70.1μs (22.8% faster)

def test_large_input_with_invalid_modifier():
    # Test a large input where one invalid modifier is present
    modifiers_list = (["alt"] * 499) + ["badmod"] + (["ctrl"] * 250) + (["shift"] * 250)
    input_str = "+".join(modifiers_list)
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers(input_str) # 49.8μs -> 33.0μs (51.0% faster)

def test_large_input_all_invalid():
    # Test a large input with all invalid modifiers
    modifiers_list = ["foo"] * 999
    input_str = "+".join(modifiers_list)
    with pytest.raises(ValueError) as excinfo:
        _parse_modifiers(input_str) # 34.1μs -> 10.5μs (223% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from enum import Enum

# imports
import pytest
from bokeh.models.tools import _parse_modifiers


# Simulate bokeh.core.enums.KeyModifierType for testing
class KeyModifierType(str, Enum):
    alt = "alt"
    ctrl = "ctrl"
    shift = "shift"

# unit tests

# ---------------------------
# Basic Test Cases
# ---------------------------

def test_single_modifier_alt():
    # Test parsing a single valid modifier 'alt'
    codeflash_output = _parse_modifiers("alt") # 1.38μs -> 861ns (60.2% faster)

def test_single_modifier_ctrl():
    # Test parsing a single valid modifier 'ctrl'
    codeflash_output = _parse_modifiers("ctrl") # 1.32μs -> 901ns (46.6% faster)

def test_single_modifier_shift():
    # Test parsing a single valid modifier 'shift'
    codeflash_output = _parse_modifiers("shift") # 1.35μs -> 927ns (45.8% faster)

def test_multiple_modifiers_order1():
    # Test parsing multiple valid modifiers in order
    codeflash_output = _parse_modifiers("alt+ctrl") # 1.63μs -> 1.21μs (34.7% faster)

def test_multiple_modifiers_order2():
    # Test parsing multiple valid modifiers in different order
    codeflash_output = _parse_modifiers("ctrl+alt") # 1.70μs -> 1.19μs (42.6% faster)

def test_all_modifiers():
    # Test parsing all valid modifiers
    codeflash_output = _parse_modifiers("alt+ctrl+shift") # 1.92μs -> 1.42μs (35.0% faster)

def test_all_modifiers_different_order():
    # Test parsing all valid modifiers in a different order
    codeflash_output = _parse_modifiers("shift+alt+ctrl") # 1.83μs -> 1.37μs (33.2% faster)

def test_modifiers_with_spaces():
    # Test parsing modifiers with spaces around plus signs
    codeflash_output = _parse_modifiers(" alt + ctrl + shift ") # 2.07μs -> 1.63μs (26.6% faster)

def test_duplicate_modifiers():
    # Test parsing duplicate modifiers; should only set key once
    codeflash_output = _parse_modifiers("alt+alt+ctrl") # 1.86μs -> 1.36μs (36.7% faster)

# ---------------------------
# Edge Test Cases
# ---------------------------

def test_empty_string():
    # Test empty string input, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("") # 1.97μs -> 1.50μs (31.4% faster)

def test_unknown_modifier():
    # Test input with an unknown modifier, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("meta") # 1.90μs -> 1.47μs (29.4% faster)

def test_known_and_unknown_modifier():
    # Test input mixing known and unknown modifiers, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("alt+foo") # 2.18μs -> 1.69μs (28.7% faster)

def test_plus_only():
    # Test input with only plus signs, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("+") # 1.96μs -> 1.45μs (34.9% faster)

def test_trailing_plus():
    # Test input with trailing plus, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("alt+") # 2.15μs -> 1.68μs (27.9% faster)

def test_leading_plus():
    # Test input with leading plus, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("+alt") # 1.91μs -> 1.38μs (38.4% faster)

def test_multiple_consecutive_pluses():
    # Test input with multiple consecutive pluses, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("alt++ctrl") # 2.23μs -> 1.65μs (35.2% faster)

def test_case_sensitivity():
    # Test case sensitivity: should not accept 'Alt', only 'alt'
    with pytest.raises(ValueError):
        _parse_modifiers("Alt") # 1.84μs -> 1.42μs (29.4% faster)

def test_spaces_only():
    # Test input with only spaces, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("   ") # 1.91μs -> 1.45μs (31.3% faster)

def test_extra_spaces_between_modifiers():
    # Test input with extra spaces between modifiers and pluses
    codeflash_output = _parse_modifiers("  alt  +   ctrl  +  shift  ") # 2.23μs -> 1.65μs (35.3% faster)

def test_modifier_with_internal_spaces():
    # Test input with internal spaces in a modifier, should raise ValueError
    with pytest.raises(ValueError):
        _parse_modifiers("al t") # 1.85μs -> 1.43μs (30.1% faster)

def test_modifier_with_tab_character():
    # Test input with tab character between modifiers
    codeflash_output = _parse_modifiers("alt\t+\tctrl") # 1.84μs -> 1.33μs (38.4% faster)

def test_modifier_with_newline_character():
    # Test input with newline character between modifiers
    codeflash_output = _parse_modifiers("alt\n+\nctrl") # 1.80μs -> 1.28μs (41.2% faster)

def test_modifier_with_mixed_whitespace():
    # Test input with mixed whitespace between modifiers
    codeflash_output = _parse_modifiers(" alt \t +  ctrl\n + shift ") # 2.15μs -> 1.59μs (35.8% faster)

# ---------------------------
# Large Scale Test Cases
# ---------------------------

def test_large_number_of_modifiers_duplicates():
    # Test with a large number of duplicate valid modifiers (should not error, keys should be unique)
    input_str = "+".join(["alt", "ctrl", "shift"] * 100)
    codeflash_output = _parse_modifiers(input_str); result = codeflash_output # 26.2μs -> 22.6μs (15.9% faster)

def test_large_number_of_valid_and_invalid_modifiers():
    # Test with a large number of valid modifiers and one invalid at the end (should raise ValueError)
    input_str = "+".join(["alt", "ctrl", "shift"] * 300 + ["meta"])
    with pytest.raises(ValueError):
        _parse_modifiers(input_str) # 67.2μs -> 58.4μs (15.1% faster)

def test_large_input_with_spaces():
    # Test with a large input string with many spaces and valid modifiers
    input_str = " + ".join(["alt"] * 333 + ["ctrl"] * 333 + ["shift"] * 333)
    codeflash_output = _parse_modifiers(input_str); result = codeflash_output # 98.5μs -> 73.9μs (33.4% faster)

def test_large_input_all_invalid():
    # Test with a large input string of only invalid modifiers (should raise ValueError)
    input_str = "+".join(["foo"] * 999)
    with pytest.raises(ValueError):
        _parse_modifiers(input_str) # 34.9μs -> 10.2μs (241% faster)

def test_large_input_with_some_empty_modifiers():
    # Test with a large input string with empty modifiers between pluses (should raise ValueError)
    input_str = "alt+" + "+".join([""] * 998) + "+ctrl"
    with pytest.raises(ValueError):
        _parse_modifiers(input_str) # 31.2μs -> 7.72μs (304% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from bokeh.models.tools import _parse_modifiers
import pytest

def test__parse_modifiers():
    with pytest.raises(ValueError, match="can't\\ parse\\ 'alt\\+'\\ key\\ modifiers;\\ unknown\\ ''\\ key"):
        _parse_modifiers('alt+')

def test__parse_modifiers_2():
    _parse_modifiers('ctrl')

def test__parse_modifiers_3():
    _parse_modifiers('shift')
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_sstvtaha/tmpzf35gj5u/test_concolic_coverage.py::test__parse_modifiers | 2.32μs | 1.81μs | 28.3% ✅ |
| codeflash_concolic_sstvtaha/tmpzf35gj5u/test_concolic_coverage.py::test__parse_modifiers_2 | 1.41μs | 922ns | 53.0% ✅ |
| codeflash_concolic_sstvtaha/tmpzf35gj5u/test_concolic_coverage.py::test__parse_modifiers_3 | 1.38μs | 904ns | 52.7% ✅ |

To edit these changes, run `git checkout codeflash/optimize-_parse_modifiers-mhx0ngch` and push.

@codeflash-ai codeflash-ai Bot requested a review from mashraf-222 November 13, 2025 05:56
@codeflash-ai codeflash-ai Bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Nov 13, 2025