⚡️ Speed up function _sphinx_type by 1,262% #151

Open
codeflash-ai[bot] wants to merge 1 commit into branch-3.9 from codeflash/optimize-_sphinx_type-mhwsoksy

codeflash-ai bot commented Nov 13, 2025

📄 1,262% (12.62x) speedup for _sphinx_type in src/bokeh/core/property/enum.py

⏱️ Runtime: 22.3 milliseconds → 1.64 milliseconds (best of 112 runs)
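
For reference, the headline percentage can be reproduced approximately from the two runtimes. The sketch below assumes the speedup is defined as (original − optimized) / optimized, which is consistent with the figures shown here but is not stated in this PR; because the displayed runtimes are rounded, the result only approximates 1,262%.

```python
# Runtimes reported above, in milliseconds (rounded as displayed).
original_ms = 22.3
optimized_ms = 1.64

# Assumed definition: time saved relative to the optimized runtime.
speedup = (original_ms - optimized_ms) / optimized_ms

# Plain ratio of the two runtimes, for comparison; always speedup + 1.
ratio = original_ms / optimized_ms

print(f"speedup: {speedup:.2f}x ({speedup:.0%} faster)")
print(f"runtime ratio: {ratio:.2f}x")
```

Under this definition, "N% faster" corresponds to a runtime ratio of N/100 + 1, so the two numbers should not be read interchangeably.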

📝 Explanation and details

The optimization replaces an expensive O(n) lookup pattern with O(1) dictionary lookups by precomputing a reverse mapping.

Key optimization: The original code uses obj._enum in enums.__dict__.values(), which scans every value in the enums module namespace on each call, followed by a second O(n) loop to find the matching name. Each call is therefore O(n) in the number of names in the module, and m calls cost O(m·n) in total.
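
A minimal sketch of the lookup pattern described above. The `enums` namespace, the `Anchor` and `Align` classes, and the function name `find_enum_name_linear` are hypothetical stand-ins for illustration, not the actual Bokeh source.

```python
from types import SimpleNamespace


# Hypothetical stand-ins for objects defined in an enums module.
class Anchor: ...
class Align: ...


# Hypothetical namespace playing the role of the enums module.
enums = SimpleNamespace(Anchor=Anchor, Align=Align)


def find_enum_name_linear(enum_obj):
    """Return the name bound to enum_obj in `enums`, or None if absent."""
    # First pass: `in` on a values view walks the entries one by one,
    # comparing by identity and then by equality.
    if enum_obj in enums.__dict__.values():
        # Second pass: walk the items again to recover the name.
        for name, value in enums.__dict__.items():
            if enum_obj is value:
                return name
    return None


print(find_enum_name_linear(Align))
print(find_enum_name_linear(int))
```

Both passes grow linearly with the number of names in the namespace, which is the cost the PR targets.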

What changed:

  • Precomputed reverse mapping: _enum_id_to_name = {id(value): name for name, value in enums.__dict__.items()} creates a one-time mapping from object IDs to names
  • Fast set lookup: _enum_ids = set(_enum_id_to_name.keys()) enables O(1) membership testing
  • Replaced expensive operations: obj._enum in enums.__dict__.values() becomes id(obj._enum) in _enum_ids (O(1) instead of O(n))
  • Direct name retrieval: _enum_id_to_name[enum_id] replaces the O(n) loop that searched for the matching name
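
The changes listed above can be sketched as follows. Only `_enum_id_to_name` and `_enum_ids` come from the PR description; the `enums` namespace, the example classes, and `find_enum_name_fast` are hypothetical, and the real diff may be structured differently.

```python
from types import SimpleNamespace


# Hypothetical stand-ins for objects defined in an enums module.
class Anchor: ...
class Align: ...


enums = SimpleNamespace(Anchor=Anchor, Align=Align)

# Built once, up front: maps id(object) -> attribute name.
_enum_id_to_name = {id(value): name for name, value in enums.__dict__.items()}
# Set of known ids, as in the PR description, for membership tests.
_enum_ids = set(_enum_id_to_name.keys())


def find_enum_name_fast(enum_obj):
    """Return the name bound to enum_obj in `enums`, or None if absent."""
    enum_id = id(enum_obj)
    # Hash-based membership test instead of scanning all values.
    if enum_id in _enum_ids:
        # Direct retrieval instead of a second scan.
        return _enum_id_to_name[enum_id]
    return None


print(find_enum_name_fast(Align))
print(find_enum_name_fast(int))
```

Two properties of this design are worth keeping in mind when reviewing. The mapping is a snapshot, so names added to the namespace after it is built are not found. The id-based test also matches by identity only, whereas `in` on a values view falls back to equality. A dict membership test is already hash-based, so the separate set is a convenience rather than a requirement.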

Why it's faster: The line profiler shows the original if obj._enum in enums.__dict__.values() check took 102ms (91.8% of total time). The optimized version reduces this to 0.77ms (7.1% of total time), roughly a 130x speedup on that single line.
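
To get a feel for the difference on a given machine, a micro-benchmark along these lines can be used. It is a sketch under stated assumptions: the namespace of 200 placeholder objects and the function names are hypothetical, and absolute timings will vary, so no expected output is shown.

```python
import timeit
from types import SimpleNamespace

# Hypothetical namespace holding 200 distinct placeholder objects.
enums = SimpleNamespace(**{f"Enum{i}": object() for i in range(200)})

# The last entry inserted is the worst case for a linear scan.
target = enums.Enum199

# Precomputed reverse mapping: id(object) -> attribute name.
id_to_name = {id(value): name for name, value in enums.__dict__.items()}


def linear():
    # Scans the values view until a match is found.
    return target in enums.__dict__.values()


def hashed():
    # Single hash lookup on the precomputed mapping.
    return id(target) in id_to_name


t_linear = timeit.timeit(linear, number=2_000)
t_hashed = timeit.timeit(hashed, number=2_000)
print(f"linear scan: {t_linear:.4f}s, hashed lookup: {t_hashed:.4f}s")
```

The gap should widen as the namespace grows, since only the linear scan depends on its size.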

Performance characteristics: The optimization is most effective for:

  • Repeated calls with known enums (1300%+ speedups in bulk tests)
  • Mixed workloads with both known/unknown enums (1300%+ speedups)
  • Any scenario where the same enum types are processed multiple times

The 1,262% overall speedup suggests this may be particularly valuable for documentation generation or validation workflows that process many enum properties.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 3515 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import pytest
from bokeh.core.property.enum import _sphinx_type

# function to test (minimal, self-contained version for testing)
# We'll define minimal stubs for Enum, enums, model_link, property_link, and register_type_link.

# --- Minimal stubs and helpers ---

# Simulate the enums module with some enum classes
class DummyEnumA:
    __module__ = "bokeh.core.enums"
    def __str__(self): return "DummyEnumA"

class DummyEnumB:
    __module__ = "bokeh.core.enums"
    def __str__(self): return "DummyEnumB"

class NotInEnums:
    __module__ = "other.module"
    def __str__(self): return "NotInEnums"

# Simulate bokeh.core.enums
class enums:
    DummyA = DummyEnumA
    DummyB = DummyEnumB

# Simulate register_type_link decorator (no-op)
def register_type_link(cls):
    def decorator(fn):
        return fn
    return decorator

# Minimal Enum class for test
class Enum:
    def __init__(self, enum_type):
        self._enum = enum_type

# --- Unit Tests ---

# 1. Basic Test Cases

def test_property_link_format():
    """Test that property_link is formatted correctly."""
    obj = Enum(DummyEnumA)
    codeflash_output = _sphinx_type(obj); result = codeflash_output # 13.1μs -> 2.70μs (385% faster)

def test_model_link_format():
    """Test that model_link is formatted correctly for known enums."""
    obj = Enum(DummyEnumA)
    codeflash_output = _sphinx_type(obj); result = codeflash_output # 10.0μs -> 1.84μs (442% faster)

# 2. Edge Test Cases

def test_model_link_performance():
    """Test that model_link is called efficiently for many calls."""
    class DummyEnum:
        __module__ = "bokeh.core.enums"
        def __str__(self): return "DummyEnum"
    enums.DummyEnum = DummyEnum
    for _ in range(500):
        obj = Enum(DummyEnum)
        codeflash_output = _sphinx_type(obj); result = codeflash_output # 3.12ms -> 218μs (1327% faster)
    del enums.DummyEnum
from typing import Any

# imports
import pytest
from bokeh.core.property.enum import _sphinx_type


# Simulate bokeh.core.enums with a few example enums
class ColorEnum:
    __module__ = "bokeh.core.enums"
    def __str__(self): return "ColorEnum"

class SizeEnum:
    __module__ = "bokeh.core.enums"
    def __str__(self): return "SizeEnum"

# Simulate bokeh.core.enums dictionary
class enums:
    Color = ColorEnum
    Size = SizeEnum

# Simulate Enum property class
class Enum:
    def __init__(self, enum_type):
        self._enum = enum_type

# Simulate register_type_link decorator (no-op for testing)
def register_type_link(cls):
    def decorator(fn):
        return fn
    return decorator

# ------------------------
# Unit Tests for _sphinx_type
# ------------------------

# BASIC TEST CASES

def test_basic_known_enum_color():
    # Test with a known enum (ColorEnum) in enums
    enum_obj = Enum(ColorEnum)
    expected = r":class:`~bokeh.core.properties.Enum`\ (:class:`~bokeh.core.enums.Color`\ )"
    codeflash_output = _sphinx_type(enum_obj) # 13.5μs -> 2.54μs (433% faster)

def test_basic_known_enum_size():
    # Test with a known enum (SizeEnum) in enums
    enum_obj = Enum(SizeEnum)
    expected = r":class:`~bokeh.core.properties.Enum`\ (:class:`~bokeh.core.enums.Size`\ )"
    codeflash_output = _sphinx_type(enum_obj) # 10.6μs -> 1.99μs (431% faster)

def test_basic_unknown_enum():
    # Test with an unknown enum (not in enums)
    class FakeEnum:
        __module__ = "my.module"
        def __str__(self): return "FakeEnum"
    enum_obj = Enum(FakeEnum)
    expected = r":class:`~bokeh.core.properties.Enum`\ (FakeEnum)"
    codeflash_output = _sphinx_type(enum_obj) # 10.0μs -> 1.86μs (440% faster)

def test_basic_enum_str_representation():
    # Test with an enum that has a custom str representation
    class CustomEnum:
        __module__ = "custom.module"
        def __str__(self): return "CustomEnumString"
    enum_obj = Enum(CustomEnum)
    expected = r":class:`~bokeh.core.properties.Enum`\ (CustomEnumString)"
    codeflash_output = _sphinx_type(enum_obj) # 9.93μs -> 1.83μs (442% faster)

# EDGE TEST CASES

def test_edge_enum_is_none():
    # Test with None as enum type
    enum_obj = Enum(None)
    expected = r":class:`~bokeh.core.properties.Enum`\ (None)"
    codeflash_output = _sphinx_type(enum_obj) # 9.47μs -> 1.44μs (556% faster)

def test_edge_enum_is_builtin_type():
    # Test with a builtin type as enum
    enum_obj = Enum(int)
    expected = r":class:`~bokeh.core.properties.Enum`\ (<class 'int'>)"
    codeflash_output = _sphinx_type(enum_obj) # 9.76μs -> 1.89μs (418% faster)

def test_edge_enum_is_str():
    # Test with a string as enum
    enum_obj = Enum("NotAnEnum")
    expected = r":class:`~bokeh.core.properties.Enum`\ (NotAnEnum)"
    codeflash_output = _sphinx_type(enum_obj) # 10.1μs -> 1.30μs (674% faster)

def test_edge_enum_is_object_instance():
    # Test with an object instance as enum
    class Dummy: pass
    dummy = Dummy()
    enum_obj = Enum(dummy)
    expected = rf":class:`~bokeh.core.properties.Enum`\ ({str(dummy)})"
    codeflash_output = _sphinx_type(enum_obj) # 10.3μs -> 1.98μs (421% faster)

def test_edge_enum_in_enums_dict_but_different_object():
    # Test with an object that has same value as in enums dict but is not the same object
    class ColorEnum2:
        __module__ = "bokeh.core.enums"
        def __str__(self): return "ColorEnum"
    enum_obj = Enum(ColorEnum2)
    expected = r":class:`~bokeh.core.properties.Enum`\ (ColorEnum)"
    codeflash_output = _sphinx_type(enum_obj) # 9.92μs -> 1.78μs (456% faster)

def test_edge_enum_with_special_characters():
    # Test with an enum whose str contains special characters
    class WeirdEnum:
        __module__ = "weird.module"
        def __str__(self): return "WeirdEnum!@#"
    enum_obj = Enum(WeirdEnum)
    expected = r":class:`~bokeh.core.properties.Enum`\ (WeirdEnum!@#)"
    codeflash_output = _sphinx_type(enum_obj) # 9.60μs -> 1.74μs (453% faster)

def test_edge_enum_with_empty_str():
    # Test with an enum whose str is empty
    class EmptyStrEnum:
        __module__ = "empty.module"
        def __str__(self): return ""
    enum_obj = Enum(EmptyStrEnum)
    expected = r":class:`~bokeh.core.properties.Enum`\ ()"
    codeflash_output = _sphinx_type(enum_obj) # 9.61μs -> 1.68μs (473% faster)

# LARGE SCALE TEST CASES

def test_large_scale_many_unique_enums():
    # Test with many unique enum objects not in enums
    class LargeEnum:
        __module__ = "large.module"
        def __init__(self, idx): self.idx = idx
        def __str__(self): return f"LargeEnum{self.idx}"

    for i in range(1000):
        enum_obj = Enum(LargeEnum(i))
        expected = rf":class:`~bokeh.core.properties.Enum`\ (LargeEnum{i})"
        codeflash_output = _sphinx_type(enum_obj) # 6.42ms -> 476μs (1246% faster)

def test_large_scale_many_known_enums():
    # Test with many enums that are the same as ColorEnum (should all link to Color)
    for i in range(1000):
        enum_obj = Enum(ColorEnum)
        expected = r":class:`~bokeh.core.properties.Enum`\ (:class:`~bokeh.core.enums.Color`\ )"
        codeflash_output = _sphinx_type(enum_obj) # 6.24ms -> 440μs (1317% faster)

def test_large_scale_mixed_known_and_unknown_enums():
    # Mix known and unknown enums
    class UnknownEnum:
        __module__ = "unknown.module"
        def __init__(self, idx): self.idx = idx
        def __str__(self): return f"UnknownEnum{self.idx}"

    for i in range(500):
        enum_obj_known = Enum(ColorEnum)
        expected_known = r":class:`~bokeh.core.properties.Enum`\ (:class:`~bokeh.core.enums.Color`\ )"
        codeflash_output = _sphinx_type(enum_obj_known) # 3.14ms -> 223μs (1304% faster)

        enum_obj_unknown = Enum(UnknownEnum(i))
        expected_unknown = rf":class:`~bokeh.core.properties.Enum`\ (UnknownEnum{i})"
        codeflash_output = _sphinx_type(enum_obj_unknown)

def test_large_scale_enum_with_long_str():
    # Test with an enum whose str is very long
    class LongStrEnum:
        __module__ = "long.module"
        def __str__(self): return "X" * 500
    enum_obj = Enum(LongStrEnum)
    expected = r":class:`~bokeh.core.properties.Enum`\ (" + "X" * 500 + ")"
    codeflash_output = _sphinx_type(enum_obj) # 12.2μs -> 2.06μs (491% faster)

def test_large_scale_enum_with_large_object():
    # Test with a large object as enum (simulate memory pressure)
    class LargeObj:
        __module__ = "largeobj.module"
        def __init__(self):
            self.data = [0]*999
        def __str__(self): return "LargeObj"
    enum_obj = Enum(LargeObj())
    expected = r":class:`~bokeh.core.properties.Enum`\ (LargeObj)"
    codeflash_output = _sphinx_type(enum_obj) # 9.97μs -> 1.68μs (494% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-_sphinx_type-mhwsoksy, commit your edits, and push.

@codeflash-ai codeflash-ai Bot requested a review from mashraf-222 November 13, 2025 02:13
@codeflash-ai codeflash-ai Bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Nov 13, 2025