⚡️ Speed up function contains_tex_string by 88% #154

Open
codeflash-ai[bot] wants to merge 1 commit into branch-3.9 from codeflash/optimize-contains_tex_string-mhwtjmd8
codeflash-ai bot commented on Nov 13, 2025

📄 88% speedup (about 1.88x as fast) for contains_tex_string in src/bokeh/embed/util.py

⏱️ Runtime: 348 microseconds → 185 microseconds (best of 231 runs)

📝 Explanation and details

The optimization moves regex pattern compilation from inside the function to module-level scope, eliminating redundant compilation overhead on every function call.

Key Changes:

  • Pattern pre-compilation: The regex pattern re.compile(f"{dollars}|{braces}|{parens}", flags=re.S) is compiled once at module load time instead of on every function call
  • Reduced per-call overhead: Each call now only performs a pattern search instead of string concatenation + regex compilation + search
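
As a rough illustration of the change described above, the sketch below contrasts compiling inside the function with compiling once at import time. The three delimiter patterns (modeled on MathJax's default delimiters) and the names `_TEX_PATTERN`, `contains_tex_per_call`, and `contains_tex_precompiled` are assumptions made for this example; they are not copied from src/bokeh/embed/util.py. Only the `re.compile(f"{dollars}|{braces}|{parens}", flags=re.S)` expression comes from the description.

```python
import re

# Assumed, illustrative patterns for MathJax-style delimiters (non-greedy).
# The actual patterns in Bokeh may differ.
dollars = r"\$\$.*?\$\$"
braces = r"\\\[.*?\\\]"
parens = r"\\\(.*?\\\)"


def contains_tex_per_call(text: str) -> bool:
    """Before: format and compile the pattern on every call, then search."""
    pat = re.compile(f"{dollars}|{braces}|{parens}", flags=re.S)
    return pat.search(text) is not None


# After: compile once, when the module is imported.
_TEX_PATTERN = re.compile(f"{dollars}|{braces}|{parens}", flags=re.S)


def contains_tex_precompiled(text: str) -> bool:
    """After: each call only runs the search on the shared pattern."""
    return _TEX_PATTERN.search(text) is not None
```

Both functions use the same pattern and flags, so they should return the same result for any input; only the per-call setup work differs.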

Performance Impact:
The line profiler shows that the original function spent 71.8% of its time (914.7μs out of 1273μs total) compiling the regex pattern on each call. The optimized version removes this per-call step, reducing the benchmarked runtime from 348μs to 185μs (an 88% speedup).
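
The headline figures can be re-derived from the numbers quoted in this report. The snippet assumes that the speedup is defined as original time divided by optimized time, minus one; that definition is inferred from the reported values rather than stated in the report.

```python
# Figures quoted in the report (microseconds).
original_us = 348.0
optimized_us = 185.0
compile_us = 914.7          # time the line profiler attributes to the compile line
profiled_total_us = 1273.0  # total time of the profiled function

# Assumed definition: speedup = original / optimized - 1.
speedup = original_us / optimized_us - 1
compile_share = compile_us / profiled_total_us

print(f"speedup: {speedup:.0%}")  # speedup: 88%
print(f"compile share: {compile_share:.4f}")
```

Under that definition the runtimes give an 88% speedup, and the profiler figures give a compile share of a little under 72%, in line with the reported 71.8%.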

Hot Path Considerations:
Based on function_references, this function is called from _model_requires_mathjax(), which checks multiple model properties (text annotations, slider titles, axis labels, div/paragraph content) for MathJax requirements during bundle generation. Because the function may be called repeatedly during model processing, the per-call compilation cost is paid each time and adds up.

Test Case Performance:
The optimization speeds up every test case, by roughly 10% to 320%. Most short inputs gain 100-300%, with particularly strong gains on:

  • Simple strings without delimiters (roughly 220-320% faster)
  • Repeated delimiter detection scenarios (100-170% faster)

Large inputs, where the scan itself dominates, gain less in relative terms (roughly 10-65% faster), although the absolute saving stays at about 1.7-1.9μs per call.

This optimization is most beneficial for applications that check many short text elements, where the fixed compilation cost dominates each call. For large documents the relative gain is smaller, because scanning the text takes most of the time.
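
One way to reproduce the per-call difference locally is a small `timeit` comparison. This is a sketch under assumptions: the delimiter patterns and the function names `per_call`, `precompiled`, and `bench` are illustrative rather than taken from Bokeh, and absolute timings will vary by machine and Python version. Note that CPython's `re` module caches compiled patterns internally, so the per-call variant mostly pays for string formatting and a cache lookup rather than a full recompilation.

```python
import re
import timeit

# Assumed patterns, for illustration only.
DOLLARS = r"\$\$.*?\$\$"
BRACES = r"\\\[.*?\\\]"
PARENS = r"\\\(.*?\\\)"

_PRECOMPILED = re.compile(f"{DOLLARS}|{BRACES}|{PARENS}", flags=re.S)


def per_call(text: str) -> bool:
    """Format and compile the pattern on every call, then search."""
    pat = re.compile(f"{DOLLARS}|{BRACES}|{PARENS}", flags=re.S)
    return pat.search(text) is not None


def precompiled(text: str) -> bool:
    """Search with the pattern compiled once at import time."""
    return _PRECOMPILED.search(text) is not None


def bench(func, text: str, number: int = 20_000) -> float:
    """Return the best average time per call, in microseconds."""
    best = min(timeit.repeat(lambda: func(text), number=number, repeat=5))
    return best / number * 1e6


if __name__ == "__main__":
    for label, text in [("short", "plain text"), ("long", "a" * 1000)]:
        before = bench(per_call, text)
        after = bench(precompiled, text)
        print(f"{label}: {before:.2f}us -> {after:.2f}us")
```

On short inputs the fixed per-call cost should account for most of the runtime, so the relative gap is expected to be widest there; on the 1000-character input the scan itself takes a larger share.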

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 67 Passed |
| 🌀 Generated Regression Tests | 109 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 1 Passed |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| unit/bokeh/embed/test_util__embed.py::Test__tex_helpers.test_contains_tex_string | 23.8μs | 11.6μs | 106%✅ |

🌀 Generated Regression Tests and Runtime
from __future__ import annotations

import re

# imports
import pytest  # used for our unit tests
from bokeh.embed.util import contains_tex_string

#-----------------------------------------------------------------------------
# Code
#-----------------------------------------------------------------------------

# unit tests

# --- BASIC TEST CASES ---

def test_empty_string():
    # Should return False for empty string
    codeflash_output = contains_tex_string("") # 2.52μs -> 644ns (292% faster)

def test_no_delimiters():
    # Should return False when string contains no delimiters
    codeflash_output = contains_tex_string("This is a plain string.") # 2.59μs -> 748ns (246% faster)

def test_single_dollar_pair():
    # Should return True for string containing $...$
    codeflash_output = contains_tex_string("Here is some math: $x^2$") # 3.22μs -> 1.50μs (114% faster)

def test_single_brace_pair():
    # Should return True for string containing \[...\]
    codeflash_output = contains_tex_string("Here is some math: \\[x^2\\]") # 3.07μs -> 1.37μs (124% faster)

def test_single_paren_pair():
    # Should return True for string containing \(...\)
    codeflash_output = contains_tex_string("Here is some math: \\(x^2\\)") # 3.18μs -> 1.36μs (133% faster)

def test_multiple_delimiters():
    # Should return True if any valid delimiter is present
    codeflash_output = contains_tex_string("First $x$ then \\[y\\] and finally \\(z\\)") # 3.02μs -> 1.26μs (140% faster)

def test_text_between_delimiters():
    # Should return True if delimiters are surrounded by text
    codeflash_output = contains_tex_string("Start $math$ end") # 2.98μs -> 1.23μs (143% faster)

def test_multiple_math_blocks():
    # Should return True if multiple math blocks are present
    codeflash_output = contains_tex_string("$a$ $b$ \\[c\\] \\(d\\)") # 2.85μs -> 1.18μs (142% faster)

def test_nested_delimiters():
    # Should return True even if math blocks are nested (though not valid TeX)
    codeflash_output = contains_tex_string("$x \\[y\\]$") # 2.88μs -> 1.24μs (132% faster)

# --- EDGE TEST CASES ---

def test_unmatched_dollar():
    # Should return False if only one $
    codeflash_output = contains_tex_string("This is a $ math block") # 3.25μs -> 1.56μs (108% faster)

def test_unmatched_brace():
    # Should return False if only one \[
    codeflash_output = contains_tex_string("This is a \\[ math block") # 3.05μs -> 1.43μs (113% faster)

def test_unmatched_paren():
    # Should return False if only one \(
    codeflash_output = contains_tex_string("This is a \\( math block") # 3.21μs -> 1.39μs (131% faster)

def test_escaped_dollar_sign():
    # Should return False for escaped dollar sign not forming a block
    codeflash_output = contains_tex_string("This is a \\$ not a math block") # 3.05μs -> 1.25μs (145% faster)

def test_escaped_brace():
    # Should return False for escaped brace not forming a block
    codeflash_output = contains_tex_string("This is a \\[ not a math block") # 3.17μs -> 1.43μs (122% faster)

def test_escaped_paren():
    # Should return False for escaped paren not forming a block
    codeflash_output = contains_tex_string("This is a \\( not a math block") # 3.21μs -> 1.41μs (127% faster)

def test_dollar_inside_text():
    # Should return False for single $ not forming a block
    codeflash_output = contains_tex_string("Price is $5") # 2.74μs -> 1.07μs (155% faster)

def test_brace_inside_text():
    # Should return False for single [ not forming a block
    codeflash_output = contains_tex_string("List: [a, b, c]") # 2.41μs -> 693ns (248% faster)

def test_paren_inside_text():
    # Should return False for single ( not forming a block
    codeflash_output = contains_tex_string("Function(x)") # 2.46μs -> 654ns (276% faster)

def test_math_block_with_newlines():
    # Should return True for math block spanning multiple lines
    codeflash_output = contains_tex_string("$\nmultiline\nmath\n$") # 3.23μs -> 1.44μs (125% faster)

def test_brace_math_block_with_newlines():
    # Should return True for brace math block spanning multiple lines
    codeflash_output = contains_tex_string("\\[\nmultiline\nmath\n\\]") # 3.21μs -> 1.40μs (129% faster)

def test_paren_math_block_with_newlines():
    # Should return True for paren math block spanning multiple lines
    codeflash_output = contains_tex_string("\\(\nmultiline\nmath\n\\)") # 3.07μs -> 1.39μs (121% faster)

def test_math_block_at_start():
    # Should return True if math block is at the start
    codeflash_output = contains_tex_string("$math$ at the beginning") # 2.92μs -> 1.20μs (144% faster)

def test_math_block_at_end():
    # Should return True if math block is at the end
    codeflash_output = contains_tex_string("At the end $math$") # 2.96μs -> 1.25μs (136% faster)

def test_math_block_only_delimiters():
    # Should return True if string is just delimiters and content
    codeflash_output = contains_tex_string("$math$") # 2.86μs -> 1.19μs (140% faster)
    codeflash_output = contains_tex_string("\\[math\\]") # 1.48μs -> 599ns (148% faster)
    codeflash_output = contains_tex_string("\\(math\\)") # 1.07μs -> 516ns (107% faster)

def test_math_block_empty_content():
    # Should return True if delimiters are present even with empty content
    codeflash_output = contains_tex_string("$$") # 2.66μs -> 1.02μs (162% faster)
    codeflash_output = contains_tex_string("\\[\\]") # 1.36μs -> 556ns (144% faster)
    codeflash_output = contains_tex_string("\\(\\)") # 973ns -> 427ns (128% faster)

def test_overlapping_delimiters():
    # Should return True for overlapping delimiters
    codeflash_output = contains_tex_string("$a$b$c$") # 2.78μs -> 1.03μs (170% faster)

def test_delimiters_with_special_chars():
    # Should return True for math blocks with special characters inside
    codeflash_output = contains_tex_string("$!@#$%^&*()_+{}|:\"<>?$") # 3.00μs -> 1.33μs (126% faster)

def test_non_ascii_characters():
    # Should return True for math blocks containing unicode
    codeflash_output = contains_tex_string("$αβγ$") # 3.35μs -> 1.71μs (96.5% faster)

def test_delimiters_with_escaped_backslashes():
    # Should return True for math blocks with escaped backslashes inside
    codeflash_output = contains_tex_string("$x \\ y$") # 2.78μs -> 1.21μs (129% faster)

def test_math_block_with_only_whitespace():
    # Should return True for math blocks containing only whitespace
    codeflash_output = contains_tex_string("$   $") # 2.84μs -> 1.08μs (163% faster)
    codeflash_output = contains_tex_string("\\[   \\]") # 1.43μs -> 576ns (147% faster)
    codeflash_output = contains_tex_string("\\(   \\)") # 965ns -> 402ns (140% faster)

def test_delimiters_with_regex_metacharacters_inside():
    # Should return True for math blocks containing regex metacharacters
    codeflash_output = contains_tex_string("$.*?+[](){}^$\\$") # 2.84μs -> 1.23μs (131% faster)

def test_false_positive_similar_delimiters():
    # Should return False for similar but invalid delimiters
    codeflash_output = contains_tex_string("$math$") # 2.89μs -> 1.18μs (144% faster)
    codeflash_output = contains_tex_string("[math]") # 1.19μs -> 373ns (219% faster)
    codeflash_output = contains_tex_string("(math)") # 792ns -> 222ns (257% faster)

def test_false_positive_backslash_without_delimiter():
    # Should return False for backslash not followed by delimiter
    codeflash_output = contains_tex_string("This is a \\ test") # 2.81μs -> 1.06μs (164% faster)

# --- LARGE SCALE TEST CASES ---

def test_large_string_no_delimiters():
    # Should return False for large string with no delimiters
    large_text = "a" * 1000
    codeflash_output = contains_tex_string(large_text) # 5.37μs -> 3.67μs (46.5% faster)

def test_large_string_with_delimiter_at_start():
    # Should return True for large string with delimiter at start
    large_text = "$math$" + ("a" * 995)
    codeflash_output = contains_tex_string(large_text) # 3.01μs -> 1.35μs (123% faster)

def test_large_string_with_delimiter_at_end():
    # Should return True for large string with delimiter at end
    large_text = ("a" * 995) + "$math$"
    codeflash_output = contains_tex_string(large_text) # 5.96μs -> 4.30μs (38.7% faster)

def test_large_string_with_delimiter_in_middle():
    # Should return True for large string with delimiter in middle
    large_text = ("a" * 495) + "$math$" + ("b" * 495)
    codeflash_output = contains_tex_string(large_text) # 4.47μs -> 2.77μs (61.6% faster)

def test_large_string_multiple_delimiters():
    # Should return True for large string with multiple delimiters
    large_text = ("a" * 200) + "$math$" + ("b" * 200) + "\\[math\\]" + ("c" * 200) + "\\(math\\)"
    codeflash_output = contains_tex_string(large_text) # 3.57μs -> 1.82μs (96.0% faster)

def test_large_string_only_delimiters():
    # Should return True for string of delimiters repeated
    large_text = "$math$" * 100
    codeflash_output = contains_tex_string(large_text) # 2.86μs -> 1.16μs (146% faster)

def test_large_string_many_non_delimiters():
    # Should return False for large string of similar but invalid delimiters
    large_text = ("$math$" * 100) + ("[math]" * 100) + ("(math)" * 100)
    codeflash_output = contains_tex_string(large_text) # 3.10μs -> 1.41μs (120% faster)

def test_large_string_with_newlines_and_delimiters():
    # Should return True for large string with newlines and delimiters
    large_text = ("\n" * 500) + "$\nmultiline\nmath\n$" + ("\n" * 495)
    codeflash_output = contains_tex_string(large_text) # 4.63μs -> 2.81μs (64.6% faster)

def test_large_string_with_unicode_and_delimiters():
    # Should return True for large string with unicode and delimiters
    large_text = ("αβγ" * 300) + "$δ$" + ("δεζ" * 300)
    codeflash_output = contains_tex_string(large_text) # 7.11μs -> 5.28μs (34.5% faster)

def test_large_string_with_delimiters_and_special_chars():
    # Should return True for large string with delimiters and special chars
    large_text = ("!@#" * 300) + "\\[!@#\\]" + ("$%^" * 300)
    codeflash_output = contains_tex_string(large_text) # 5.92μs -> 4.02μs (47.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from __future__ import annotations

import re

# imports
import pytest  # used for our unit tests
from bokeh.embed.util import contains_tex_string

#-----------------------------------------------------------------------------
# Code
#-----------------------------------------------------------------------------

# ------------------- Unit Tests -------------------

# Basic Test Cases

def test_contains_tex_string_basic_dollars():
    # Should detect $...$
    codeflash_output = contains_tex_string("This is math $x^2$ in text") # 2.96μs -> 1.27μs (134% faster)

def test_contains_tex_string_basic_braces():
    # Should detect \[...\]
    codeflash_output = contains_tex_string("Equation: \\[x^2\\]") # 2.78μs -> 1.29μs (115% faster)

def test_contains_tex_string_basic_parens():
    # Should detect \(...\)
    codeflash_output = contains_tex_string("Inline math: \\(x^2\\)") # 3.04μs -> 1.34μs (127% faster)

def test_contains_tex_string_basic_no_tex():
    # Should return False for string without delimiters
    codeflash_output = contains_tex_string("This is plain text with no math") # 2.49μs -> 758ns (228% faster)

def test_contains_tex_string_basic_multiple_delimiters():
    # Should detect if any delimiter is present
    codeflash_output = contains_tex_string("Here is $x$ and \\[y\\] and \\(z\\)") # 3.02μs -> 1.27μs (138% faster)

def test_contains_tex_string_basic_delimiter_at_start_end():
    # Delimiters at start or end should be detected
    codeflash_output = contains_tex_string("$start$") # 3.01μs -> 1.23μs (146% faster)
    codeflash_output = contains_tex_string("end \\[finish\\]") # 1.51μs -> 691ns (119% faster)
    codeflash_output = contains_tex_string("\\(inline\\)") # 1.06μs -> 519ns (104% faster)

# Edge Test Cases

def test_contains_tex_string_empty_string():
    # Empty string should return False
    codeflash_output = contains_tex_string("") # 2.35μs -> 564ns (317% faster)

def test_contains_tex_string_only_delimiters_no_content():
    # Delimiters with nothing between should be detected
    codeflash_output = contains_tex_string("$$") # 2.77μs -> 1.04μs (165% faster)
    codeflash_output = contains_tex_string("\\[\\]") # 1.34μs -> 561ns (139% faster)
    codeflash_output = contains_tex_string("\\(\\)") # 957ns -> 440ns (118% faster)

def test_contains_tex_string_unmatched_delimiters():
    # Unmatched delimiters should NOT be detected
    codeflash_output = contains_tex_string("$unclosed") # 3.14μs -> 1.39μs (126% faster)
    codeflash_output = contains_tex_string("unopened$") # 1.48μs -> 668ns (121% faster)
    codeflash_output = contains_tex_string("\\[missing end") # 1.16μs -> 610ns (90.7% faster)
    codeflash_output = contains_tex_string("missing start\\]") # 903ns -> 322ns (180% faster)
    codeflash_output = contains_tex_string("\\(missing end") # 1.03μs -> 468ns (120% faster)
    codeflash_output = contains_tex_string("missing start\\)") # 849ns -> 334ns (154% faster)

def test_contains_tex_string_nested_delimiters():
    # Nested delimiters inside should still be detected
    codeflash_output = contains_tex_string("$outer $inner$ outer$") # 2.91μs -> 1.16μs (151% faster)
    codeflash_output = contains_tex_string("\\[\\(nested\\)\\]") # 1.52μs -> 724ns (110% faster)

def test_contains_tex_string_escaped_delimiters():
    # Escaped delimiters should NOT be detected
    codeflash_output = contains_tex_string("This is not math: \\$\\$x\\$\\$") # 3.25μs -> 1.39μs (134% faster)
    codeflash_output = contains_tex_string("Escaped: \\\\[x^2\\\\]") # 1.68μs -> 820ns (105% faster)
    codeflash_output = contains_tex_string("Escaped: \\\\(x^2\\\\)") # 1.15μs -> 576ns (99.3% faster)

def test_contains_tex_string_delimiters_with_newlines():
    # Delimiters spanning multiple lines should be detected
    codeflash_output = contains_tex_string("$\nx^2\n$") # 2.85μs -> 1.09μs (163% faster)
    codeflash_output = contains_tex_string("\\[\nx^2\n\\]") # 1.37μs -> 518ns (165% faster)
    codeflash_output = contains_tex_string("\\(\nx^2\n\\)") # 956ns -> 445ns (115% faster)

def test_contains_tex_string_delimiters_with_special_chars():
    # Content inside delimiters can contain special characters
    codeflash_output = contains_tex_string("$x^2_#@!$") # 2.91μs -> 1.06μs (174% faster)
    codeflash_output = contains_tex_string("\\[x^2_#@!\\]") # 1.31μs -> 572ns (128% faster)
    codeflash_output = contains_tex_string("\\(x^2_#@!\\)") # 958ns -> 448ns (114% faster)

def test_contains_tex_string_partial_delimiter_overlap():
    # Delimiters that overlap but do not match should not be detected
    codeflash_output = contains_tex_string("$start\\[middle\\]end$") # 2.86μs -> 1.21μs (137% faster)
    codeflash_output = contains_tex_string("start\\[middle\\]end") # 1.33μs -> 578ns (130% faster)
    codeflash_output = contains_tex_string("start\\(middle\\)end") # 992ns -> 463ns (114% faster)
    codeflash_output = contains_tex_string("start$middle\\]end") # 1.27μs -> 822ns (54.4% faster)

def test_contains_tex_string_multiple_valid_delimiters():
    # Multiple valid delimiters should be detected
    codeflash_output = contains_tex_string("$a$ $b$") # 2.69μs -> 1.06μs (154% faster)
    codeflash_output = contains_tex_string("\\[a\\] \\[b\\]") # 1.23μs -> 580ns (113% faster)
    codeflash_output = contains_tex_string("\\(a\\) \\(b\\)") # 987ns -> 430ns (130% faster)

def test_contains_tex_string_delimiters_with_whitespace():
    # Delimiters with only whitespace inside should be detected
    codeflash_output = contains_tex_string("$   $") # 2.67μs -> 1.04μs (156% faster)
    codeflash_output = contains_tex_string("\\[   \\]") # 1.28μs -> 531ns (140% faster)
    codeflash_output = contains_tex_string("\\(   \\)") # 934ns -> 381ns (145% faster)

def test_contains_tex_string_delimiters_with_unicode():
    # Unicode inside delimiters should be detected
    codeflash_output = contains_tex_string("$π^2$") # 3.12μs -> 1.60μs (94.6% faster)
    codeflash_output = contains_tex_string("\\[α+β\\]") # 1.38μs -> 600ns (130% faster)
    codeflash_output = contains_tex_string("\\\\)") # 977ns -> 387ns (152% faster)

def test_contains_tex_string_similar_but_not_tex():
    # Similar patterns but not valid delimiters should NOT be detected
    codeflash_output = contains_tex_string("$x$") # 2.25μs -> 551ns (308% faster)
    codeflash_output = contains_tex_string("\\[x]") # 1.69μs -> 976ns (72.6% faster)
    codeflash_output = contains_tex_string("\\(x)") # 992ns -> 432ns (130% faster)
    codeflash_output = contains_tex_string("$$x$$") # 1.05μs -> 517ns (103% faster)

def test_contains_tex_string_delimiters_with_extra_backslashes():
    # Extra backslashes outside valid delimiters should not affect detection
    codeflash_output = contains_tex_string("\\\\[not a delimiter\\\\]") # 3.03μs -> 1.42μs (114% faster)
    codeflash_output = contains_tex_string("\\\\(not a delimiter\\\\)") # 1.61μs -> 770ns (110% faster)

# Large Scale Test Cases

def test_contains_tex_string_large_text_with_delimiter():
    # Large text with a single delimiter somewhere inside
    large_text = "a" * 500 + "$x^2$" + "b" * 500
    codeflash_output = contains_tex_string(large_text) # 4.37μs -> 2.69μs (62.5% faster)

def test_contains_tex_string_large_text_without_delimiter():
    # Large text with no delimiter
    large_text = "a" * 1000
    codeflash_output = contains_tex_string(large_text) # 5.38μs -> 3.61μs (48.9% faster)

def test_contains_tex_string_many_delimiters():
    # Text with many delimiters scattered throughout
    text = ""
    for i in range(100):
        text += f"Some text {i} $x_{i}$ "
    codeflash_output = contains_tex_string(text) # 3.00μs -> 1.27μs (137% faster)

def test_contains_tex_string_large_text_with_newlines_and_delimiters():
    # Large text with newlines and delimiters
    text = "\n".join([f"Line {i}" for i in range(500)]) + "\n$math$\n" + "\n".join([f"Line {i}" for i in range(500)])
    codeflash_output = contains_tex_string(text) # 16.3μs -> 14.5μs (12.5% faster)

def test_contains_tex_string_large_text_many_false_positives():
    # Large text with many $ and \ but no valid delimiters
    text = "$" * 500 + "\\" * 500 + "a" * 500
    codeflash_output = contains_tex_string(text) # 2.88μs -> 1.03μs (180% faster)

def test_contains_tex_string_large_text_with_multiple_types():
    # Large text with all three types of delimiters
    text = "a" * 200 + "$x^2$" + "b" * 200 + "\\[y^2\\]" + "c" * 200 + "\\(z^2\\)" + "d" * 200
    codeflash_output = contains_tex_string(text) # 3.56μs -> 1.84μs (93.1% faster)

def test_contains_tex_string_large_text_with_delimiters_at_edges():
    # Delimiters at the very start and end of a large string
    text = "$start$" + "a" * 996 + "$end$"
    codeflash_output = contains_tex_string(text) # 2.90μs -> 1.13μs (156% faster)

def test_contains_tex_string_large_text_with_only_delimiters():
    # Large string made only of delimiters
    text = "$" * 500
    # This is 500 consecutive $ characters (250 pairs). Should be detected as at least one valid delimiter
    codeflash_output = contains_tex_string(text) # 2.77μs -> 1.09μs (154% faster)

def test_contains_tex_string_large_text_with_delimiters_and_escaped():
    # Large text with both valid and escaped delimiters
    text = ("\\$\\$notmath\\$\\$" * 100) + "$realmath$"
    codeflash_output = contains_tex_string(text) # 17.8μs -> 16.1μs (10.8% faster)

def test_contains_tex_string_large_text_with_multiple_newlines_inside_delimiter():
    # Large delimiter content with many newlines
    text = "$" + "\n".join([str(i) for i in range(900)]) + "$"
    codeflash_output = contains_tex_string(text) # 21.5μs -> 19.6μs (9.70% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from bokeh.embed.util import contains_tex_string

def test_contains_tex_string():
    contains_tex_string('')
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_sstvtaha/tmpgpm146z7/test_concolic_coverage.py::test_contains_tex_string | 2.31μs | 617ns | 274%✅ |

To edit these changes, run git checkout codeflash/optimize-contains_tex_string-mhwtjmd8, commit your edits, and push.

codeflash-ai bot requested a review from mashraf-222 on November 13, 2025, 02:37
codeflash-ai bot added the labels "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) on Nov 13, 2025