Skip to content

⚡️ Speed up method Datetime.transform by 26%#149

Open
codeflash-ai[bot] wants to merge 1 commit into
branch-3.9from
codeflash/optimize-Datetime.transform-mhwqli49
Open

⚡️ Speed up method Datetime.transform by 26%#149
codeflash-ai[bot] wants to merge 1 commit into
branch-3.9from
codeflash/optimize-Datetime.transform-mhwqli49

Conversation

@codeflash-ai
Copy link
Copy Markdown

@codeflash-ai codeflash-ai Bot commented Nov 13, 2025

📄 26% (0.26x) speedup for Datetime.transform in src/bokeh/core/property/datetime.py

⏱️ Runtime : 125 microseconds 99.3 microseconds (best of 65 runs)

📝 Explanation and details

The optimized code achieves a 25% speedup through three key optimizations:

1. Eliminated expensive timetuple() call in convert_date_to_datetime
The original code used dt.datetime(*obj.timetuple()[:6], tzinfo=dt.timezone.utc) which involves creating a tuple and unpacking it. The optimized version directly uses dt.datetime(obj.year, obj.month, obj.day, tzinfo=dt.timezone.utc), avoiding the tuple creation overhead.

2. Added fast path for datetime.datetime objects in convert_date_to_datetime
The optimized version first checks if the input is already a datetime.datetime and handles timezone conversion efficiently using replace() or astimezone(). This eliminates unnecessary object creation when the input is already the desired type.

3. Reordered type checks in Datetime.transform for better flow control
The optimized version checks for datetime.datetime first (most specific type), then handles string conversion, and finally checks for datetime.date. This reduces redundant isinstance calls and creates more direct execution paths.

4. Added __slots__ to the Property class
This reduces memory overhead per instance by preventing dynamic attribute creation, though this has minimal impact on the measured performance.

The performance improvements are most significant for datetime.datetime inputs (up to 106% faster) and datetime.date inputs (up to 47.6% faster) based on the test results. The optimizations particularly benefit workloads that frequently convert datetime objects, as the fast paths avoid expensive operations like timetuple() and redundant type conversions. String parsing remains largely unchanged, showing only modest improvements due to the streamlined control flow.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 25 Passed
🌀 Generated Regression Tests 156 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
⚙️ Existing Unit Tests and Runtime
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
unit/bokeh/core/property/test_datetime.py::Test_Datetime.test_transform_date 9.12μs 6.24μs 46.3%✅
unit/bokeh/core/property/test_datetime.py::Test_Datetime.test_transform_str 6.65μs 6.06μs 9.72%✅
🌀 Generated Regression Tests and Runtime
import datetime
# function to test (from bokeh/core/property/datetime.py)
from typing import Any

# imports
import pytest
from bokeh.core.property.datetime import Datetime

# unit tests

# --------------------
# Basic Test Cases
# --------------------

def test_transform_iso_datetime_string():
    # Should parse ISO string to datetime, then to milliseconds since epoch
    dt_prop = Datetime()
    iso_str = "2021-07-23T14:55:00"
    # The output should be milliseconds since epoch
    expected = convert_date_to_datetime(datetime.datetime.fromisoformat(iso_str))
    codeflash_output = dt_prop.transform(iso_str); result = codeflash_output

def test_transform_datetime_object():
    # Should convert datetime.datetime to ms since epoch
    dt_prop = Datetime()
    dt_obj = datetime.datetime(2022, 1, 1, 12, 0, 0, tzinfo=datetime.timezone.utc)
    expected = convert_date_to_datetime(dt_obj)
    codeflash_output = dt_prop.transform(dt_obj); result = codeflash_output

def test_transform_date_object():
    # Should convert datetime.date to ms since epoch
    dt_prop = Datetime()
    date_obj = datetime.date(2022, 1, 1)
    expected = convert_date_to_datetime(date_obj)
    codeflash_output = dt_prop.transform(date_obj); result = codeflash_output

def test_transform_datetime_object_naive():
    # Should treat naive datetime as UTC
    dt_prop = Datetime()
    dt_obj = datetime.datetime(2022, 1, 1, 0, 0, 0)
    expected = convert_date_to_datetime(dt_obj)
    codeflash_output = dt_prop.transform(dt_obj); result = codeflash_output

def test_transform_datetime_object_with_timezone():
    # Should handle datetime with timezone correctly
    dt_prop = Datetime()
    dt_obj = datetime.datetime(2022, 1, 1, 0, 0, 0, tzinfo=datetime.timezone(datetime.timedelta(hours=2)))
    # convert_date_to_datetime always sets tzinfo to UTC, so we need to convert to UTC
    dt_obj_utc = dt_obj.astimezone(datetime.timezone.utc)
    expected = convert_date_to_datetime(dt_obj_utc)
    codeflash_output = dt_prop.transform(dt_obj); result = codeflash_output

# --------------------
# Edge Test Cases
# --------------------

def test_transform_empty_string():
    # Should raise ValueError for empty string
    dt_prop = Datetime()
    with pytest.raises(ValueError):
        dt_prop.transform("") # 3.03μs -> 3.44μs (11.8% slower)

def test_transform_invalid_string():
    # Should raise ValueError for non-ISO string
    dt_prop = Datetime()
    with pytest.raises(ValueError):
        dt_prop.transform("not a date") # 2.40μs -> 2.58μs (6.87% slower)

def test_transform_none():
    # Should return None unchanged
    dt_prop = Datetime()
    codeflash_output = dt_prop.transform(None); result = codeflash_output # 1.17μs -> 1.24μs (5.97% slower)

def test_transform_integer():
    # Should return integer unchanged
    dt_prop = Datetime()
    codeflash_output = dt_prop.transform(1234567890); result = codeflash_output # 1.03μs -> 1.04μs (0.866% slower)

def test_transform_float():
    # Should return float unchanged
    dt_prop = Datetime()
    codeflash_output = dt_prop.transform(12345.678); result = codeflash_output # 1.01μs -> 1.09μs (7.17% slower)

def test_transform_boolean():
    # Should return boolean unchanged
    dt_prop = Datetime()
    codeflash_output = dt_prop.transform(True); result = codeflash_output # 1.06μs -> 1.16μs (8.65% slower)

def test_transform_list():
    # Should return list unchanged
    dt_prop = Datetime()
    codeflash_output = dt_prop.transform([1,2,3]); result = codeflash_output # 980ns -> 1.02μs (4.30% slower)

def test_transform_dict():
    # Should return dict unchanged
    dt_prop = Datetime()
    codeflash_output = dt_prop.transform({"date":"2022-01-01"}); result = codeflash_output # 981ns -> 1.04μs (5.49% slower)

def test_transform_datetime_min_max():
    # Should handle minimum and maximum datetime values
    dt_prop = Datetime()
    min_dt = datetime.datetime.min
    max_dt = datetime.datetime.max.replace(tzinfo=datetime.timezone.utc)
    min_expected = convert_date_to_datetime(min_dt)
    max_expected = convert_date_to_datetime(max_dt)
    codeflash_output = dt_prop.transform(min_dt); min_result = codeflash_output
    codeflash_output = dt_prop.transform(max_dt); max_result = codeflash_output

def test_transform_date_min_max():
    # Should handle minimum and maximum date values
    dt_prop = Datetime()
    min_date = datetime.date.min
    max_date = datetime.date.max
    min_expected = convert_date_to_datetime(min_date)
    max_expected = convert_date_to_datetime(max_date)
    codeflash_output = dt_prop.transform(min_date); min_result = codeflash_output
    codeflash_output = dt_prop.transform(max_date); max_result = codeflash_output

def test_transform_leap_year():
    # Should handle leap year dates
    dt_prop = Datetime()
    leap_date = datetime.date(2020, 2, 29)
    expected = convert_date_to_datetime(leap_date)
    codeflash_output = dt_prop.transform(leap_date); result = codeflash_output

def test_transform_dst_transition():
    # Should handle DST transition (if tz-aware)
    dt_prop = Datetime()
    # DST transition in UTC is not relevant, but test with a tz-aware datetime
    dt_obj = datetime.datetime(2021, 3, 14, 2, 30, 0, tzinfo=datetime.timezone(datetime.timedelta(hours=-5)))  # US Eastern
    dt_obj_utc = dt_obj.astimezone(datetime.timezone.utc)
    expected = convert_date_to_datetime(dt_obj_utc)
    codeflash_output = dt_prop.transform(dt_obj); result = codeflash_output

# --------------------
# Large Scale Test Cases
# --------------------

def test_transform_large_list_of_iso_strings():
    # Should process a large list of ISO strings efficiently
    dt_prop = Datetime()
    base_date = datetime.datetime(2021, 1, 1, 0, 0, 0)
    iso_strings = [
        (base_date + datetime.timedelta(days=i)).isoformat()
        for i in range(1000)
    ]
    # Transform each and check
    results = [dt_prop.transform(s) for s in iso_strings]
    expected = [
        convert_date_to_datetime(datetime.datetime.fromisoformat(s))
        for s in iso_strings
    ]
    for r, e in zip(results, expected):
        pass

def test_transform_large_list_of_date_objects():
    # Should process a large list of date objects efficiently
    dt_prop = Datetime()
    base_date = datetime.date(2021, 1, 1)
    dates = [base_date + datetime.timedelta(days=i) for i in range(1000)]
    results = [dt_prop.transform(d) for d in dates]
    expected = [convert_date_to_datetime(d) for d in dates]
    for r, e in zip(results, expected):
        pass

def test_transform_large_list_of_datetime_objects():
    # Should process a large list of datetime objects efficiently
    dt_prop = Datetime()
    base_dt = datetime.datetime(2021, 1, 1, 0, 0, 0)
    datetimes = [base_dt + datetime.timedelta(hours=i) for i in range(1000)]
    results = [dt_prop.transform(dt) for dt in datetimes]
    expected = [convert_date_to_datetime(dt) for dt in datetimes]
    for r, e in zip(results, expected):
        pass

def test_transform_large_list_of_mixed_types():
    # Should process a large list of mixed types efficiently
    dt_prop = Datetime()
    base_dt = datetime.datetime(2021, 1, 1, 0, 0, 0)
    mixed = []
    for i in range(333):
        mixed.append((base_dt + datetime.timedelta(days=i)).isoformat())  # ISO string
        mixed.append(base_dt.date() + datetime.timedelta(days=i))         # date
        mixed.append(base_dt + datetime.timedelta(days=i))                # datetime
    results = [dt_prop.transform(x) for x in mixed]
    expected = []
    for i in range(333):
        expected.append(convert_date_to_datetime(datetime.datetime.fromisoformat((base_dt + datetime.timedelta(days=i)).isoformat())))
        expected.append(convert_date_to_datetime(base_dt.date() + datetime.timedelta(days=i)))
        expected.append(convert_date_to_datetime(base_dt + datetime.timedelta(days=i)))
    for r, e in zip(results, expected):
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import datetime
# function to test (from bokeh/core/property/datetime.py)
from typing import Any

# imports
import pytest
from bokeh.core.property.datetime import Datetime
from bokeh.core.property.singletons import Undefined
from bokeh.util.serialization import convert_date_to_datetime


# Helper for expected milliseconds since epoch
def ms_since_epoch(dt_obj):
    epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    if dt_obj.tzinfo is None:
        dt_obj = dt_obj.replace(tzinfo=datetime.timezone.utc)
    return (dt_obj - epoch).total_seconds() * 1000

# ------------------------------
# Basic Test Cases
# ------------------------------

def test_transform_iso_string_datetime():
    # Test ISO string to datetime conversion
    d = Datetime()
    iso = "2024-06-01T12:34:56"
    codeflash_output = d.transform(iso); result = codeflash_output # 6.09μs -> 5.28μs (15.4% faster)
    # Should match ms_since_epoch
    expected = ms_since_epoch(datetime.datetime.fromisoformat(iso))

def test_transform_iso_string_date():
    # Test ISO string date (YYYY-MM-DD)
    d = Datetime()
    iso = "2024-06-01"
    codeflash_output = d.transform(iso); result = codeflash_output # 5.54μs -> 4.50μs (23.0% faster)
    expected = ms_since_epoch(datetime.datetime(2024, 6, 1))

def test_transform_datetime_object():
    # Test datetime.datetime input
    d = Datetime()
    dt_obj = datetime.datetime(2024, 6, 1, 12, 34, 56)
    codeflash_output = d.transform(dt_obj); result = codeflash_output # 4.75μs -> 3.53μs (34.6% faster)
    expected = ms_since_epoch(dt_obj)

def test_transform_date_object():
    # Test datetime.date input
    d = Datetime()
    date_obj = datetime.date(2024, 6, 1)
    codeflash_output = d.transform(date_obj); result = codeflash_output # 4.86μs -> 3.36μs (44.6% faster)
    expected = ms_since_epoch(datetime.datetime(2024, 6, 1))

def test_transform_datetime_with_timezone():
    # Test datetime with timezone
    d = Datetime()
    dt_obj = datetime.datetime(2024, 6, 1, 12, 34, 56, tzinfo=datetime.timezone.utc)
    codeflash_output = d.transform(dt_obj); result = codeflash_output # 5.56μs -> 2.84μs (95.7% faster)
    expected = ms_since_epoch(dt_obj)

# ------------------------------
# Edge Test Cases
# ------------------------------

def test_transform_invalid_string():
    # Should raise ValueError for non-ISO string
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform("not-a-date") # 2.42μs -> 2.54μs (4.77% slower)

def test_transform_none():
    # Should raise TypeError for None input
    d = Datetime()
    with pytest.raises(TypeError):
        d.transform(None)

def test_transform_integer_input():
    # Should raise TypeError for integer input
    d = Datetime()
    with pytest.raises(TypeError):
        d.transform(1234567890)

def test_transform_float_input():
    # Should raise TypeError for float input
    d = Datetime()
    with pytest.raises(TypeError):
        d.transform(12345.678)

def test_transform_empty_string():
    # Should raise ValueError for empty string
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform("") # 2.91μs -> 3.32μs (12.4% slower)

def test_transform_leap_year_date():
    # Test leap year date
    d = Datetime()
    date_obj = datetime.date(2020, 2, 29)
    codeflash_output = d.transform(date_obj); result = codeflash_output # 8.59μs -> 5.82μs (47.6% faster)
    expected = ms_since_epoch(datetime.datetime(2020, 2, 29))

def test_transform_min_datetime():
    # Test minimum datetime
    d = Datetime()
    dt_obj = datetime.datetime.min.replace(tzinfo=datetime.timezone.utc)
    codeflash_output = d.transform(dt_obj); result = codeflash_output # 6.53μs -> 3.38μs (93.3% faster)
    expected = ms_since_epoch(dt_obj)

def test_transform_max_datetime():
    # Test maximum datetime
    d = Datetime()
    dt_obj = datetime.datetime.max.replace(tzinfo=datetime.timezone.utc)
    codeflash_output = d.transform(dt_obj); result = codeflash_output # 5.79μs -> 2.93μs (97.4% faster)
    expected = ms_since_epoch(dt_obj)

def test_transform_date_object_on_epoch():
    # Test date at Unix epoch
    d = Datetime()
    date_obj = datetime.date(1970, 1, 1)
    codeflash_output = d.transform(date_obj); result = codeflash_output # 5.17μs -> 3.69μs (40.2% faster)
    expected = 0.0

def test_transform_datetime_object_on_epoch():
    # Test datetime at Unix epoch
    d = Datetime()
    dt_obj = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    codeflash_output = d.transform(dt_obj); result = codeflash_output # 5.25μs -> 2.55μs (106% faster)
    expected = 0.0

def test_transform_iso_string_with_timezone():
    # Test ISO string with timezone
    d = Datetime()
    iso = "2024-06-01T12:34:56+00:00"
    codeflash_output = d.transform(iso); result = codeflash_output # 6.75μs -> 4.07μs (66.1% faster)
    expected = ms_since_epoch(datetime.datetime.fromisoformat(iso))

def test_transform_iso_string_with_microseconds():
    # Test ISO string with microseconds
    d = Datetime()
    iso = "2024-06-01T12:34:56.123456"
    codeflash_output = d.transform(iso); result = codeflash_output # 5.62μs -> 5.14μs (9.28% faster)
    expected = ms_since_epoch(datetime.datetime.fromisoformat(iso))

def test_transform_datetime_with_microseconds():
    # Test datetime with microseconds
    d = Datetime()
    dt_obj = datetime.datetime(2024, 6, 1, 12, 34, 56, 123456)
    codeflash_output = d.transform(dt_obj); result = codeflash_output # 5.00μs -> 3.79μs (31.7% faster)
    expected = ms_since_epoch(dt_obj)

# ------------------------------
# Large Scale Test Cases
# ------------------------------

def test_transform_large_list_of_iso_strings():
    # Test a list of 1000 ISO strings
    d = Datetime()
    base = datetime.datetime(2024, 1, 1)
    iso_list = [(base + datetime.timedelta(days=i)).isoformat() for i in range(1000)]
    results = [d.transform(iso) for iso in iso_list]

def test_transform_large_list_of_date_objects():
    # Test a list of 1000 date objects
    d = Datetime()
    base = datetime.date(2024, 1, 1)
    date_list = [base + datetime.timedelta(days=i) for i in range(1000)]
    results = [d.transform(date_obj) for date_obj in date_list]

def test_transform_large_list_of_datetime_objects():
    # Test a list of 1000 datetime objects
    d = Datetime()
    base = datetime.datetime(2024, 1, 1, 0, 0, 0)
    dt_list = [base + datetime.timedelta(minutes=i) for i in range(1000)]
    results = [d.transform(dt_obj) for dt_obj in dt_list]

def test_transform_large_list_of_iso_strings_with_microseconds():
    # Test a list of 1000 ISO strings with microseconds
    d = Datetime()
    base = datetime.datetime(2024, 1, 1, 0, 0, 0)
    iso_list = [(base + datetime.timedelta(seconds=i, microseconds=i)).isoformat() for i in range(1000)]
    results = [d.transform(iso) for iso in iso_list]

# ------------------------------
# Additional Edge/Mutation Tests
# ------------------------------

@pytest.mark.parametrize("input_val", [
    [], {}, set(), object(), lambda x: x, b"2024-06-01", True, False
])
def test_transform_unexpected_types(input_val):
    # Should raise TypeError for unexpected types
    d = Datetime()
    with pytest.raises(TypeError):
        d.transform(input_val)

def test_transform_iso_string_with_invalid_date():
    # Should raise ValueError for invalid date in ISO string
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform("2024-02-30T12:00:00") # 3.07μs -> 3.35μs (8.39% slower)

def test_transform_iso_string_with_invalid_time():
    # Should raise ValueError for invalid time in ISO string
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform("2024-06-01T25:00:00") # 2.23μs -> 2.28μs (2.02% slower)

def test_transform_iso_string_with_partial_date():
    # Should raise ValueError for partial date string
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform("2024-06") # 2.32μs -> 2.45μs (5.15% slower)

def test_transform_iso_string_with_extra_characters():
    # Should raise ValueError for string with extra characters
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform("2024-06-01T12:34:56abc") # 2.31μs -> 2.41μs (4.35% slower)

def test_transform_iso_string_with_leading_trailing_spaces():
    # Should raise ValueError for string with spaces
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform(" 2024-06-01T12:34:56 ") # 2.26μs -> 2.38μs (4.67% slower)

def test_transform_iso_string_with_tab_character():
    # Should raise ValueError for string with tab character
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform("\t2024-06-01T12:34:56") # 2.26μs -> 2.37μs (4.44% slower)

def test_transform_iso_string_with_newline_character():
    # Should raise ValueError for string with newline character
    d = Datetime()
    with pytest.raises(ValueError):
        d.transform("2024-06-01T12:34:56\n") # 2.34μs -> 2.46μs (4.84% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-Datetime.transform-mhwqli49 and push.

Codeflash Static Badge

The optimized code achieves a 25% speedup through three key optimizations:

**1. Eliminated expensive `timetuple()` call in `convert_date_to_datetime`**
The original code used `dt.datetime(*obj.timetuple()[:6], tzinfo=dt.timezone.utc)` which involves creating a tuple and unpacking it. The optimized version directly uses `dt.datetime(obj.year, obj.month, obj.day, tzinfo=dt.timezone.utc)`, avoiding the tuple creation overhead.

**2. Added fast path for `datetime.datetime` objects in `convert_date_to_datetime`**
The optimized version first checks if the input is already a `datetime.datetime` and handles timezone conversion efficiently using `replace()` or `astimezone()`. This eliminates unnecessary object creation when the input is already the desired type.

**3. Reordered type checks in `Datetime.transform` for better flow control**
The optimized version checks for `datetime.datetime` first (most specific type), then handles string conversion, and finally checks for `datetime.date`. This reduces redundant `isinstance` calls and creates more direct execution paths.

**4. Added `__slots__` to the `Property` class**
This reduces memory overhead per instance by preventing dynamic attribute creation, though this has minimal impact on the measured performance.

The performance improvements are most significant for `datetime.datetime` inputs (up to 106% faster) and `datetime.date` inputs (up to 47.6% faster) based on the test results. The optimizations particularly benefit workloads that frequently convert datetime objects, as the fast paths avoid expensive operations like `timetuple()` and redundant type conversions. String parsing remains largely unchanged, showing only modest improvements due to the streamlined control flow.
@codeflash-ai codeflash-ai Bot requested a review from mashraf-222 November 13, 2025 01:15
@codeflash-ai codeflash-ai Bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Nov 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants