[SVLS-8230] Fix SnapStart cold_start tag using restore_time #1139

Draft
jchrostek-dd wants to merge 1 commit into main from
john/svls-8230

@jchrostek-dd
Contributor

Summary

  • SnapStart restore invocations were incorrectly tagged cold_start=false. The time-threshold heuristic in set_init_tags() measured elapsed time from sandbox_init_time, which for SnapStart is set at snapshot creation, so the elapsed time always exceeded the 10s proactive initialization threshold.
  • Fix: track restore_time from the PlatformRestoreStart telemetry event and use it instead of sandbox_init_time for SnapStart functions.
  • When restore_time is None (telemetry not yet delivered), assume a cold start: the 10s threshold far exceeds telemetry delivery latency, so a missing event means the restore and the invoke happened close together.
  • Added cold_start=true / cold_start=false integration test assertions for SnapStart restore and warm invocations (Java + .NET).
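
The decision described in the bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `classify_init` and `InitKind` are hypothetical names, timestamps are modeled as `std::time::Duration` offsets from a common epoch rather than the `DateTime<Utc>` the PR uses (to keep the sketch dependency-free), and it only covers the first invocation after init or restore.

```rust
use std::time::Duration;

/// Threshold from the PR description: init that finished more than 10s
/// before the first invocation is treated as proactive initialization.
const PROACTIVE_INIT_THRESHOLD: Duration = Duration::from_secs(10);

/// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum InitKind {
    /// Would be tagged cold_start=true.
    ColdStart,
    /// Would be tagged cold_start=false.
    ProactiveInitialization,
}

/// Classify the first invocation. All times are offsets from an arbitrary
/// shared epoch (an assumption of this sketch, not the PR's representation).
pub fn classify_init(
    is_snapstart: bool,
    sandbox_init_time: Duration,
    restore_time: Option<Duration>,
    invocation_time: Duration,
) -> InitKind {
    // SnapStart: measure from the restore, not from snapshot creation.
    let reference = if is_snapstart {
        match restore_time {
            Some(t) => t,
            // Restore telemetry not delivered yet: assume a cold start.
            None => return InitKind::ColdStart,
        }
    } else {
        sandbox_init_time
    };

    // saturating_sub guards against an invocation timestamp that is
    // slightly earlier than the reference.
    if invocation_time.saturating_sub(reference) > PROACTIVE_INIT_THRESHOLD {
        InitKind::ProactiveInitialization
    } else {
        InitKind::ColdStart
    }
}
```

With a snapshot created at t=0, a restore at t=3600s and an invoke at t=3601s, this sketch classifies the invocation as a cold start, whereas measuring from sandbox_init_time would have yielded proactive initialization.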

Changes

| File | Change |
| --- | --- |
| bottlecap/src/config/aws.rs | Add is_snapstart() method |
| bottlecap/src/lifecycle/invocation/processor.rs | Add restore_time: Option<DateTime<Utc>> field, set from PlatformRestoreStart, use in set_init_tags() |
| integration-tests/tests/snapstart.test.ts | Add cold_start tag assertions for restore (true) and warm (false) invocations |
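
For readers unfamiliar with how a SnapStart check might look, here is a minimal sketch. It assumes detection is based on the AWS_LAMBDA_INITIALIZATION_TYPE environment variable, which Lambda sets to "snap-start" for SnapStart functions. The `AwsConfig` struct and its `initialization_type` field are hypothetical stand-ins; the actual is_snapstart() in bottlecap/src/config/aws.rs may be implemented differently.

```rust
/// Hypothetical stand-in for the AWS config type; field name is assumed.
pub struct AwsConfig {
    /// Value of AWS_LAMBDA_INITIALIZATION_TYPE, if the variable is set.
    pub initialization_type: Option<String>,
}

impl AwsConfig {
    /// Build the config from the process environment.
    pub fn from_env() -> Self {
        Self {
            initialization_type: std::env::var("AWS_LAMBDA_INITIALIZATION_TYPE").ok(),
        }
    }

    /// True when Lambda reports the SnapStart initialization type.
    pub fn is_snapstart(&self) -> bool {
        self.initialization_type.as_deref() == Some("snap-start")
    }
}
```

Keeping the environment read in `from_env()` and the comparison in `is_snapstart()` lets the check be unit-tested without mutating process environment variables.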

Test plan

  • SnapStart integration tests pass (Java + .NET) with new cold_start tag assertions
  • Existing on-demand integration tests unaffected
  • cargo check passes
  • CI pipeline passes

https://datadoghq.atlassian.net/browse/SVLS-8230

SnapStart restore invocations were misclassified as proactive_initialization
because sandbox_init_time (from snapshot creation) always exceeded the 10s
threshold. Fix by tracking restore_time from PlatformRestoreStart telemetry
and using it for proactive init detection in SnapStart functions.

When restore_time is None (telemetry not yet delivered), assume cold start
since the restore and invoke happened close together.

https://datadoghq.atlassian.net/browse/SVLS-8230