Fix eval regression from PR#76: soften State Storage Rule #83
Open
pkosiec wants to merge 2 commits into
PR#76 introduced an aggressive State Storage Rule that auto-detected a
need for Lakebase in any app mentioning state-like terms (preferences,
bookmarks, etc.), causing 16 app regressions in the May 19 nightly eval.
Analytics apps like property_search_app and host_onboarding_checklist
were incorrectly pushed toward Lakebase, and their scores dropped to 0.00.
Changes:
- Replace aggressive auto-detect with softer guidance that asks the user
- Remove "preferences, bookmarks" from trigger list (too broad)
- Restore user agency ("Ask the user" vs "Do not wait for the user")
- Explicitly exclude analytics/dashboard apps from Lakebase push
- Revert description metadata and Decision Gate skip clause
- Keep recommending Lakebase for genuine CRUD/state storage needs
Fixes: LKB-12991
Co-authored-by: Isaac
The previous commit removed the State Storage reference from the Required Reading table entirely. This creates a discoverability gap: agents jumping from the table to the Decision Gate skip the State Storage Guidance section.
Restore it with softer "review" language (vs the old aggressive "evaluate the State Storage Rule") to preserve the flow for CRUD apps that need Lakebase without re-triggering the regression on analytics apps.
Co-authored-by: Isaac
Summary
PR #76 introduced an aggressive State Storage Rule in skills/databricks-apps/SKILL.md that caused 16 app regressions in the May 19 nightly eval (8 high-impact with >0.3 score drop). Analytics apps like property_search_app, host_onboarding_checklist, and cb_brickhouse_advanced were incorrectly pushed toward Lakebase, dropping to 0.00.
Root Cause
The State Storage Rule auto-detected a need for Lakebase in any app that mentioned state-like terms ("preferences", "bookmarks"), and its forceful language ("Do not wait for the user to ask", "This is not optional") removed user agency.
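To make the difference concrete, here is a minimal Python sketch contrasting a keyword-triggered rule with the softened guidance. It is an illustration only: the rule itself is prose in SKILL.md, not code, and the term lists, function names, and the precedence of the analytics exclusion over the CRUD check are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical term lists for illustration; the real SKILL.md wording is not reproduced here.
OLD_TRIGGER_TERMS = {"preferences", "bookmarks", "crud", "persist"}
# The softened guidance drops the overly broad terms.
SOFT_TRIGGER_TERMS = OLD_TRIGGER_TERMS - {"preferences", "bookmarks"}
# Analytics/dashboard apps are excluded from the Lakebase push.
ANALYTICS_TERMS = {"analytics", "dashboard"}


@dataclass
class Decision:
    action: str  # "use_lakebase", "ask_user", or "no_lakebase"
    reason: str


def _words(prompt: str) -> set[str]:
    """Lowercase the prompt and strip surrounding punctuation from each word."""
    return {w.strip(".,;:!?()\"'").lower() for w in prompt.split()}


def old_rule(prompt: str) -> Decision:
    """Old behavior (assumed): any state-like term forces Lakebase without asking."""
    hits = _words(prompt) & OLD_TRIGGER_TERMS
    if hits:
        return Decision("use_lakebase", f"matched {sorted(hits)}")
    return Decision("no_lakebase", "no state-like terms")


def softened_rule(prompt: str) -> Decision:
    """Softened behavior (assumed): exclude analytics apps, otherwise ask the user."""
    words = _words(prompt)
    if words & ANALYTICS_TERMS:
        return Decision("no_lakebase", "analytics/dashboard app excluded")
    hits = words & SOFT_TRIGGER_TERMS
    if hits:
        return Decision("ask_user", f"recommend Lakebase, matched {sorted(hits)}")
    return Decision("no_lakebase", "no CRUD/state storage need detected")


if __name__ == "__main__":
    for prompt in (
        "Property search dashboard with saved bookmarks",
        "Task tracker with CRUD for tickets",
    ):
        print(prompt, "|", old_rule(prompt).action, "->", softened_rule(prompt).action)
```

Under these assumptions, a dashboard prompt that merely mentions bookmarks is no longer routed to Lakebase, while a genuine CRUD prompt yields a recommendation that the user confirms rather than an automatic choice.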
Changes (4 targeted edits in skills/databricks-apps/SKILL.md)
What's kept from PR#76
- databricks-lakebase/SKILL.md improvements (new references, JSON path table, pgvector)
- lakebase.md reference updates (Chat Persistence Pattern, onPluginsReady, naming conventions)
- model-serving.md changes (Model Serving apps actually improved)
Test plan
- python3 scripts/skills.py validate passes
Fixes: LKB-12991
This pull request and its description were written by Isaac.