Support tag: query shorthand in search #535

phernandez · 2026-02-01T03:02:12Z

Summary

map tag:<value> query prefix to tags metadata filter so tag searches work again
split multiple tags by commas/whitespace and drop text search when tag filter is applied
add positive and negative tests covering tag shorthand and multi-tag behavior across backends

Testing

uv run pytest tests/services/test_search_service.py -k "tag_prefix" -q

Signed-off-by: phernandez <paul@basicmachines.co>

github-actions · 2026-02-01T03:02:27Z

Claude finished @phernandez's task in 3m 37s —— View job

Code Review: Support tag: query shorthand in search

I've completed a thorough review of this PR against the Basic Memory team checklist.

Summary

This PR implements tag:<value> query shorthand for tag filtering in search. The implementation is functionally sound and well-tested for the happy path. However, it needs improvements to meet the project's literate programming standards and documentation requirements.

Files changed:

search_service.py (+11 lines) - Core implementation
test_search_service.py (+75/-9 lines) - Test coverage

Team Checklist Results

✅ Code Quality & Standards

✅ Python 3.12+ type annotations maintained
✅ 100-character line length limit
✅ Follows async patterns
✅ Proper naming conventions
✅ No commented-out code

❌ Missing: Literate programming comments

The new code at search_service.py:84-91 lacks decision point comments required by CLAUDE.md. Per the guidelines, conditionals that materially change behavior need:

Trigger: what condition causes this branch
Why: the rationale
Outcome: what changes downstream

Add comments here →

⚠️ Testing & Documentation

✅ Unit tests added (3 tests covering positive, negative, multi-tag cases)
✅ Test coverage appears maintained
❌ Missing: User-facing documentation - The tag: syntax isn't documented anywhere
⚠️ Missing edge case tests:
- Empty tag value: tag:
- Uppercase prefix: TAG:tier1
- Mixed delimiters: tag:tier1, alpha beta
- Special characters in tags: tag:my-tag_v2.1

Add edge case tests →

✅ Basic Memory Architecture

✅ Uses existing metadata filter infrastructure
✅ Preserves local-first architecture
✅ Knowledge graph consistency maintained
✅ No database schema changes needed

✅ Security & Performance

✅ Input validation via safe regex
✅ No hardcoded secrets
✅ Error handling inherited from existing search
⚠️ Minor regex overhead on every text query (likely negligible)

✅ Compatibility

✅ Windows path compatibility not applicable
✅ No emoji/unicode in console output

Detailed Issues & Questions

1. Input Mutation (`search_service.py:90-91`)

The code mutates the input query object:

query.tags = tags
query.text = None

Question: Is this mutation intentional? If the caller reuses the SearchQuery object, they'll get unexpected behavior on subsequent calls.

2. Text Search Dropped (`search_service.py:91`)

Setting query.text = None completely removes text search when tag filter is applied.

Questions:

Why is text search dropped? The PR description states this happens but doesn't explain the rationale
Should combined queries work? e.g., tag:tier1 important to find tagged entities also containing "important"
This should be documented in code comments explaining the design decision

3. Delimiter Ambiguity (`search_service.py:87`)

The regex r"[,\s]+" treats both commas AND whitespace as tag separators:

tag:tier1,alpha → ["tier1", "alpha"]
tag:tier1 alpha → ["tier1", "alpha"]
tag:tier1, alpha → ["tier1", "alpha"]

Question: Is whitespace as a separator intentional or just a side effect? Should be documented.

4. Case Sensitivity (`search_service.py:86`)

The prefix check is case-insensitive (text.lower().startswith("tag:")), but:

Question: Are tag values case-sensitive? Does tag:Tier1 match an entity with tier1? Tests don't verify this.

5. Test Assertions Could Be Stronger

Test at line 534 verifies results but doesn't verify the query transformation:

results = await search_service.search(SearchQuery(text="tag:tier1"))
assert any(r.permalink == tagged_entity.permalink for r in results)

Suggestion: Also assert that query.tags == ["tier1"] and query.text is None to verify the transformation actually occurred.

Recommendations

🔴 High Priority (Should address before merge)

Add decision point comments per CLAUDE.md literate programming style - explains trigger, why, outcome
Document the feature - add to search service docstring, MCP tool docs, or user guide
Explain text search behavior - add comment explaining why query.text = None

🟡 Medium Priority (Recommended)

Add edge case tests - empty values, uppercase, special chars, mixed delimiters
Clarify case sensitivity - document or test tag value case handling
Verify test coverage - run just coverage to confirm 100% coverage maintained

🟢 Low Priority (Nice to have)

Document delimiter behavior - clarify that both , and whitespace work as separators
Consider combined queries - decide if tag:tier1 searchterm should work
Test assertion improvements - verify query transformation, not just results

Final Assessment

Functionality: ✅ Works as described
Code Quality: ⚠️ Missing literate programming comments (CLAUDE.md requirement)
Testing: ⚠️ Happy path covered, edge cases missing
Documentation: ❌ Feature not documented for users
Architecture: ✅ Follows project patterns

Recommendation: The core implementation is solid and functional. To meet Basic Memory's standards, address the high-priority items (literate programming comments and documentation) before merge. The edge case tests and other medium-priority items would strengthen the implementation but aren't blocking.

Signed-off-by: phernandez <paul@basicmachines.co>

github-actions · 2026-02-01T03:17:19Z

Claude finished @phernandez's task in 2m 18s —— View job

Code Review: Support tag: query shorthand in search

I've completed a thorough review of this PR against the Basic Memory team checklist.

Summary

This PR implements tag:<value> query shorthand for tag filtering in search. The implementation is functionally sound with good test coverage for the happy path. However, it needs improvements to meet the project's literate programming standards and has a few design questions that should be addressed.

Files changed:

search_service.py:84-91 (+11 lines) - Core tag: prefix parsing
postgres_search_repository.py:335 (+1/-1 line) - Postgres CAST syntax fix
test_search_service.py (+75/-9 lines) - Test coverage

Team Checklist Results

✅ Code Quality & Standards

✅ Python 3.12+ type annotations maintained
✅ 100-character line length limit
✅ Follows async patterns
✅ Proper naming conventions
✅ No commented-out code
✅ Import order correct (added re module)

❌ Missing: Literate programming comments

The new code at search_service.py:84-91 lacks decision point comments required by CLAUDE.md. Per the guidelines, conditionals that materially change behavior need:

Trigger: what condition causes this branch
Why: the rationale
Outcome: what changes downstream

Example of what's needed:

# Support tag:<tag> shorthand by mapping to tags filter
if query.text:
    text = query.text.strip()
    # Trigger: query starts with "tag:" prefix (case-insensitive)
    # Why: provide convenient shorthand for tag filtering without verbose metadata syntax
    # Outcome: transforms to tags filter, drops text search to avoid empty results
    if text.lower().startswith("tag:"):
        # Split on both commas and whitespace to support: tag:tier1,alpha or tag:tier1 alpha
        tag_values = re.split(r"[,\s]+", text[4:].strip())
        tags = [t for t in tag_values if t]
        if tags:
            query.tags = tags
            # Outcome: text search is removed since tag prefix consumed entire query
            query.text = None

Add literate programming comments →

⚠️ Testing & Documentation

Testing:

✅ Unit tests added (3 tests covering positive, negative, multi-tag cases)
✅ Tests use descriptive names and docstrings
✅ Test coverage appears maintained at 100%
⚠️ Missing edge case tests:
- Empty tag value: tag: (what happens?)
- Uppercase prefix: TAG:tier1 (works due to .lower() but not verified)
- Mixed delimiters: tag:tier1, alpha beta (regex handles but not tested)
- Special characters in tags: tag:my-tag_v2.1 (common in real use)
- Whitespace variations: tag: tier1, tag:tier1 (strip handles but not tested)

Documentation:

❌ Missing: User-facing documentation - The tag: syntax isn't documented anywhere users would find it:
- Not in MCP tool docstrings (where search is exposed to LLMs)
- Not in CLAUDE.md MCP capabilities section
- Not in service-level docstring for SearchService.search()

Suggestions:

Add to SearchService.search() docstring:

"""Search across all indexed content.

Supports three modes:
1. Exact permalink: finds direct matches for a specific path
2. Pattern match: handles * wildcards in paths
3. Text search: full-text search across title/content
   - Supports tag: shorthand (e.g., "tag:tier1" or "tag:tier1,alpha")

Update CLAUDE.md's "Search & Discovery" section to mention tag: syntax

Add edge case tests →

✅ Basic Memory Architecture

✅ Uses existing metadata filter infrastructure (no schema changes)
✅ Preserves local-first architecture
✅ Knowledge graph consistency maintained
✅ Follows service layer pattern
✅ Postgres fix (CAST syntax) maintains cross-DB compatibility

✅ Security & Performance

✅ Input validation via safe regex (no eval/exec)
✅ No hardcoded secrets
✅ Error handling inherited from existing search
✅ No SQL injection risk (uses existing parameterized queries)
⚠️ Minor regex overhead on every text query (likely negligible but runs on every search)

✅ Compatibility

✅ Windows path compatibility not applicable
✅ No emoji/unicode in console output
✅ Cross-database support maintained (works on both SQLite and Postgres)

Detailed Issues & Questions

1. Input Mutation (`search_service.py:90-91`)

The code mutates the input query object:

query.tags = tags
query.text = None

Issue: If the caller reuses the SearchQuery object, they'll get unexpected behavior on subsequent calls.

Question: Is this mutation intentional? SearchQuery is a Pydantic schema - mutating input objects can lead to surprising behavior.

Recommendation: Either:

Document this mutation behavior clearly in comments/docstring
OR create a new query object (safer): query = query.model_copy(update={'tags': tags, 'text': None})

2. Text Search Dropped (`search_service.py:91`)

Setting query.text = None completely removes text search when tag filter is applied.

Questions:

Why is text search dropped? The PR description states this happens but doesn't explain the rationale
Should combined queries work? e.g., tag:tier1 important to find tagged entities also containing "important"
This design decision should be documented in code comments

Current behavior:

tag:tier1 important → searches for tags ["tier1", "important"] (treats "important" as a tag, not search text)

Possible alternative behavior:

tag:tier1 important → searches for tags=["tier1"] AND text="important"

Recommendation: Add a comment explaining why text is dropped and whether this is the desired long-term behavior.

3. Delimiter Ambiguity (`search_service.py:87`)

The regex r"[,\s]+" treats both commas AND whitespace as tag separators:

tag:tier1,alpha → ["tier1", "alpha"] ✅
tag:tier1 alpha → ["tier1", "alpha"] ❓
tag:tier1, alpha → ["tier1", "alpha"] ✅

Questions:

Is whitespace as a separator intentional or just a side effect?
Should tag:tier1 alpha be one tag with space ("tier1 alpha") or two tags?
If intentional, should be documented

Recommendation: Add comment documenting delimiter behavior:

# Split on both commas and whitespace: "tag:tier1,alpha" or "tag:tier1 alpha" both work
tag_values = re.split(r"[,\s]+", text[4:].strip())

4. Case Sensitivity (`search_service.py:86`)

The prefix check is case-insensitive (text.lower().startswith("tag:")):

tag:tier1 ✅
TAG:tier1 ✅
Tag:tier1 ✅

Questions:

Are tag values case-sensitive?
Does tag:Tier1 match an entity with tier1 in frontmatter?
Should tag values be normalized to lowercase?

Recommendation: Add test verifying case handling for both prefix and tag values.

5. Test Assertions Could Be Stronger

Test at line 534 verifies results but doesn't verify the query transformation:

results = await search_service.search(SearchQuery(text="tag:tier1"))
assert any(r.permalink == tagged_entity.permalink for r in results)

Suggestion: Also assert that the transformation actually occurred:

query = SearchQuery(text="tag:tier1")
results = await search_service.search(query)
assert query.tags == ["tier1"], "tag: prefix should set tags filter"
assert query.text is None, "tag: prefix should clear text search"
assert any(r.permalink == tagged_entity.permalink for r in results)

6. Postgres CAST Fix (`postgres_search_repository.py:335`)

Changed :{tag_param}::jsonb to CAST(:{tag_param} AS jsonb).

Question: Was the :: syntax causing issues with SQLAlchemy parameter binding? This is a good fix for compatibility, but understanding why is helpful.

Note: The CAST() syntax is ANSI SQL standard and more portable than Postgres-specific :: casting.

Recommendations

🔴 High Priority (Should address before merge)

Add decision point comments per CLAUDE.md literate programming style
Document the feature - add to search service docstring and CLAUDE.md
Explain text search behavior - add comment explaining why query.text = None
Document delimiter behavior - clarify that both , and whitespace work as separators

🟡 Medium Priority (Recommended)

Add edge case tests - empty values, uppercase, special chars, mixed delimiters
Clarify case sensitivity - document or test tag value case handling
Consider query mutation - document or avoid mutating input object
Verify test coverage - run just coverage to confirm 100% maintained

🟢 Low Priority (Nice to have)

Test assertion improvements - verify query transformation, not just results
Consider combined queries - decide if tag:tier1 searchterm should work as tag filter + text search

Final Assessment

Functionality: ✅ Works as described
Code Quality: ⚠️ Missing literate programming comments (CLAUDE.md requirement)
Testing: ⚠️ Happy path covered, edge cases missing
Documentation: ❌ Feature not documented for users
Architecture: ✅ Follows project patterns
Security: ✅ No issues identified
Performance: ✅ Negligible impact

Recommendation: The core implementation is solid and functional. To meet Basic Memory's standards, address the high-priority items (literate programming comments and documentation) before merge. The medium-priority items would strengthen the implementation but aren't blocking.

The implementation follows existing patterns well and the Postgres fix is a good catch. With the additions above, this will be a clean, well-documented feature.

| Branch: fix/tag-search-shorthand

fix: support tag: syntax in search

01c9b32

Signed-off-by: phernandez <paul@basicmachines.co>

fix: support tag search on postgres

78a0396

Signed-off-by: phernandez <paul@basicmachines.co>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support tag: query shorthand in search #535

Support tag: query shorthand in search #535

Uh oh!

phernandez commented Feb 1, 2026

Uh oh!

github-actions bot commented Feb 1, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Feb 1, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Support tag: query shorthand in search #535

Are you sure you want to change the base?

Support tag: query shorthand in search #535

Uh oh!

Conversation

phernandez commented Feb 1, 2026

Summary

Testing

Uh oh!

github-actions bot commented Feb 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: Support tag: query shorthand in search

Summary

Team Checklist Results

✅ Code Quality & Standards

⚠️ Testing & Documentation

✅ Basic Memory Architecture

✅ Security & Performance

✅ Compatibility

Detailed Issues & Questions

1. Input Mutation (search_service.py:90-91)

2. Text Search Dropped (search_service.py:91)

3. Delimiter Ambiguity (search_service.py:87)

4. Case Sensitivity (search_service.py:86)

5. Test Assertions Could Be Stronger

Recommendations

🔴 High Priority (Should address before merge)

🟡 Medium Priority (Recommended)

🟢 Low Priority (Nice to have)

Final Assessment

Uh oh!

github-actions bot commented Feb 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: Support tag: query shorthand in search

Summary

Team Checklist Results

✅ Code Quality & Standards

⚠️ Testing & Documentation

✅ Basic Memory Architecture

✅ Security & Performance

✅ Compatibility

Detailed Issues & Questions

1. Input Mutation (search_service.py:90-91)

2. Text Search Dropped (search_service.py:91)

3. Delimiter Ambiguity (search_service.py:87)

4. Case Sensitivity (search_service.py:86)

5. Test Assertions Could Be Stronger

6. Postgres CAST Fix (postgres_search_repository.py:335)

Recommendations

🔴 High Priority (Should address before merge)

🟡 Medium Priority (Recommended)

🟢 Low Priority (Nice to have)

Final Assessment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions bot commented Feb 1, 2026 •

edited

Loading

1. Input Mutation (`search_service.py:90-91`)

2. Text Search Dropped (`search_service.py:91`)

3. Delimiter Ambiguity (`search_service.py:87`)

4. Case Sensitivity (`search_service.py:86`)

github-actions bot commented Feb 1, 2026 •

edited

Loading

1. Input Mutation (`search_service.py:90-91`)

2. Text Search Dropped (`search_service.py:91`)

3. Delimiter Ambiguity (`search_service.py:87`)

4. Case Sensitivity (`search_service.py:86`)

6. Postgres CAST Fix (`postgres_search_repository.py:335`)