7.1.5 🔥
7.1.5 brings a major overhaul of the Vector Search API with cleaner and more functional methods, adds Embeddings Generation Tasks for automatic vector embedding creation, and introduces new AI agent capabilities.
If you're new to AI in RavenDB, start here for an in-depth introduction.
PyPI link: https://pypi.org/project/ravendb/7.1.5/
Highlights
Embeddings Generation Tasks
New Embeddings Generation Tasks let you automatically generate vector embeddings from document content. Define which document properties to embed, configure chunking strategies, and RavenDB handles the rest - calling the AI provider, caching embeddings, and keeping them in sync as documents change.
Generated embeddings are stored in dedicated collections (@embeddings) and cached (@embeddings-cache) to avoid redundant provider calls. The task processes documents continuously as they change. You can learn more about embedding generation tasks here.
```python
from ravendb.documents.operations.ai import (
    EmbeddingsGenerationConfiguration,
    EmbeddingPathConfiguration,
    ChunkingOptions,
    ChunkingMethod,
    AddEmbeddingsGenerationOperation,
)

# Configure chunking - how text is split before sending to the AI provider
chunking = ChunkingOptions(
    chunking_method=ChunkingMethod.PLAIN_TEXT_SPLIT_LINES,
    max_tokens_per_chunk=2048,
)

# Configure the Embeddings Generation task
config = EmbeddingsGenerationConfiguration(
    name="PostEmbeddings",
    identifier="post-embeddings",
    collection="Posts",
    connection_string_name="openai-embeddings",
    embeddings_path_configurations=[
        EmbeddingPathConfiguration(path="Content", chunking_options=chunking),
        EmbeddingPathConfiguration(path="Title", chunking_options=chunking),
    ],
    chunking_options_for_querying=chunking,  # Used when embedding search terms
)

# Deploy the task
result = store.maintenance.send(AddEmbeddingsGenerationOperation(config))
print(f"Task created with ID: {result.task_id}")
```
Vector Search API Rework
The Vector Search API has been improved with cleaner, more intuitive method families. There are no breaking changes. The new API provides dedicated methods for each embedding type and use case:
Core methods by embedding type:
```python
# Float32 embeddings (default)
session.query(Product).vector_search("Embedding", vector)

# Int8 quantized embeddings
session.query(Product).vector_search_i8("Embedding", int8_vector)

# Binary (int1) quantized embeddings
session.query(Product).vector_search_i1("Embedding", binary_vector)

# Text-based search (uses an embeddings generation task)
session.query(Product).vector_search_text(
    "Embedding",
    "semantic search query",
    embedding_generation_task_identifier="my-embeddings-task",
)
```
Index field methods for querying pre-indexed embeddings:
```python
# Query against index embedding fields
session.query_index("ProductIndex").vector_search_with_field("EmbeddingField", vector)
session.query_index("ProductIndex").vector_search_with_i8_field("EmbeddingField", int8_vector)
session.query_index("ProductIndex").vector_search_with_i1_field("EmbeddingField", binary_vector)
session.query_index("ProductIndex").vector_search_with_text_field("EmbeddingField", "search text")
```
Document-based similarity search:
```python
# Find similar documents based on another document's embeddings
session.query(Product).vector_search_text_for_document(
    "Embedding",
    document_id="products/123-A",
    embedding_generation_task_identifier="my-embeddings-task",
)
```
Base64-encoded vector support:
```python
# For base64-encoded vectors stored in documents
session.query(Product).vector_search_with_base64("Embedding", base64_or_float_vector)
session.query(Product).vector_search_with_base64_i8("Embedding", base64_or_int8_vector)
session.query(Product).vector_search_with_base64_i1("Embedding", base64_or_binary_vector)
```
All methods support optional parameters: `minimum_similarity`, `number_of_candidates`, `is_exact`, and, where applicable, `target_quantization`.
Deprecated methods: The old `vector_search_f32_i8`, `vector_search_f32_i1`, `vector_search_text_using_task`, and similar methods are now deprecated. Use the new unified methods with appropriate parameters instead.
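As a migration sketch (a hypothetical helper, not part of the client library; the deprecated call shown in the comment is mapped to the unified method based on the examples above), a deprecated text-search call can be rewritten like this:

```python
# Hypothetical migration helper - assumes `session` is an open RavenDB
# document session and `product_cls` is a mapped document class.
def find_similar_products(session, product_cls, text):
    # Before 7.1.5 (now deprecated):
    #   session.query(product_cls).vector_search_text_using_task(
    #       "Embedding", text, "my-embeddings-task")
    # From 7.1.5 on, use the unified text-search method:
    query = session.query(product_cls).vector_search_text(
        "Embedding",
        text,
        embedding_generation_task_identifier="my-embeddings-task",
        minimum_similarity=0.75,  # optional: drop weak matches
    )
    return list(query)
```

The deprecated variants keep working for now, so migration can happen incrementally.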
AI Agent Query Tool Options
Query tools now support additional options for controlling LLM access and initial context:
```python
from ravendb.documents.operations.ai.agents import (
    AiAgentConfiguration,
    AiAgentToolQuery,
    AiAgentToolQueryOptions,
)

config = AiAgentConfiguration(
    identifier="my-agent",
    name="Support Agent",
    system_prompt="You help customers with their orders.",
    query_tools=[
        AiAgentToolQuery(
            name="GetRecentOrders",
            description="Retrieves recent orders for a customer",
            query="from Orders where CustomerId = $customerId order by OrderDate desc limit 5",
            options=AiAgentToolQueryOptions(
                add_to_initial_context=True,  # Run query at conversation start
                allow_model_queries=False,  # Prevent the LLM from invoking it directly
            ),
        ),
    ],
)
```
Artificial Actions for AI Agent Conversations
You can now inject artificial tool call responses into AI agent conversations. This is useful for providing context from external systems, simulating tool responses during testing, or pre-populating conversation context:
```python
from ravendb.documents.ai import AiAgentArtificialActionResponse

# Inject an artificial tool call and its response into the conversation
chat.add_artificial_action_with_response(
    AiAgentArtificialActionResponse(
        tool_id="weather_lookup_123",
        content='{"temperature": 72, "conditions": "sunny"}',
    )
)

# The model will see this as if a tool was called and responded
result = chat.run("response")
```
Additional Improvements
- Agent disable support: Agents can now be disabled without deletion via `disabled=True` in `AiAgentConfiguration`
- VectorQuantizer fix: `to_int8()` and `to_int1()` now return `list[int]` instead of `bytes` for proper JSON serialization
- AI connection settings: Added an `embeddings_max_concurrent_batches` parameter to all AI provider settings for controlling batch parallelism
- GenAI configuration: Added `enable_tracing` for debugging AI interactions and `expiration_in_sec` for cache control
- UpdateGenAiOperation: Now supports a `reset` parameter to reprocess all documents from scratch
- Session improvements: Revisions loaded into a session now have `ignore_changes=True` to prevent accidental change tracking; `SessionInfo` exposes a `database_name` property
- ChunkingOptions validation: Overlap tokens are now validated - they are only supported with paragraph-based chunking methods
- Python 3.13/3.14 compatibility: Fixed compatibility issues with newer Python versions
- JSONL stream parsing: Added a `max_empty_lines_in_jsonl_stream` convention for handling empty lines in streaming responses
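The VectorQuantizer fix matters because `bytes` is not JSON-serializable. A minimal pure-Python sketch (a simplified stand-in, not the client's actual quantizer) shows why returning `list[int]` works where `bytes` fails:

```python
import json

def to_int8_sketch(floats):
    # Simplified int8 scalar quantization: scale each component into
    # [-127, 127] relative to the largest magnitude. This is only an
    # illustration, not the client's real VectorQuantizer implementation.
    scale = max(abs(x) for x in floats) or 1.0
    return [round(x / scale * 127) for x in floats]

quantized = to_int8_sketch([0.1, -0.5, 1.0])
print(json.dumps(quantized))  # list[int] serializes cleanly

try:
    json.dumps(bytes([13, 192, 127]))  # bytes raises TypeError
except TypeError:
    print("bytes is not JSON-serializable")
```

Returning plain Python ints lets quantized vectors pass straight through `json.dumps` when the client builds query payloads.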
PRs
- RDBC-999 VectorQuantizer should return list[int] instead of bytes for proper serialization by @poissoncorp in #262
- RDBC-1012 Implement AiAgentToolQueryOptions by @poissoncorp in #263
- RDBC-1013 Rework Python VectorSearch API by @poissoncorp in #264
- RDBC-978 Python 7.1.4->7.1.5 Sync by @poissoncorp in #265
Full Changelog: 7.1.4...7.1.5