
7.1.5 🔥


Released by @poissoncorp on 02 Feb 09:56 · 4 commits to v7.1 since this release · 4cd66c4

7.1.5 brings a major overhaul of the Vector Search API with cleaner, purpose-specific methods, adds Embeddings Generation Tasks for automatic vector embedding creation, and introduces new AI agent capabilities.

If you're new to AI in RavenDB, start here for an in-depth introduction.

PyPI link: https://pypi.org/project/ravendb/7.1.5/

Highlights

Embeddings Generation Tasks

New Embeddings Generation Tasks let you automatically generate vector embeddings from document content. Define which document properties to embed, configure chunking strategies, and RavenDB handles the rest: calling the AI provider, caching embeddings, and keeping them in sync as documents change.

Generated embeddings are stored in dedicated collections (@embeddings) and cached (@embeddings-cache) to avoid redundant provider calls. The task processes documents continuously as they change. You can learn more about embedding generation tasks here.

from ravendb.documents.operations.ai import (
    EmbeddingsGenerationConfiguration,
    EmbeddingPathConfiguration,
    ChunkingOptions,
    ChunkingMethod,
    AddEmbeddingsGenerationOperation,
)

# Configure chunking - how text is split before sending to the AI provider
chunking = ChunkingOptions(
    chunking_method=ChunkingMethod.PLAIN_TEXT_SPLIT_LINES,
    max_tokens_per_chunk=2048,
)

# Configure the Embeddings Generation task
config = EmbeddingsGenerationConfiguration(
    name="PostEmbeddings",
    identifier="post-embeddings",
    collection="Posts",
    connection_string_name="openai-embeddings",
    embeddings_path_configurations=[
        EmbeddingPathConfiguration(path="Content", chunking_options=chunking),
        EmbeddingPathConfiguration(path="Title", chunking_options=chunking),
    ],
    chunking_options_for_querying=chunking,  # Used when embedding search terms
)

# Deploy the task
result = store.maintenance.send(AddEmbeddingsGenerationOperation(config))
print(f"Task created with ID: {result.task_id}")

Vector Search API Rework

The Vector Search API has been improved with cleaner, more intuitive method families. There are no breaking changes. The new API provides dedicated methods for each embedding type and use case:

Core methods by embedding type:

# Float32 embeddings (default)
session.query(Product).vector_search("Embedding", vector)

# Int8 quantized embeddings
session.query(Product).vector_search_i8("Embedding", int8_vector)

# Binary (int1) quantized embeddings
session.query(Product).vector_search_i1("Embedding", binary_vector)

# Text-based search (uses embeddings generation task)
session.query(Product).vector_search_text(
    "Embedding",
    "semantic search query",
    embedding_generation_task_identifier="my-embeddings-task"
)

Index field methods for querying pre-indexed embeddings:

# Query against index embedding fields
session.query_index("ProductIndex").vector_search_with_field("EmbeddingField", vector)
session.query_index("ProductIndex").vector_search_with_i8_field("EmbeddingField", int8_vector)
session.query_index("ProductIndex").vector_search_with_i1_field("EmbeddingField", binary_vector)
session.query_index("ProductIndex").vector_search_with_text_field("EmbeddingField", "search text")

Document-based similarity search:

# Find similar documents based on another document's embeddings
session.query(Product).vector_search_text_for_document(
    "Embedding",
    document_id="products/123-A",
    embedding_generation_task_identifier="my-embeddings-task"
)

Base64-encoded vector support:

# For base64-encoded vectors stored in documents
session.query(Product).vector_search_with_base64("Embedding", base64_or_float_vector)
session.query(Product).vector_search_with_base64_i8("Embedding", base64_or_int8_vector)
session.query(Product).vector_search_with_base64_i1("Embedding", base64_or_binary_vector)

All methods support optional parameters: minimum_similarity, number_of_candidates, is_exact, and where applicable, target_quantization.
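
As an illustration, here is a minimal sketch combining these options on the float32 method. It assumes the parameters are exposed as keyword arguments; the values are arbitrary and the inline notes are assumptions rather than documented semantics, so check the client docs for defaults:

# Arbitrary values for illustration only
results = list(
    session.query(Product).vector_search(
        "Embedding",
        vector,
        minimum_similarity=0.75,   # discard matches below this similarity score
        number_of_candidates=32,   # candidate pool size for approximate search
        is_exact=False,            # False: approximate search; True: exact scan
    )
)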

Deprecated methods: The old vector_search_f32_i8, vector_search_f32_i1, vector_search_text_using_task, and similar methods are now deprecated. Use the new unified methods with appropriate parameters instead.
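
For migration, a hedged before/after sketch; the old signatures are elided, and the mapping for the quantized variants is an assumption based on the parameter list above:

# Deprecated:
# session.query(Product).vector_search_text_using_task(...)

# New unified equivalent (matches the text-search example above):
session.query(Product).vector_search_text(
    "Embedding",
    "semantic search query",
    embedding_generation_task_identifier="my-embeddings-task",
)

# Assumed replacement for vector_search_f32_i8 / vector_search_f32_i1:
# pass the float vector to vector_search() with target_quantization set
# to the desired type (enum name not shown here; see the client docs).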

AI Agent Query Tool Options

Query tools now support additional options for controlling LLM access and initial context:

from ravendb.documents.operations.ai.agents import (
    AiAgentConfiguration,
    AiAgentToolQuery,
    AiAgentToolQueryOptions,
)

config = AiAgentConfiguration(
    identifier="my-agent",
    name="Support Agent",
    system_prompt="You help customers with their orders.",
    query_tools=[
        AiAgentToolQuery(
            name="GetRecentOrders",
            description="Retrieves recent orders for a customer",
            query="from Orders where CustomerId = $customerId order by OrderDate desc limit 5",
            options=AiAgentToolQueryOptions(
                add_to_initial_context=True,  # Run query at conversation start
                allow_model_queries=False,    # Prevent LLM from invoking directly
            ),
        ),
    ],
)

Artificial Actions for AI Agent Conversations

You can now inject artificial tool call responses into AI agent conversations. This is useful for providing context from external systems, simulating tool responses during testing, or pre-populating conversation context:

from ravendb.documents.ai import AiAgentArtificialActionResponse

# Inject an artificial action with its response into the conversation
chat.add_artificial_action_with_response(
    AiAgentArtificialActionResponse(
        tool_id="weather_lookup_123",
        content='{"temperature": 72, "conditions": "sunny"}'
    )
)

# The model will see this as if a tool was called and responded
result = chat.run("response")

Additional Improvements

  • Agent disable support: Agents can now be disabled without deletion via disabled=True in AiAgentConfiguration
  • VectorQuantizer fix: to_int8() and to_int1() now return list[int] instead of bytes for proper JSON serialization (see the sketch after this list)
  • AI connection settings: Added embeddings_max_concurrent_batches parameter to all AI provider settings for controlling batch parallelism
  • GenAI Configuration: Added enable_tracing for debugging AI interactions and expiration_in_sec for cache control
  • UpdateGenAiOperation: Now supports a reset parameter to reprocess all documents from scratch
  • Session improvements: Revisions loaded into session now have ignore_changes=True to prevent accidental change tracking; SessionInfo exposes database_name property
  • ChunkingOptions validation: Added validation for overlap tokens, which are only supported with paragraph-based chunking methods
  • Python 3.13/3.14 compatibility: Fixed compatibility issues with newer Python versions
  • JSONL stream parsing: Added max_empty_lines_in_jsonl_stream convention for handling empty lines in streaming responses (also shown in the sketch after this list)
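
For illustration, a minimal sketch of two of the items above. The VectorQuantizer import path and calling convention are assumptions; check the client source for the actual locations:

# Hypothetical import path: VectorQuantizer's real module may differ
from ravendb.documents.queries.vector_search import VectorQuantizer

# to_int8()/to_int1() now return list[int] (previously bytes),
# so the result serializes cleanly to JSON
quantized = VectorQuantizer.to_int8([0.1, -0.4, 0.9])
assert isinstance(quantized, list)

# Conventions are set on the store before initialize();
# the attribute name comes from the list above, the value is arbitrary
from ravendb import DocumentStore

store = DocumentStore(urls=["http://localhost:8080"], database="Demo")
store.conventions.max_empty_lines_in_jsonl_stream = 2
store.initialize()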


Full Changelog: 7.1.4...7.1.5