
7.1.5 🔥


Released by @poissoncorp on 02 Feb 09:56 · 4 commits to v7.1 since this release · 4cd66c4

7.1.5 brings a major overhaul of the Vector Search API with cleaner, purpose-specific methods, adds Embeddings Generation Tasks for automatic vector embedding creation, and introduces new AI agent capabilities.

If you're new to AI in RavenDB, start here for an in-depth introduction.

PyPI link: https://pypi.org/project/ravendb/7.1.5/

Highlights

Embeddings Generation Tasks

New Embeddings Generation Tasks let you automatically generate vector embeddings from document content. Define which document properties to embed, configure chunking strategies, and RavenDB handles the rest: calling the AI provider, caching embeddings, and keeping them in sync as documents change.

Generated embeddings are stored in dedicated collections (@embeddings) and cached (@embeddings-cache) to avoid redundant provider calls. The task processes documents continuously as they change. You can learn more about embedding generation tasks here.

from ravendb.documents.operations.ai import (
    EmbeddingsGenerationConfiguration,
    EmbeddingPathConfiguration,
    ChunkingOptions,
    ChunkingMethod,
    AddEmbeddingsGenerationOperation,
)

# Configure chunking - how text is split before sending to the AI provider
chunking = ChunkingOptions(
    chunking_method=ChunkingMethod.PLAIN_TEXT_SPLIT_LINES,
    max_tokens_per_chunk=2048,
)

# Configure the Embeddings Generation task
config = EmbeddingsGenerationConfiguration(
    name="PostEmbeddings",
    identifier="post-embeddings",
    collection="Posts",
    connection_string_name="openai-embeddings",
    embeddings_path_configurations=[
        EmbeddingPathConfiguration(path="Content", chunking_options=chunking),
        EmbeddingPathConfiguration(path="Title", chunking_options=chunking),
    ],
    chunking_options_for_querying=chunking,  # Used when embedding search terms
)

# Deploy the task
result = store.maintenance.send(AddEmbeddingsGenerationOperation(config))
print(f"Task created with ID: {result.task_id}")

Vector Search API Rework

The Vector Search API has been improved with cleaner, more intuitive method families. There are no breaking changes. The new API provides dedicated methods for each embedding type and use case:

Core methods by embedding type:

# Float32 embeddings (default)
session.query(Product).vector_search("Embedding", vector)

# Int8 quantized embeddings
session.query(Product).vector_search_i8("Embedding", int8_vector)

# Binary (int1) quantized embeddings
session.query(Product).vector_search_i1("Embedding", binary_vector)

# Text-based search (uses embeddings generation task)
session.query(Product).vector_search_text(
    "Embedding",
    "semantic search query",
    embedding_generation_task_identifier="my-embeddings-task"
)

Index field methods for querying pre-indexed embeddings:

# Query against index embedding fields
session.query_index("ProductIndex").vector_search_with_field("EmbeddingField", vector)
session.query_index("ProductIndex").vector_search_with_i8_field("EmbeddingField", int8_vector)
session.query_index("ProductIndex").vector_search_with_i1_field("EmbeddingField", binary_vector)
session.query_index("ProductIndex").vector_search_with_text_field("EmbeddingField", "search text")

Document-based similarity search:

# Find similar documents based on another document's embeddings
session.query(Product).vector_search_text_for_document(
    "Embedding",
    document_id="products/123-A",
    embedding_generation_task_identifier="my-embeddings-task"
)

Base64-encoded vector support:

# For base64-encoded vectors stored in documents
session.query(Product).vector_search_with_base64("Embedding", base64_or_float_vector)
session.query(Product).vector_search_with_base64_i8("Embedding", base64_or_int8_vector)
session.query(Product).vector_search_with_base64_i1("Embedding", base64_or_binary_vector)

All methods support optional parameters: minimum_similarity, number_of_candidates, is_exact, and where applicable, target_quantization.
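
As an illustration, here is a minimal sketch combining these options on the float32 method. It assumes the parameters are exposed as keyword arguments; the values are arbitrary and the inline notes are assumptions rather than documented semantics, so check the client docs for defaults:

# Arbitrary values for illustration only
results = list(
    session.query(Product).vector_search(
        "Embedding",
        vector,
        minimum_similarity=0.75,   # discard matches below this similarity score
        number_of_candidates=32,   # candidate pool size for approximate search
        is_exact=False,            # False: approximate search; True: exact scan
    )
)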

Deprecated methods: The old vector_search_f32_i8, vector_search_f32_i1, vector_search_text_using_task, and similar methods are now deprecated. Use the new unified methods with appropriate parameters instead.
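
For migration, a hedged before/after sketch; the old signatures are elided, and the mapping for the quantized variants is an assumption based on the parameter list above:

# Deprecated:
# session.query(Product).vector_search_text_using_task(...)

# New unified equivalent (matches the text-search example above):
session.query(Product).vector_search_text(
    "Embedding",
    "semantic search query",
    embedding_generation_task_identifier="my-embeddings-task",
)

# Assumed replacement for vector_search_f32_i8 / vector_search_f32_i1:
# pass the float vector to vector_search() with target_quantization set
# to the desired type (enum name not shown here; see the client docs).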

AI Agent Query Tool Options

Query tools now support additional options for controlling LLM access and initial context:

from ravendb.documents.operations.ai.agents import (
    AiAgentConfiguration,
    AiAgentToolQuery,
    AiAgentToolQueryOptions,
)

config = AiAgentConfiguration(
    identifier="my-agent",
    name="Support Agent",
    system_prompt="You help customers with their orders.",
    query_tools=[
        AiAgentToolQuery(
            name="GetRecentOrders",
            description="Retrieves recent orders for a customer",
            query="from Orders where CustomerId = $customerId order by OrderDate desc limit 5",
            options=AiAgentToolQueryOptions(
                add_to_initial_context=True,  # Run query at conversation start
                allow_model_queries=False,    # Prevent LLM from invoking directly
            ),
        ),
    ],
)

Artificial Actions for AI Agent Conversations

You can now inject artificial tool call responses into AI agent conversations. This is useful for providing context from external systems, simulating tool responses during testing, or pre-populating conversation context:

from ravendb.documents.ai import AiAgentArtificialActionResponse

# Inject an artificial action with its response into the conversation
chat.add_artificial_action_with_response(
    AiAgentArtificialActionResponse(
        tool_id="weather_lookup_123",
        content='{"temperature": 72, "conditions": "sunny"}'
    )
)

# The model will see this as if a tool was called and responded
result = chat.run("response")

Additional Improvements

  • Agent disable support: Agents can now be disabled without deletion via disabled=True in AiAgentConfiguration
  • VectorQuantizer fix: to_int8() and to_int1() now return list[int] instead of bytes for proper JSON serialization (see the sketch after this list)
  • AI connection settings: Added embeddings_max_concurrent_batches parameter to all AI provider settings for controlling batch parallelism
  • GenAI Configuration: Added enable_tracing for debugging AI interactions and expiration_in_sec for cache control
  • UpdateGenAiOperation: Now supports a reset parameter to reprocess all documents from scratch
  • Session improvements: Revisions loaded into session now have ignore_changes=True to prevent accidental change tracking; SessionInfo exposes database_name property
  • ChunkingOptions validation: Added validation for overlap tokens, which are only supported with paragraph-based chunking methods
  • Python 3.13/3.14 compatibility: Fixed compatibility issues with newer Python versions
  • JSONL stream parsing: Added max_empty_lines_in_jsonl_stream convention for handling empty lines in streaming responses (also shown in the sketch after this list)
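
For illustration, a minimal sketch of two of the items above. The VectorQuantizer import path and calling convention are assumptions; check the client source for the actual locations:

# Hypothetical import path: VectorQuantizer's real module may differ
from ravendb.documents.queries.vector_search import VectorQuantizer

# to_int8()/to_int1() now return list[int] (previously bytes),
# so the result serializes cleanly to JSON
quantized = VectorQuantizer.to_int8([0.1, -0.4, 0.9])
assert isinstance(quantized, list)

# Conventions are set on the store before initialize();
# the attribute name comes from the list above, the value is arbitrary
from ravendb import DocumentStore

store = DocumentStore(urls=["http://localhost:8080"], database="Demo")
store.conventions.max_empty_lines_in_jsonl_stream = 2
store.initialize()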


Full Changelog: 7.1.4...7.1.5