Merged
9 changes: 8 additions & 1 deletion docs/docs.json
@@ -299,7 +299,14 @@
"group": "Tutorials",
"pages": [
"tutorials/index",
"tutorials/search/index",
{
"group": "Search",
"expanded": true,
"pages": [
"tutorials/search/index",
"tutorials/search/multivector-needle-in-a-haystack"
]
},
{
"group": "RAG & Agents",
"expanded": true,
43 changes: 19 additions & 24 deletions docs/search/multivector-search.mdx
@@ -1,27 +1,27 @@
---
title: "Multivector Search"
sidebarTitle: Multivector search
description: Learn how to perform multivector search in LanceDB to handle multiple vector embeddings per document, ideal for late interaction models like ColBERT and ColPaLi.
description: Learn how to perform multivector search in LanceDB to handle multiple vector embeddings per document, which is ideal for late-interaction models like ColBERT and ColPaLi.
icon: "braille"
---

LanceDB's multivector support enables you to store and search multiple vector embeddings for a single item.
LanceDB's multivector support enables you to store and search multiple vector embeddings for a single item.

This capability is particularly valuable when working with late interaction models like ColBERT and ColPaLi that generate multiple embeddings per document.
This capability is particularly valuable when working with late-interaction models like ColBERT and ColPaLi, which generate multiple embeddings per document.

In this tutorial, you'll create a table with multiple vector embeddings per document and learn how to perform multivector search. [For all the code - open in Colab](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/saas_examples/python_notebook/Multivector_on_LanceDB_Cloud.ipynb)
In this tutorial, you'll create a table with multiple vector embeddings per document and learn how to perform multivector search. [For the full code, open the Colab notebook](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/saas_examples/python_notebook/Multivector_on_LanceDB_Cloud.ipynb).

## Multivector Support

Each item in your dataset can have a column containing multiple vectors, which LanceDB can efficiently index and search. When performing a search, you can query using either a single vector embedding or multiple vector embeddings.
Each item in your dataset can have a column containing multiple vectors, which LanceDB can efficiently index and search. When performing a search, you can query with either a single vector embedding or multiple vector embeddings.

<Warning>
Currently, only the `cosine` metric is supported for multivector search. The vector value type can be `float16`, `float32`, or `float64`.
</Warning>

## Computing Similarity

MaxSim (Maximum Similarity) is a key concept in late interaction models that:
MaxSim (Maximum Similarity) is a key concept in late-interaction models that:

- Computes the maximum similarity between each query embedding and all document embeddings
- Sums these maximum similarities to get the final relevance score
@@ -33,24 +33,19 @@
$$
\text{MaxSim}(Q, D) = \sum_{i=1}^{|Q|} \max_{j=1}^{|D|} \text{sim}(q_i, d_j)
$$

Where $sim$ is the similarity function (e.g. cosine similarity).
Where $sim$ is the similarity function (e.g., cosine similarity).

$$
Q = \{q_1, q_2, ..., q_{|Q|}\}
$$

$Q$ represents the query vector, and $D = \{d_1, d_2, ..., d_{|D|}\}$ represents the document vectors.

<Warning>
For now, you should use only the `cosine` metric for multivector search.
The vector value type can be `float16`, `float32` or `float64`.
</Warning>
$Q$ represents the query embeddings, and $D = \{d_1, d_2, ..., d_{|D|}\}$ represents the document embeddings.
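
To make the formula concrete, here is a minimal NumPy sketch of MaxSim. The function name and the toy data are illustrative only, not part of the LanceDB API:

```python
import numpy as np

def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Sum, over query vectors, of the max cosine similarity
    against all document vectors."""
    # Normalize rows so a plain dot product equals cosine similarity
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T  # (|Q|, |D|) cosine similarity matrix
    return float(sim.max(axis=1).sum())

# Two query vectors, three document vectors, dimension 4
Q = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
D = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
print(maxsim(Q, D))  # → 2.0 (each query vector finds one exact match)
```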

## Using Multivector Search

### 1. Setup

Connect to LanceDB and import required libraries for data management.
Connect to LanceDB and import the required libraries.

<CodeGroup>
```python Python icon="python"
@@ -68,7 +63,7 @@ db = lancedb.connect(

### 2. Define Schema

Define a schema that specifies a multivector field. A multivector field is a nested list structure where each document contains multiple vectors. In this case, we'll create a schema with:
Define a schema that specifies a multivector field. A multivector field is a nested list structure in which each document contains multiple vectors. In this case, we'll create a schema with:

1. An ID field as an integer (int64)
2. A vector field that is a list of lists of float32 values
@@ -91,7 +86,7 @@ schema = pa.schema(

### 3. Generate Multivectors

Generate sample data where each document contains multiple vector embeddings, which could represent different aspects or views of the same document.
Generate sample data where each document contains multiple vector embeddings, which can represent different aspects or views of the same document.

In this example, we create **1024 documents** where each document has **2 random vectors** of **dimension 256**, simulating a real-world scenario where you might have multiple embeddings per item.
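
A minimal NumPy sketch of generating that sample data; the seed, field names, and list-of-dicts layout are illustrative:

```python
import numpy as np

# 1024 documents, each with 2 random vectors of dimension 256
num_docs, vecs_per_doc, dim = 1024, 2, 256
rng = np.random.default_rng(42)
data = [
    {
        "id": i,
        "vector": rng.random((vecs_per_doc, dim), dtype=np.float32).tolist(),
    }
    for i in range(num_docs)
]
print(len(data), len(data[0]["vector"]), len(data[0]["vector"][0]))  # → 1024 2 256
```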

@@ -119,7 +114,7 @@ tbl = db.create_table("multivector_example", data=data, schema=schema)

### 5. Build an Index

Only cosine similarity is supported as the distance metric for multivector search operations.
Only cosine similarity is supported for multivector search operations.
For faster search, build the standard `IVF_PQ` index over your vectors:

<CodeGroup>
@@ -141,7 +136,7 @@ results_single = tbl.search(query).limit(5).to_pandas()

### 7. Query Multiple Vectors

With multiple vector queries, LanceDB calculates similarity using late interaction - a technique that computes relevance by finding the best matching pairs between query and document vectors. This approach provides more nuanced matching while maintaining fast retrieval speeds.
With multiple query vectors, LanceDB calculates similarity using late interaction, a technique that computes relevance by finding the best-matching pairs between query and document vectors. This approach provides more nuanced matching while maintaining fast retrieval speeds.

<CodeGroup>
```python Python icon="python"
@@ -151,7 +146,7 @@ results_multi = tbl.search(query_multi).limit(5).to_pandas()
</CodeGroup>
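
Conceptually, the ranking follows the MaxSim formula described earlier. Here is a self-contained NumPy sketch of scoring and ranking candidate documents; all names and data are illustrative and this is not the LanceDB API:

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    # Cosine-normalize rows, then sum the per-query-vector maxima
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return float((q @ d.T).max(axis=1).sum())

rng = np.random.default_rng(0)
query_multi = rng.random((2, 256), dtype=np.float32)                 # 2 query vectors
docs = {i: rng.random((2, 256), dtype=np.float32) for i in range(5)}  # 5 candidate docs

# Rank document IDs by MaxSim, highest score first
scores = {i: maxsim_score(query_multi, d) for i, d in docs.items()}
top = sorted(scores, key=scores.get, reverse=True)
print(top)
```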


Visit the [Hugging Face embedding integration](/integrations/embedding/huggingface/) page for info on embedding models.
Visit the [Hugging Face embedding integration](/integrations/embedding/huggingface/) page for information on embedding models.

## Simple Example: ColBERT Embeddings

@@ -217,18 +212,18 @@ print(out[["doc_id", "text"]])
```
</CodeGroup>

Late interaction models implementations evolve rapidly, so it's recommended to check the latest popular models
when trying out multivector search.
Late-interaction model implementations evolve rapidly, so it's a good idea to check the latest popular models
when trying multivector search.

## Advanced Example: XTR Embeddings

[ConteXtualized Token Retriever (XTR)](https://arxiv.org/abs/2304.01982) is a late-interaction retrieval model that represents text as token-level vectors instead of a single embedding.
This lets search score token-to-token matches (MaxSim), which can improve fine-grained relevance.

The notebook linked below shows how to integrate XTR (ConteXtualized Token Retriever), which prioritizes critical document
tokens during the initial retrieval stage and removes the gathering stage to significantly improve performance.
The notebook linked below shows how to integrate XTR, which prioritizes critical document
tokens during the initial retrieval stage and removes the gathering stage to improve performance significantly.
By focusing on the most semantically salient tokens early in the process, XTR reduces computational complexity
with improved recall, ensuring rapid identification of candidate documents.
while improving recall and ensuring rapid identification of candidate documents.

<Card
icon="book"
4 changes: 3 additions & 1 deletion docs/snippets/tables.mdx
@@ -42,7 +42,7 @@ export const PyCreateTableFromArrow = "import numpy as np\nimport pyarrow as pa\

export const PyCreateTableFromDicts = "data = [\n {\"vector\": [1.1, 1.2], \"lat\": 45.5, \"long\": -122.7},\n {\"vector\": [0.2, 1.8], \"lat\": 40.1, \"long\": -74.1},\n]\ndb = tmp_db\ndb.create_table(\"test_table\", data, mode=\"overwrite\")\ntbl = db[\"test_table\"]\ntbl.head()\n";

export const PyCreateTableFromIterator = "import pyarrow as pa\n\ndef make_batches():\n for i in range(5):\n yield pa.RecordBatch.from_arrays(\n [\n pa.array(\n [[3.1, 4.1, 5.1, 6.1], [5.9, 26.5, 4.7, 32.8]],\n pa.list_(pa.float32(), 4),\n ),\n pa.array([\"foo\", \"bar\"]),\n pa.array([10.0, 20.0]),\n ],\n [\"vector\", \"item\", \"price\"],\n )\n\nschema = pa.schema(\n [\n pa.field(\"vector\", pa.list_(pa.float32(), 4)),\n pa.field(\"item\", pa.utf8()),\n pa.field(\"price\", pa.float32()),\n ]\n)\ndb = tmp_db\ndb.create_table(\"batched_tale\", make_batches(), schema=schema, mode=\"overwrite\")\n";
export const PyCreateTableFromIterator = "import pyarrow as pa\n\nschema = pa.schema(\n [\n pa.field(\"vector\", pa.list_(pa.float32(), 4)),\n pa.field(\"item\", pa.utf8()),\n pa.field(\"price\", pa.float32()),\n ]\n)\n\ndef make_batches():\n for i in range(5):\n yield pa.RecordBatch.from_arrays(\n [\n pa.array(\n [[3.1, 4.1, 5.1, 6.1], [5.9, 26.5, 4.7, 32.8]],\n pa.list_(pa.float32(), 4),\n ),\n pa.array([\"foo\", \"bar\"]),\n pa.array([10.0, 20.0]),\n ],\n [\"vector\", \"item\", \"price\"],\n )\n\ndb = tmp_db\ndb.create_table(\"batched_table\", make_batches(), schema=schema, mode=\"overwrite\")\n";

export const PyCreateTableFromPandas = "import pandas as pd\n\ndata = pd.DataFrame(\n {\n \"vector\": [[1.1, 1.2, 1.3, 1.4], [0.2, 1.8, 0.4, 3.6]],\n \"lat\": [45.5, 40.1],\n \"long\": [-122.7, -74.1],\n }\n)\ndb = tmp_db\ndb.create_table(\"my_table_pandas\", data, mode=\"overwrite\")\ndb[\"my_table_pandas\"].head()\n";

@@ -116,6 +116,8 @@ export const PyVersioningRollback = "table.restore(version_after_mod)\nversions

export const PyVersioningUpdateData = "table.update(where=\"author='Richard'\", values={\"author\": \"Richard Daniel Sanchez\"})\nrows_after_update = table.count_rows(\"author = 'Richard Daniel Sanchez'\")\nprint(f\"Rows updated to Richard Daniel Sanchez: {rows_after_update}\")\n";

export const PyWriteWithConcurrency = "import pyarrow as pa\nfrom lancedb.scannable import Scannable\nfrom random import random\n\nVECTOR_DIM = 4\n\nschema = pa.schema(\n [\n pa.field(\"id\", pa.int64()),\n pa.field(\"vector\", pa.list_(pa.float32(), VECTOR_DIM)),\n ]\n)\n\ndef make_batch(batch_idx: int, rows_per_batch: int) -> pa.RecordBatch:\n start = batch_idx * rows_per_batch\n stop = start + rows_per_batch\n row_ids = list(range(start, stop))\n vectors = pa.array(\n [[random() for _ in range(VECTOR_DIM)] for _ in row_ids],\n type=pa.list_(pa.float32(), VECTOR_DIM),\n )\n return pa.RecordBatch.from_arrays(\n [\n pa.array(row_ids, type=pa.int64()),\n vectors,\n ],\n schema=schema,\n )\n\ndef make_batch_reader(\n num_batches: int, rows_per_batch: int\n) -> pa.RecordBatchReader:\n return pa.RecordBatchReader.from_batches(\n schema,\n (make_batch(batch_idx, rows_per_batch) for batch_idx in range(num_batches)),\n )\n\ndef make_large_scannable(num_batches: int, rows_per_batch: int) -> Scannable:\n total_rows = num_batches * rows_per_batch\n return Scannable(\n schema=schema,\n num_rows=total_rows,\n reader=lambda: make_batch_reader(num_batches, rows_per_batch),\n )\n\ndb = tmp_db\ntable = db.create_table(\n \"bulk_ingest_concurrent\",\n make_large_scannable(num_batches=1000, rows_per_batch=10000),\n mode=\"overwrite\",\n)\n";

export const TsAddColumnsCalculated = "// Add a discounted price column (10% discount)\nawait schemaAddTable.addColumns([\n {\n name: \"discounted_price\",\n valueSql: \"cast((price * 0.9) as float)\",\n },\n]);\n";

export const TsAddColumnsDefaultValues = "// Add a stock status column with default value\nawait schemaAddTable.addColumns([\n {\n name: \"in_stock\",\n valueSql: \"cast(true as boolean)\",\n },\n]);\n";
49 changes: 33 additions & 16 deletions docs/tables/create.mdx
@@ -42,7 +42,7 @@ In LanceDB, tables store records with a defined schema that specifies column nam
- PyArrow schemas for explicit schema control
- `LanceModel` for Pydantic-based validation

## Create a LanceDB Table
## Create a table with data

Initialize a LanceDB connection and create a table:

@@ -217,9 +217,9 @@ for a `created_at` field.

When you run this code, it should raise a `ValidationError`.

### Use Iterators / Write Large Datasets
### From Batch Iterators

For large ingests, prefer batching instead of adding one row at a time. Python and Rust can create a table directly from Arrow batch iterators or readers. In TypeScript, the practical pattern today is to create an empty table and append Arrow batches in chunks.
For bulk ingestion on large datasets, prefer batching instead of adding one row at a time. Python and Rust can create a table directly from Arrow batch iterators or readers. In TypeScript, the practical pattern today is to create an empty table and append Arrow batches in chunks.

<CodeGroup>
<CodeBlock filename="Python" language="Python" icon="python">
@@ -235,25 +235,23 @@ For large ingests, prefer batching instead of adding one row at a time. Python a
</CodeBlock>
</CodeGroup>

Use this pattern when:

- Your source data already arrives in Arrow batches, readers, datasets, or streams.
- Materializing the entire ingest as one giant in-memory list or array would be too expensive.
- You want to control chunk size explicitly during ingestion.

Python can also consume iterators of other supported types like Pandas DataFrames or Python lists.
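
As a minimal sketch, a generator of pandas DataFrames can be consumed the same way; the chunk contents and table name below are illustrative, and the `create_table` call is shown as a comment since it assumes an open `db` connection:

```python
import pandas as pd

def make_dataframes():
    # Yield the ingest in chunks as pandas DataFrames instead of Arrow batches
    for chunk in range(3):
        yield pd.DataFrame(
            {
                "vector": [[1.0, 2.0], [3.0, 4.0]],
                "label": [f"a{chunk}", f"b{chunk}"],
            }
        )

# db.create_table("from_dataframes", make_dataframes()) would consume
# this iterator chunk by chunk.
frames = list(make_dataframes())
print(len(frames), list(frames[0].columns))  # → 3 ['vector', 'label']
```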

## Open existing tables
### Write with Concurrency

If you forget the name of your table, you can always get a listing of all table names.
For Python users who want to speed up bulk ingest jobs, it is usually better to write from Arrow-native sources that already produce batches, such as readers, datasets, or scanners, instead of first materializing everything as one large Python list.

<CodeGroup>
<CodeBlock filename="Python" language="Python" icon="python">
{OpenExistingTable}
</CodeBlock>
This is most useful when you are writing large amounts of data from an existing Arrow pipeline or another batch-oriented source.

<CodeBlock filename="TypeScript" language="TypeScript" icon="square-js">
{TsOpenExistingTable}
</CodeBlock>
The current codebase also contains a lower-level ingest mechanism for describing a batch source together with extra metadata such as row counts and retry behavior. However, that path is not accepted by the released Python `create_table(...)` and `add(...)` workflow in `lancedb==0.30.0`, so we are not showing it as a docs example yet.

<CodeBlock filename="Rust" language="Rust" icon="rust">
{RsOpenExistingTable}
</CodeBlock>
</CodeGroup>
In Rust, the same lower-level ingest mechanism is available, but the common batch-reader example above is usually the better starting point unless you specifically need to define your own batch source or provide size and retry hints. In TypeScript, this lower-level mechanism is not exposed publicly, so chunked Arrow batch writes remain the recommended pattern.

## Create empty table
You can create an empty table for scenarios where you want to add data to the table later.
@@ -289,6 +287,25 @@ that has been extended to support LanceDB specific types like `Vector`.
Once the empty table has been created, you can append to it or modify its contents,
as explained in the [updating and modifying tables](/tables/update) section.

## Open an existing table

You can open an existing table by specifying the name of the table to the `open_table` / `openTable` method.
If you forget the name of your table, you can always get a listing of all table names.

<CodeGroup>
<CodeBlock filename="Python" language="Python" icon="python">
{OpenExistingTable}
</CodeBlock>

<CodeBlock filename="TypeScript" language="TypeScript" icon="square-js">
{TsOpenExistingTable}
</CodeBlock>

<CodeBlock filename="Rust" language="Rust" icon="rust">
{RsOpenExistingTable}
</CodeBlock>
</CodeGroup>

## Drop a table

Use the `drop_table()` method on the database to remove a table.