Merged
9 changes: 8 additions & 1 deletion docs/docs.json
@@ -299,7 +299,14 @@
"group": "Tutorials",
"pages": [
"tutorials/index",
"tutorials/search/index",
{
"group": "Search",
"expanded": true,
"pages": [
"tutorials/search/index",
"tutorials/search/multivector-needle-in-a-haystack"
]
},
{
"group": "RAG & Agents",
"expanded": true,
43 changes: 19 additions & 24 deletions docs/search/multivector-search.mdx
@@ -1,27 +1,27 @@
---
title: "Multivector Search"
sidebarTitle: Multivector search
description: Learn how to perform multivector search in LanceDB to handle multiple vector embeddings per document, ideal for late interaction models like ColBERT and ColPaLi.
description: Learn how to perform multivector search in LanceDB to handle multiple vector embeddings per document, which is ideal for late-interaction models like ColBERT and ColPaLi.
icon: "braille"
---

LanceDB's multivector support enables you to store and search multiple vector embeddings for a single item.
LanceDB's multivector support enables you to store and search multiple vector embeddings for a single item.

This capability is particularly valuable when working with late interaction models like ColBERT and ColPaLi that generate multiple embeddings per document.
This capability is particularly valuable when working with late-interaction models like ColBERT and ColPaLi, which generate multiple embeddings per document.

In this tutorial, you'll create a table with multiple vector embeddings per document and learn how to perform multivector search. [For all the code - open in Colab](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/saas_examples/python_notebook/Multivector_on_LanceDB_Cloud.ipynb)
In this tutorial, you'll create a table with multiple vector embeddings per document and learn how to perform multivector search. [For the full code, open the Colab notebook](https://colab.research.google.com/github/lancedb/vectordb-recipes/blob/main/examples/saas_examples/python_notebook/Multivector_on_LanceDB_Cloud.ipynb).

## Multivector Support

Each item in your dataset can have a column containing multiple vectors, which LanceDB can efficiently index and search. When performing a search, you can query using either a single vector embedding or multiple vector embeddings.
Each item in your dataset can have a column containing multiple vectors, which LanceDB can efficiently index and search. When performing a search, you can query with either a single vector embedding or multiple vector embeddings.

<Warning>
Currently, only the `cosine` metric is supported for multivector search. The vector value type can be `float16`, `float32`, or `float64`.
</Warning>

## Computing Similarity

MaxSim (Maximum Similarity) is a key concept in late interaction models that:
MaxSim (Maximum Similarity) is a key concept in late-interaction models that:

- Computes the maximum similarity between each query embedding and all document embeddings
- Sums these maximum similarities to get the final relevance score
@@ -33,24 +33,19 @@
$$
\text{MaxSim}(Q, D) = \sum_{i=1}^{|Q|} \max_{j=1}^{|D|} \text{sim}(q_i, d_j)
$$

Where $sim$ is the similarity function (e.g. cosine similarity).
Where $sim$ is the similarity function (e.g., cosine similarity).

$$
Q = \{q_1, q_2, ..., q_{|Q|}\}
$$

$Q$ represents the query vector, and $D = \{d_1, d_2, ..., d_{|D|}\}$ represents the document vectors.

<Warning>
For now, you should use only the `cosine` metric for multivector search.
The vector value type can be `float16`, `float32` or `float64`.
</Warning>
$Q$ represents the query embeddings, and $D = \{d_1, d_2, ..., d_{|D|}\}$ represents the document embeddings.
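
To make the formula concrete, here is a minimal NumPy sketch of MaxSim. The function name and the toy data are illustrative only, not part of the LanceDB API:

```python
import numpy as np

def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Sum, over query vectors, of the max cosine similarity
    against all document vectors."""
    # Normalize rows so a plain dot product equals cosine similarity
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sim = q @ d.T  # (|Q|, |D|) cosine similarity matrix
    return float(sim.max(axis=1).sum())

# Two query vectors, three document vectors, dimension 4
Q = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
D = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
print(maxsim(Q, D))  # → 2.0 (each query vector finds one exact match)
```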

## Using Multivector Search

### 1. Setup

Connect to LanceDB and import required libraries for data management.
Connect to LanceDB and import the required libraries.

<CodeGroup>
```python Python icon="python"
@@ -68,7 +63,7 @@ db = lancedb.connect(

### 2. Define Schema

Define a schema that specifies a multivector field. A multivector field is a nested list structure where each document contains multiple vectors. In this case, we'll create a schema with:
Define a schema that specifies a multivector field. A multivector field is a nested list structure in which each document contains multiple vectors. In this case, we'll create a schema with:

1. An ID field as an integer (int64)
2. A vector field that is a list of lists of float32 values
@@ -91,7 +86,7 @@ schema = pa.schema(

### 3. Generate Multivectors

Generate sample data where each document contains multiple vector embeddings, which could represent different aspects or views of the same document.
Generate sample data where each document contains multiple vector embeddings, which can represent different aspects or views of the same document.

In this example, we create **1024 documents** where each document has **2 random vectors** of **dimension 256**, simulating a real-world scenario where you might have multiple embeddings per item.
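
A minimal NumPy sketch of generating that sample data; the seed, field names, and list-of-dicts layout are illustrative:

```python
import numpy as np

# 1024 documents, each with 2 random vectors of dimension 256
num_docs, vecs_per_doc, dim = 1024, 2, 256
rng = np.random.default_rng(42)
data = [
    {
        "id": i,
        "vector": rng.random((vecs_per_doc, dim), dtype=np.float32).tolist(),
    }
    for i in range(num_docs)
]
print(len(data), len(data[0]["vector"]), len(data[0]["vector"][0]))  # → 1024 2 256
```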

@@ -119,7 +114,7 @@ tbl = db.create_table("multivector_example", data=data, schema=schema)

### 5. Build an Index

Only cosine similarity is supported as the distance metric for multivector search operations.
Only cosine similarity is supported for multivector search operations.
For faster search, build the standard `IVF_PQ` index over your vectors:

<CodeGroup>
@@ -141,7 +136,7 @@ results_single = tbl.search(query).limit(5).to_pandas()

### 7. Query Multiple Vectors

With multiple vector queries, LanceDB calculates similarity using late interaction - a technique that computes relevance by finding the best matching pairs between query and document vectors. This approach provides more nuanced matching while maintaining fast retrieval speeds.
With multiple query vectors, LanceDB calculates similarity using late interaction, a technique that computes relevance by finding the best-matching pairs between query and document vectors. This approach provides more nuanced matching while maintaining fast retrieval speeds.

<CodeGroup>
```python Python icon="python"
@@ -151,7 +146,7 @@ results_multi = tbl.search(query_multi).limit(5).to_pandas()
</CodeGroup>
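
Conceptually, the ranking follows the MaxSim formula described earlier. Here is a self-contained NumPy sketch of scoring and ranking candidate documents; all names and data are illustrative and this is not the LanceDB API:

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    # Cosine-normalize rows, then sum the per-query-vector maxima
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return float((q @ d.T).max(axis=1).sum())

rng = np.random.default_rng(0)
query_multi = rng.random((2, 256), dtype=np.float32)                 # 2 query vectors
docs = {i: rng.random((2, 256), dtype=np.float32) for i in range(5)}  # 5 candidate docs

# Rank document IDs by MaxSim, highest score first
scores = {i: maxsim_score(query_multi, d) for i, d in docs.items()}
top = sorted(scores, key=scores.get, reverse=True)
print(top)
```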


Visit the [Hugging Face embedding integration](/integrations/embedding/huggingface/) page for info on embedding models.
Visit the [Hugging Face embedding integration](/integrations/embedding/huggingface/) page for information on embedding models.

## Simple Example: ColBERT Embeddings

@@ -217,18 +212,18 @@ print(out[["doc_id", "text"]])
```
</CodeGroup>

Late interaction models implementations evolve rapidly, so it's recommended to check the latest popular models
when trying out multivector search.
Late-interaction model implementations evolve rapidly, so it's a good idea to check the latest popular models
when trying multivector search.

## Advanced Example: XTR Embeddings

[ConteXtualized Token Retriever (XTR)](https://arxiv.org/abs/2304.01982) is a late-interaction retrieval model that represents text as token-level vectors instead of a single embedding.
This lets search score token-to-token matches (MaxSim), which can improve fine-grained relevance.

The notebook linked below shows how to integrate XTR (ConteXtualized Token Retriever), which prioritizes critical document
tokens during the initial retrieval stage and removes the gathering stage to significantly improve performance.
The notebook linked below shows how to integrate XTR, which prioritizes critical document
tokens during the initial retrieval stage and removes the gathering stage to improve performance significantly.
By focusing on the most semantically salient tokens early in the process, XTR reduces computational complexity
with improved recall, ensuring rapid identification of candidate documents.
while improving recall and ensuring rapid identification of candidate documents.

<Card
icon="book"
4 changes: 3 additions & 1 deletion docs/snippets/tables.mdx
@@ -42,7 +42,7 @@ export const PyCreateTableFromArrow = "import numpy as np\nimport pyarrow as pa\

export const PyCreateTableFromDicts = "data = [\n {\"vector\": [1.1, 1.2], \"lat\": 45.5, \"long\": -122.7},\n {\"vector\": [0.2, 1.8], \"lat\": 40.1, \"long\": -74.1},\n]\ndb = tmp_db\ndb.create_table(\"test_table\", data, mode=\"overwrite\")\ntbl = db[\"test_table\"]\ntbl.head()\n";

export const PyCreateTableFromIterator = "import pyarrow as pa\n\ndef make_batches():\n for i in range(5):\n yield pa.RecordBatch.from_arrays(\n [\n pa.array(\n [[3.1, 4.1, 5.1, 6.1], [5.9, 26.5, 4.7, 32.8]],\n pa.list_(pa.float32(), 4),\n ),\n pa.array([\"foo\", \"bar\"]),\n pa.array([10.0, 20.0]),\n ],\n [\"vector\", \"item\", \"price\"],\n )\n\nschema = pa.schema(\n [\n pa.field(\"vector\", pa.list_(pa.float32(), 4)),\n pa.field(\"item\", pa.utf8()),\n pa.field(\"price\", pa.float32()),\n ]\n)\ndb = tmp_db\ndb.create_table(\"batched_tale\", make_batches(), schema=schema, mode=\"overwrite\")\n";
export const PyCreateTableFromIterator = "import pyarrow as pa\n\nschema = pa.schema(\n [\n pa.field(\"vector\", pa.list_(pa.float32(), 4)),\n pa.field(\"item\", pa.utf8()),\n pa.field(\"price\", pa.float32()),\n ]\n)\n\ndef make_batches():\n for i in range(5):\n yield pa.RecordBatch.from_arrays(\n [\n pa.array(\n [[3.1, 4.1, 5.1, 6.1], [5.9, 26.5, 4.7, 32.8]],\n pa.list_(pa.float32(), 4),\n ),\n pa.array([\"foo\", \"bar\"]),\n pa.array([10.0, 20.0]),\n ],\n [\"vector\", \"item\", \"price\"],\n )\n\ndb = tmp_db\ndb.create_table(\"batched_table\", make_batches(), schema=schema, mode=\"overwrite\")\n";

export const PyCreateTableFromPandas = "import pandas as pd\n\ndata = pd.DataFrame(\n {\n \"vector\": [[1.1, 1.2, 1.3, 1.4], [0.2, 1.8, 0.4, 3.6]],\n \"lat\": [45.5, 40.1],\n \"long\": [-122.7, -74.1],\n }\n)\ndb = tmp_db\ndb.create_table(\"my_table_pandas\", data, mode=\"overwrite\")\ndb[\"my_table_pandas\"].head()\n";

@@ -116,6 +116,8 @@ export const PyVersioningRollback = "table.restore(version_after_mod)\nversions

export const PyVersioningUpdateData = "table.update(where=\"author='Richard'\", values={\"author\": \"Richard Daniel Sanchez\"})\nrows_after_update = table.count_rows(\"author = 'Richard Daniel Sanchez'\")\nprint(f\"Rows updated to Richard Daniel Sanchez: {rows_after_update}\")\n";

export const PyWriteWithConcurrency = "import pyarrow as pa\nfrom lancedb.scannable import Scannable\nfrom random import random\n\nVECTOR_DIM = 4\n\nschema = pa.schema(\n [\n pa.field(\"id\", pa.int64()),\n pa.field(\"vector\", pa.list_(pa.float32(), VECTOR_DIM)),\n ]\n)\n\ndef make_batch(batch_idx: int, rows_per_batch: int) -> pa.RecordBatch:\n start = batch_idx * rows_per_batch\n stop = start + rows_per_batch\n row_ids = list(range(start, stop))\n vectors = pa.array(\n [[random() for _ in range(VECTOR_DIM)] for _ in row_ids],\n type=pa.list_(pa.float32(), VECTOR_DIM),\n )\n return pa.RecordBatch.from_arrays(\n [\n pa.array(row_ids, type=pa.int64()),\n vectors,\n ],\n schema=schema,\n )\n\ndef make_batch_reader(\n num_batches: int, rows_per_batch: int\n) -> pa.RecordBatchReader:\n return pa.RecordBatchReader.from_batches(\n schema,\n (make_batch(batch_idx, rows_per_batch) for batch_idx in range(num_batches)),\n )\n\ndef make_large_scannable(num_batches: int, rows_per_batch: int) -> Scannable:\n total_rows = num_batches * rows_per_batch\n return Scannable(\n schema=schema,\n num_rows=total_rows,\n reader=lambda: make_batch_reader(num_batches, rows_per_batch),\n )\n\ndb = tmp_db\ntable = db.create_table(\n \"bulk_ingest_concurrent\",\n make_large_scannable(num_batches=1000, rows_per_batch=10000),\n mode=\"overwrite\",\n)\n";

export const TsAddColumnsCalculated = "// Add a discounted price column (10% discount)\nawait schemaAddTable.addColumns([\n {\n name: \"discounted_price\",\n valueSql: \"cast((price * 0.9) as float)\",\n },\n]);\n";

export const TsAddColumnsDefaultValues = "// Add a stock status column with default value\nawait schemaAddTable.addColumns([\n {\n name: \"in_stock\",\n valueSql: \"cast(true as boolean)\",\n },\n]);\n";
49 changes: 33 additions & 16 deletions docs/tables/create.mdx
@@ -42,7 +42,7 @@ In LanceDB, tables store records with a defined schema that specifies column nam
- PyArrow schemas for explicit schema control
- `LanceModel` for Pydantic-based validation

## Create a LanceDB Table
## Create a table with data

Initialize a LanceDB connection and create a table:

@@ -217,9 +217,9 @@ for a `created_at` field.

When you run this code, it should raise a `ValidationError`.

### Use Iterators / Write Large Datasets
### From Batch Iterators

For large ingests, prefer batching instead of adding one row at a time. Python and Rust can create a table directly from Arrow batch iterators or readers. In TypeScript, the practical pattern today is to create an empty table and append Arrow batches in chunks.
For bulk ingestion on large datasets, prefer batching instead of adding one row at a time. Python and Rust can create a table directly from Arrow batch iterators or readers. In TypeScript, the practical pattern today is to create an empty table and append Arrow batches in chunks.

<CodeGroup>
<CodeBlock filename="Python" language="Python" icon="python">
@@ -235,25 +235,23 @@ For large ingests, prefer batching instead of adding one row at a time. Python a
</CodeBlock>
</CodeGroup>

Use this pattern when:

- Your source data already arrives in Arrow batches, readers, datasets, or streams.
- Materializing the entire ingest as one giant in-memory list or array would be too expensive.
- You want to control chunk size explicitly during ingestion.

Python can also consume iterators of other supported types like Pandas DataFrames or Python lists.
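
As a minimal sketch, a generator of pandas DataFrames can be consumed the same way; the chunk contents and table name below are illustrative, and the `create_table` call is shown as a comment since it assumes an open `db` connection:

```python
import pandas as pd

def make_dataframes():
    # Yield the ingest in chunks as pandas DataFrames instead of Arrow batches
    for chunk in range(3):
        yield pd.DataFrame(
            {
                "vector": [[1.0, 2.0], [3.0, 4.0]],
                "label": [f"a{chunk}", f"b{chunk}"],
            }
        )

# db.create_table("from_dataframes", make_dataframes()) would consume
# this iterator chunk by chunk.
frames = list(make_dataframes())
print(len(frames), list(frames[0].columns))  # → 3 ['vector', 'label']
```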

## Open existing tables
### Write with Concurrency

If you forget the name of your table, you can always get a listing of all table names.
For Python users who want to speed up bulk ingest jobs, it is usually better to write from Arrow-native sources that already produce batches, such as readers, datasets, or scanners, instead of first materializing everything as one large Python list.

<CodeGroup>
<CodeBlock filename="Python" language="Python" icon="python">
{OpenExistingTable}
</CodeBlock>
This is most useful when you are writing large amounts of data from an existing Arrow pipeline or another batch-oriented source.

<CodeBlock filename="TypeScript" language="TypeScript" icon="square-js">
{TsOpenExistingTable}
</CodeBlock>
The current codebase also contains a lower-level ingest mechanism for describing a batch source together with extra metadata such as row counts and retry behavior. However, that path is not accepted by the released Python `create_table(...)` and `add(...)` workflow in `lancedb==0.30.0`, so we are not showing it as a docs example yet.

<CodeBlock filename="Rust" language="Rust" icon="rust">
{RsOpenExistingTable}
</CodeBlock>
</CodeGroup>
In Rust, the same lower-level ingest mechanism is available, but the common batch-reader example above is usually the better starting point unless you specifically need to define your own batch source or provide size and retry hints. In TypeScript, this lower-level mechanism is not exposed publicly, so chunked Arrow batch writes remain the recommended pattern.

## Create empty table
You can create an empty table for scenarios where you want to add data to the table later.
@@ -289,6 +287,25 @@ that has been extended to support LanceDB specific types like `Vector`.
Once the empty table has been created, you can append to it or modify its contents,
as explained in the [updating and modifying tables](/tables/update) section.

## Open an existing table

You can open an existing table by specifying the name of the table to the `open_table` / `openTable` method.
If you forget the name of your table, you can always get a listing of all table names.

<CodeGroup>
<CodeBlock filename="Python" language="Python" icon="python">
{OpenExistingTable}
</CodeBlock>

<CodeBlock filename="TypeScript" language="TypeScript" icon="square-js">
{TsOpenExistingTable}
</CodeBlock>

<CodeBlock filename="Rust" language="Rust" icon="rust">
{RsOpenExistingTable}
</CodeBlock>
</CodeGroup>

## Drop a table

Use the `drop_table()` method on the database to remove a table.