1 change: 1 addition & 0 deletions docs/components/data-sources/overview.mdx
@@ -17,6 +17,7 @@ We handle the complexity of loading unstructured data from these data sources, a
<Card title="Youtube Video" href="/components/data-sources/youtube-video" />
<Card title="Youtube Channel" href="/components/data-sources/youtube-channel" />
<Card title="Youtube Search" href="/components/data-sources/youtube-search" />
<Card title="TwelveLabs Video" href="/components/data-sources/twelvelabs-video" />
<Card title="DOCX file" href="/components/data-sources/docx" />
<Card title="PPT file" href="/components/data-sources/ppt" />
<Card title="Excel file" href="/components/data-sources/excel" />
47 changes: 47 additions & 0 deletions docs/components/data-sources/twelvelabs-video.mdx
@@ -0,0 +1,47 @@
---
title: '🎬 TwelveLabs Video'
---

To add a video to your app, use the `TwelveLabsVideoLoader`. It analyses the video with the [TwelveLabs](https://twelvelabs.io) Pegasus video understanding model, which watches the visuals, motion and audio, and loads the resulting description into your RAG application. This lets you run RAG over video content directly from a publicly accessible URL (a direct link to the raw media file), without a separate transcription step.

- Sign up for an account with TwelveLabs and grab an API key. There's a generous free tier at [twelvelabs.io](https://twelvelabs.io).

- Set the key in the environment variable `TWELVELABS_API_KEY` (or pass it directly to the loader).

```bash
TWELVELABS_API_KEY="<YOUR_KEY>"
```
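
The lookup order described above (explicit option first, then the environment variable) can be sketched roughly as follows. `resolveApiKey` is a hypothetical helper written for illustration and is not exported by the package; the environment is passed in as a plain object so the logic can be exercised without touching `process.env`.

```ts
// Hypothetical helper, not part of @llm-tools/embedjs-twelvelabs.
// `env` stands in for process.env so the function stays self-contained.
function resolveApiKey(explicit: string | undefined, env: Record<string, string | undefined>): string {
    // `??` only falls back on null/undefined, so an explicit empty string is kept
    // here and then rejected by the check below instead of being replaced.
    const key = explicit ?? env['TWELVELABS_API_KEY'];
    if (!key) {
        throw new Error('TwelveLabs API key is required.');
    }
    return key;
}
```

One consequence of this order, assuming the loader follows the same pattern, is that passing an empty string as the key raises an error rather than silently falling back to the environment variable.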

## Install TwelveLabs addon

```bash
npm install @llm-tools/embedjs-twelvelabs
```

## Usage

```ts
import { RAGApplicationBuilder, SIMPLE_MODELS } from '@llm-tools/embedjs';
import { OpenAiEmbeddings } from '@llm-tools/embedjs-openai';
import { HNSWDb } from '@llm-tools/embedjs-hnswlib';
import { TwelveLabsVideoLoader } from '@llm-tools/embedjs-twelvelabs';

const app = await new RAGApplicationBuilder()
    .setModel(SIMPLE_MODELS.OPENAI_GPT4_O)
    .setEmbeddingModel(new OpenAiEmbeddings())
    .setVectorDatabase(new HNSWDb())
    .build();

// addLoader is asynchronous; await it so the video is analysed before querying.
await app.addLoader(new TwelveLabsVideoLoader({ url: 'https://example.com/video.mp4' }));
```

You can customise the analysis with an optional `prompt`, `model` (`pegasus1.2` or `pegasus1.5`) and `maxTokens`:

```ts
await app.addLoader(new TwelveLabsVideoLoader({
    url: 'https://example.com/video.mp4',
    prompt: 'Summarize the key moments in this video.',
    model: 'pegasus1.2',
    maxTokens: 2048,
}));
```
1 change: 1 addition & 0 deletions docs/components/embeddings/overview.mdx
@@ -12,6 +12,7 @@ EmbedJs supports several embedding models from the following providers:
<Card title="Huggingface" href="/components/embeddings/huggingface" />
<Card title="Ollama" href="/components/embeddings/ollama" />
<Card title="LlamaCpp" href="/components/embeddings/llama-cpp" />
<Card title="TwelveLabs (Marengo)" href="/components/embeddings/twelvelabs" />
</CardGroup>

<br/ >
38 changes: 38 additions & 0 deletions docs/components/embeddings/twelvelabs.mdx
@@ -0,0 +1,38 @@
---
title: 'TwelveLabs (Marengo)'
---

The library supports the [TwelveLabs](https://twelvelabs.io) `marengo3.0` multimodal embedding model out of the box. Marengo embeds text, image, audio and video into a single shared latent space, which makes it a good fit for retrieval over video libraries. The `MarengoEmbeddings` class exposes the text side of that space and returns 512-dimensional vectors.

To use it:

- Sign up for an account with TwelveLabs and grab an API key. There's a generous free tier at [twelvelabs.io](https://twelvelabs.io).

- Set the key in the environment variable `TWELVELABS_API_KEY` (or pass it directly to the constructor).

```bash
TWELVELABS_API_KEY="<YOUR_KEY>"
```

## Install TwelveLabs addon

```bash
npm install @llm-tools/embedjs-twelvelabs
```

## Usage

```ts
import { RAGApplicationBuilder } from '@llm-tools/embedjs';
import { MarengoEmbeddings } from '@llm-tools/embedjs-twelvelabs';
import { HNSWDb } from '@llm-tools/embedjs-hnswlib';

const app = await new RAGApplicationBuilder()
    .setEmbeddingModel(new MarengoEmbeddings())
    .setVectorDatabase(new HNSWDb())
    .build();
```

You can also pass the API key explicitly:

```ts
new MarengoEmbeddings({ apiKey: 'YOUR_KEY' })
```
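
Because Marengo places its inputs in one shared space, retrieval comes down to comparing vectors. The sketch below shows cosine similarity, a common choice for that comparison. Whether a given vector database actually uses cosine similarity is an assumption here, and `cosineSimilarity` is an illustrative helper rather than part of the package.

```ts
// Illustrative helper (not part of embedJs): cosine similarity of two vectors.
function cosineSimilarity(a: number[], b: number[]): number {
    if (a.length !== b.length) {
        throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
    }

    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        const x = a[i] ?? 0;
        const y = b[i] ?? 0;
        dot += x * y;
        normA += x * x;
        normB += y * y;
    }

    // A zero vector has no direction; treat it as unrelated to everything.
    if (normA === 0 || normB === 0) return 0;
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In practice the two arguments would be 512-element arrays, for example a query embedding and a stored chunk embedding. The dimension check guards against mixing vectors produced by different models.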
4 changes: 3 additions & 1 deletion docs/mint.json
@@ -85,6 +85,7 @@
"components/data-sources/xml",
"components/data-sources/directory",
"components/data-sources/image",
"components/data-sources/twelvelabs-video",
"components/data-sources/custom"
]
}
@@ -139,7 +140,8 @@
"components/embeddings/cohere",
"components/embeddings/ollama",
"components/embeddings/huggingface",
"components/embeddings/vertexai"
"components/embeddings/vertexai",
"components/embeddings/twelvelabs"
]
}
]
13 changes: 13 additions & 0 deletions models/embedjs-twelvelabs/README.md
@@ -0,0 +1,13 @@
# embedjs-twelvelabs

<p>
<a href="https://www.npmjs.com/package/@llm-tools/embedjs" target="_blank"><img alt="NPM Version" src="https://img.shields.io/npm/v/%40llm-tools/embedjs?style=for-the-badge"></a>
<a href="https://www.npmjs.com/package/@llm-tools/embedjs" target="_blank"><img alt="License" src="https://img.shields.io/npm/l/%40llm-tools%2Fembedjs?style=for-the-badge"></a>
</p>

This package extends [embedJs](https://www.npmjs.com/package/@llm-tools/embedjs) with [TwelveLabs](https://twelvelabs.io) video understanding:

- **`MarengoEmbeddings`** — multimodal embeddings from the Marengo model (512 dimensions), usable as a drop-in embedding model.
- **`TwelveLabsVideoLoader`** — a data source that analyses a video with the Pegasus model and loads the resulting description into your RAG application.

Refer to the [embedJs documentation](https://llm-tools.mintlify.app) for more details.
20 changes: 20 additions & 0 deletions models/embedjs-twelvelabs/eslint.config.js
@@ -0,0 +1,20 @@
import baseConfig from '../../eslint.config.js';
import parser from '@nx/eslint-plugin';

export default [
    ...baseConfig,
    {
        files: ['**/*.json'],
        rules: {
            '@nx/dependency-checks': [
                'error',
                {
                    ignoredFiles: ['{projectRoot}/eslint.config.{js,cjs,mjs}'],
                },
            ],
        },
        languageOptions: {
            parser,
        },
    },
];
40 changes: 40 additions & 0 deletions models/embedjs-twelvelabs/package.json
@@ -0,0 +1,40 @@
{
    "name": "@llm-tools/embedjs-twelvelabs",
    "version": "0.1.31",
    "description": "Enable usage of TwelveLabs Marengo embeddings and Pegasus video understanding with embedjs",
    "dependencies": {
        "@langchain/textsplitters": "^1.0.0",
        "@llm-tools/embedjs-interfaces": "0.1.31",
        "@llm-tools/embedjs-utils": "0.1.31",
        "debug": "^4.4.3",
        "md5": "^2.3.0",
        "twelvelabs-js": "^1.2.8"
    },
    "type": "module",
    "main": "./src/index.js",
    "license": "Apache-2.0",
    "publishConfig": {
        "access": "public"
    },
    "keywords": [
        "llm",
        "ai",
        "twelvelabs",
        "marengo",
        "pegasus",
        "video",
        "multimodal",
        "embeddings",
        "vectorstores",
        "rag"
    ],
    "author": "K V Adhityan",
    "bugs": {
        "url": "https://github.com/llm-tools/embedjs/issues"
    },
    "homepage": "https://github.com/llm-tools/embedjs#readme",
    "repository": {
        "type": "git",
        "url": "git+https://github.com/llm-tools/embedjs.git"
    }
}
19 changes: 19 additions & 0 deletions models/embedjs-twelvelabs/project.json
@@ -0,0 +1,19 @@
{
    "name": "embedjs-twelvelabs",
    "$schema": "../../node_modules/nx/schemas/project-schema.json",
    "sourceRoot": "models/embedjs-twelvelabs/src",
    "projectType": "library",
    "tags": [],
    "targets": {
        "build": {
            "executor": "@nx/js:tsc",
            "outputs": ["{options.outputPath}"],
            "options": {
                "outputPath": "dist/embedjs-twelvelabs",
                "main": "models/embedjs-twelvelabs/src/index.ts",
                "tsConfig": "models/embedjs-twelvelabs/tsconfig.json",
                "assets": ["models/embedjs-twelvelabs/*.md"]
            }
        }
    }
}
2 changes: 2 additions & 0 deletions models/embedjs-twelvelabs/src/index.ts
@@ -0,0 +1,2 @@
export * from './marengo-embeddings.js';
export * from './twelvelabs-video-loader.js';
63 changes: 63 additions & 0 deletions models/embedjs-twelvelabs/src/marengo-embeddings.ts
@@ -0,0 +1,63 @@
import { TwelveLabs } from 'twelvelabs-js';
import { BaseEmbeddings } from '@llm-tools/embedjs-interfaces';

/**
 * Multimodal embeddings backed by TwelveLabs' Marengo model. Marengo embeds text,
 * image, audio and video into a single shared latent space, which makes it a good
 * fit for RAG over video libraries. This class exposes the text side of that space
 * so it can be used as a drop-in embedding model for embedJs.
 *
 * The Marengo `marengo3.0` model returns 512-dimensional vectors.
 */
export class MarengoEmbeddings extends BaseEmbeddings {
    private readonly client: TwelveLabs;
    private readonly model: string;
    private readonly dimensions: number;

    constructor({
        apiKey,
        model = 'marengo3.0',
        dimensions = 512,
    }: {
        /** TwelveLabs API key. Falls back to the `TWELVELABS_API_KEY` environment variable. */
        apiKey?: string;
        /** Marengo model name. Defaults to `marengo3.0`. */
        model?: string;
        /** Embedding dimensions returned by the model. Defaults to `512` (marengo3.0). */
        dimensions?: number;
    } = {}) {
        super();

        const key = apiKey ?? process.env.TWELVELABS_API_KEY;
        if (!key) {
            throw new Error(
                'TwelveLabs API key is required. Pass it via the `apiKey` option or set the TWELVELABS_API_KEY environment variable.',
            );
        }

        this.client = new TwelveLabs({ apiKey: key });
        this.model = model;
        this.dimensions = dimensions;
    }

    override async getDimensions(): Promise<number> {
        return this.dimensions;
    }

    override async embedDocuments(texts: string[]): Promise<number[][]> {
        // The Marengo embed endpoint accepts a single text per request, so we fan out.
        return Promise.all(texts.map((text) => this.embedQuery(text)));
    }

    override async embedQuery(text: string): Promise<number[]> {
        const response = await this.client.embed.create({ modelName: this.model, text });
        const float = response.textEmbedding?.segments?.[0]?.float;

        if (!float) {
            const reason = response.textEmbedding?.errorMessage ?? 'no embedding was returned';
            throw new Error(`TwelveLabs Marengo did not return a text embedding: ${reason}`);
        }

        return float;
    }
}
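
`embedDocuments` above sends one request per text and runs them all at once through `Promise.all`. If a large document set ever runs into rate limits (an assumption; the PR does not report any), one possible alternative is to fan out in bounded batches. The sketch below is illustrative only: `toBatches` and `embedInBatches` are hypothetical helpers, and `embedOne` stands in for a single-text call such as `embedQuery`.

```ts
// Hypothetical helper: split `items` into consecutive groups of at most `size`.
function toBatches<T>(items: T[], size: number): T[][] {
    if (!Number.isInteger(size) || size < 1) {
        throw new Error('size must be a positive integer');
    }
    const batches: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// Hypothetical helper: embed texts with at most `size` requests in flight.
// `embedOne` is a stand-in for a single-text embedding call.
async function embedInBatches(
    texts: string[],
    embedOne: (text: string) => Promise<number[]>,
    size = 5,
): Promise<number[][]> {
    const results: number[][] = [];
    for (const batch of toBatches(texts, size)) {
        // Requests within a batch run in parallel; batches run one after another.
        const vectors = await Promise.all(batch.map((text) => embedOne(text)));
        results.push(...vectors);
    }
    return results;
}
```

Results come back in input order, since both `Promise.all` and the batch loop preserve it. The default batch size of 5 is arbitrary and would need tuning against the actual API limits.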
102 changes: 102 additions & 0 deletions models/embedjs-twelvelabs/src/twelvelabs-video-loader.ts
@@ -0,0 +1,102 @@
import { RecursiveCharacterTextSplitter } from '@langchain/textsplitters';
import { TwelveLabs } from 'twelvelabs-js';
import createDebugMessages from 'debug';
import md5 from 'md5';

import { BaseLoader } from '@llm-tools/embedjs-interfaces';
import { cleanString } from '@llm-tools/embedjs-utils';

/**
 * Loads a video into embedJs by analysing it with TwelveLabs' Pegasus video
 * understanding model. Pegasus watches the video (visuals, motion and audio)
 * and returns a natural language description, which is then chunked and embedded
 * like any other text source. This lets you run RAG over video content directly
 * from a public URL, without a separate transcription step.
 */
export class TwelveLabsVideoLoader extends BaseLoader<{ type: 'TwelveLabsVideoLoader' }> {
    private readonly debug = createDebugMessages('embedjs:loader:TwelveLabsVideoLoader');
    private readonly client: TwelveLabs;
    private readonly url: string;
    private readonly model: 'pegasus1.2' | 'pegasus1.5';
    private readonly prompt: string;
    private readonly maxTokens: number;

    constructor({
        url,
        apiKey,
        model = 'pegasus1.2',
        prompt = 'Describe everything that happens in this video in detail, including the visuals, actions, spoken words and on-screen text.',
        maxTokens = 2048,
        chunkSize,
        chunkOverlap,
    }: {
        /** Publicly accessible URL of the video file (direct link to raw media). */
        url: string;
        /** TwelveLabs API key. Falls back to the `TWELVELABS_API_KEY` environment variable. */
        apiKey?: string;
        /** Pegasus model name. Defaults to `pegasus1.2`. */
        model?: 'pegasus1.2' | 'pegasus1.5';
        /** Prompt that guides the analysis. Defaults to a detailed description prompt. */
        prompt?: string;
        /** Maximum response length in tokens. Defaults to `2048`. */
        maxTokens?: number;
        chunkSize?: number;
        chunkOverlap?: number;
    }) {
        super(
            `TwelveLabsVideoLoader_${md5(`${url}_${model}_${prompt}`)}`,
            { url },
            chunkSize ?? 2000,
            chunkOverlap ?? 0,
        );

        const key = apiKey ?? process.env.TWELVELABS_API_KEY;
        if (!key) {
            throw new Error(
                'TwelveLabs API key is required. Pass it via the `apiKey` option or set the TWELVELABS_API_KEY environment variable.',
            );
        }

        this.client = new TwelveLabs({ apiKey: key });
        this.url = url;
        this.model = model;
        this.prompt = prompt;
        this.maxTokens = maxTokens;
    }

    override async *getUnfilteredChunks() {
        const chunker = new RecursiveCharacterTextSplitter({
            chunkSize: this.chunkSize,
            chunkOverlap: this.chunkOverlap,
        });

        try {
            const response = await this.client.analyze({
                modelName: this.model,
                video: { type: 'url', url: this.url },
                prompt: this.prompt,
                maxTokens: this.maxTokens,
            });

            const text = response.data;
            if (!text) {
                this.debug('Pegasus returned no analysis for video', this.url);
                return;
            }

            this.debug(`Pegasus analysis (length ${text.length}) obtained for video`, this.url);

            for (const chunk of await chunker.splitText(cleanString(text))) {
                yield {
                    pageContent: chunk,
                    metadata: {
                        type: 'TwelveLabsVideoLoader' as const,
                        source: this.url,
                    },
                };
            }
        } catch (e) {
            this.debug('Could not analyze video', this.url, e);
        }
    }
}
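
The loader splits the Pegasus description into chunks using its `chunkSize` (default 2000) and `chunkOverlap` (default 0) settings. To give a rough feel for what those two numbers mean, here is a simplified fixed-window chunker. It is a stand-in for illustration only: the real loader uses `RecursiveCharacterTextSplitter`, which prefers to break on separators such as paragraphs and sentences, so its output will generally differ from this sketch. `chunkText` is a hypothetical name.

```ts
// Simplified fixed-window chunker for illustration. This is NOT how
// RecursiveCharacterTextSplitter works internally; it only shows the role of
// the chunkSize and chunkOverlap parameters.
function chunkText(text: string, chunkSize = 2000, chunkOverlap = 0): string[] {
    if (chunkSize < 1 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
        throw new Error('Require chunkSize >= 1 and 0 <= chunkOverlap < chunkSize');
    }

    // Each window starts `step` characters after the previous one.
    const step = chunkSize - chunkOverlap;
    const chunks: string[] = [];
    for (let start = 0; start < text.length; start += step) {
        chunks.push(text.slice(start, start + chunkSize));
        // Stop once a window has reached the end of the text.
        if (start + chunkSize >= text.length) break;
    }
    return chunks;
}
```

With the defaults, a 4500-character description would yield three chunks of 2000, 2000 and 500 characters under this simplified scheme. A non-zero overlap repeats the tail of each chunk at the head of the next, which can help when an answer spans a chunk boundary.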