127 changes: 91 additions & 36 deletions pylon-docs/platform/training-data.mdx
@@ -1,36 +1,91 @@
---
description: Supercharge all of Pylon's AI features by connecting training data for
the AI to consume.
icon: binary-circle-check
'og:description': Supercharge all of Pylon's AI features by connecting training data
for the AI to consume.
'og:image': https://docs.usepylon.com/pylon-docs/~gitbook/ogimage/55LMxjNj9P2tA9DYAwaa
'og:title': Training Data | Pylon
title: Training Data
'twitter:description': Supercharge all of Pylon's AI features by connecting training
data for the AI to consume.
'twitter:image': https://docs.usepylon.com/pylon-docs/~gitbook/ogimage/55LMxjNj9P2tA9DYAwaa
'twitter:title': Training Data | Pylon
---
Pylon's AI features, such as [Ask AI](/pylon-docs/platform/ask-ai), [Copilot](/pylon-docs/support-workflows/issues/copilot) for Issues, [AI Agents](/pylon-docs/ai-agents/overview), and [Federated Search](/pylon-docs/knowledge-base/search#federated-search), all benefit from access to more information.

By default, they have access to your past Pylon issues and your Pylon knowledge base. Make them even more powerful by connecting any external sources of knowledge you have!

## Setup

1. Go to the [Training Data](https://app.usepylon.com/settings/ai-controls/training-data) page in Settings and click "Import Data".
2. Select a method of importing data and follow the instructions.

<Frame>
<img alt="" class="block" data-testid="zoom-image" fetchpriority="high" height="415" sizes="(max-width: 640px) 400px, 768px" src="/images/d9e1f287.png" width="868"/>
</Frame>

* Your Pylon knowledge base and past issues are enabled by default and reindexed **live**.
* Public URLs, either individual webpages or a base URL to crawl, are reindexed every 7 or 30 days depending on your Pylon plan. Examples include:
  + Public-facing documentation not hosted by Pylon
  + Public Google Docs
  + Public GitHub repos
  + Your marketing website or pricing page

After importing, give Pylon a few moments to automatically scrape and index your content.

This data is now available to power all your AI features!

## API Integration

<Info>
For programmatic data ingestion, Pylon provides Training Data APIs that allow you to upload documents directly without publishing them to the public internet.
</Info>

### Upload Files

**Endpoint:** `POST /training-data/upload`

Upload files to a new or existing training data source via multipart form upload.

| Parameter | Required | Description |
|-----------|----------|-------------|
| `training_data_id` | No | ID of an existing data source to append to |
| `training_data_name` | Yes (for new sources) | Name for the new training data source |
| `visibility` | No | Access control: `everyone`, `ai_agent_only`, or `user_only` |
| `files` | Yes | One or more files to upload |

<Info>
**Limits:** Maximum 100MB per file, 500MB total per request.
</Info>
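
The limits above imply that larger uploads need to be split across several requests. The following is a minimal sketch of that planning step, not part of Pylon's API: the helper name `plan_upload_batches` is hypothetical, "MB" is assumed to mean 1,000,000 bytes (the more conservative reading), and multipart encoding overhead is ignored.

```python
from typing import Dict, List

# Assumption: "MB" is read as 1,000,000 bytes, the more conservative interpretation.
MB = 1_000_000
MAX_FILE_BYTES = 100 * MB      # per-file limit stated in the docs
MAX_REQUEST_BYTES = 500 * MB   # per-request limit stated in the docs


def plan_upload_batches(file_sizes: Dict[str, int]) -> List[List[str]]:
    """Group file names into batches that each fit within one upload request.

    Hypothetical helper: it only plans the batches and sends nothing.
    Raises ValueError if any single file exceeds the per-file limit.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            raise ValueError(f"{name} is {size} bytes, above the per-file limit")
        # Start a new batch when adding this file would exceed the request limit.
        if current and current_bytes + size > MAX_REQUEST_BYTES:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Each returned batch would then go out as one `POST /training-data/upload` request. Reusing the `training_data_id` of the source for every batch after the first should keep all files in a single source.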

### Upload Text Content

**Endpoint:** `POST /training-data/upload-content`

Upload text content directly as a file to a new or existing training data source.

| Parameter | Required | Description |
|-----------|----------|-------------|
| `training_data_id` | No | ID of an existing data source to append to |
| `training_data_name` | Yes (for new sources) | Name for the new training data source |
| `content` | Yes | Text content to upload (max 100MB) |
| `file_name` | Yes | Name for the uploaded file |
| `visibility` | No | Access control setting |
| `external_id` | No | Identifier used for idempotent updates; matched against existing documents |

<Info>
Supports form-encoded, JSON, and Zapier POST requests.
</Info>
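
As an illustration, a JSON request to this endpoint might be assembled as in the sketch below. It rests on several assumptions not confirmed on this page: the base URL `https://api.usepylon.com`, bearer-token authentication, and JSON keys that match the parameter names in the table. The function `build_upload_content_request` is a hypothetical helper, and it only builds the request without sending it.

```python
import json
import urllib.request
from typing import Optional

# Assumption: placeholder base URL; check the API reference for the real host.
API_BASE = "https://api.usepylon.com"


def build_upload_content_request(
    api_token: str,
    training_data_name: str,
    file_name: str,
    content: str,
    visibility: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST that creates a new training data source."""
    # Assumption: JSON keys mirror the parameter names in the table.
    payload = {
        "training_data_name": training_data_name,
        "file_name": file_name,
        "content": content,
    }
    if visibility is not None:
        payload["visibility"] = visibility
    return urllib.request.Request(
        url=f"{API_BASE}/training-data/upload-content",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: bearer-token auth; the header scheme is not stated on this page.
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Building the request separately makes it straightforward to inspect the payload first.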

### Key Behaviors

* **New sources**: When no `training_data_id` is provided, a new training data source is created
* **Appending**: When a `training_data_id` is supplied, content is appended to the existing source
* **Idempotent updates**: Use `external_id` to update existing documents without creating duplicates
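
The three behaviors above can be sketched as a small payload builder for `upload-content`. This is a hedged illustration rather than Pylon's implementation: `build_content_payload` is a hypothetical helper, the keys are assumed to match the documented parameter names, and preferring `training_data_id` when both an ID and a name are given is a choice made for this example.

```python
from typing import Dict, Optional


def build_content_payload(
    content: str,
    file_name: str,
    training_data_id: Optional[str] = None,
    training_data_name: Optional[str] = None,
    external_id: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble request fields following the documented new/append/idempotent rules."""
    payload = {"content": content, "file_name": file_name}
    if training_data_id is not None:
        # Appending: target an existing source by ID.
        payload["training_data_id"] = training_data_id
    elif training_data_name:
        # New source: no ID supplied, so a name is required.
        payload["training_data_name"] = training_data_name
    else:
        raise ValueError("provide training_data_id (append) or training_data_name (new source)")
    if external_id is not None:
        # Idempotent update: the same external_id should update the document, not duplicate it.
        payload["external_id"] = external_id
    return payload
```

For a document that changes over time, a stable `external_id` such as the document's ID in its source system lets each sync update the existing document.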

### Automation

A Zapier integration is available for automated document routing, so documents sync to Pylon automatically as they change in your source systems.

For full API details, see the [Training Data APIs](https://support.usepylon.com/articles/4399577046-open-preview-training-data-apis) support article.