diff --git a/pylon-docs/platform/training-data.mdx b/pylon-docs/platform/training-data.mdx
index da899f1..181a4e7 100644
--- a/pylon-docs/platform/training-data.mdx
+++ b/pylon-docs/platform/training-data.mdx
@@ -1,36 +1,91 @@
----
-description: Supercharge all of Pylon's AI features by connecting training data for
-  the AI to consume.
-icon: binary-circle-check
-'og:description': Supercharge all of Pylon's AI features by connecting training data
-  for the AI to consume.
-'og:image': https://docs.usepylon.com/pylon-docs/~gitbook/ogimage/55LMxjNj9P2tA9DYAwaa
-'og:title': Training Data | Pylon
-title: Training Data
-'twitter:description': Supercharge all of Pylon's AI features by connecting training
-  data for the AI to consume.
-'twitter:image': https://docs.usepylon.com/pylon-docs/~gitbook/ogimage/55LMxjNj9P2tA9DYAwaa
-'twitter:title': Training Data | Pylon
----
-Pylon's many AI features, like [Ask AI](/pylon-docs/platform/ask-ai), [Copilot](/pylon-docs/support-workflows/issues/copilot) for Issues, [AI Agents](/pylon-docs/ai-agents/overview), [Federated Search](/pylon-docs/knowledge-base/search#federated-search), and more all benefit from having access to more information.
-
-By default, they have access to your past Pylon issues and your Pylon Knowledge base. Make them even more powerful by connecting any external sources of knowledge you have!
-
-## Setup
-
-1. Go to the [Training Data](https://app.usepylon.com/settings/ai-controls/training-data) page in Settings and click "Import Data".
-2. Select a method of importing data and follow the instructions.
-
-
-
-
-
-* Your Pylon knowledge base and past issues will be enabled by default and reindexed **live**
-* Public URLs to individual webpages or a base url to crawl that will be reindexed every 7 or 30 days depending on your Pylon plan. Examples may include:
-  + Public-facing documentation not hosted by Pylon
-  + Public Google docs
-  + Public GitHub repos
-  + Your marketing website or pricing page
-* Give Pylon a few moments to automatically scrape and index all your content
-
-This data is now available to power all your AI features!
+---
+description: Supercharge all of Pylon's AI features by connecting training data for
+  the AI to consume.
+icon: binary-circle-check
+'og:description': Supercharge all of Pylon's AI features by connecting training data
+  for the AI to consume.
+'og:image': https://docs.usepylon.com/pylon-docs/~gitbook/ogimage/55LMxjNj9P2tA9DYAwaa
+'og:title': Training Data | Pylon
+title: Training Data
+'twitter:description': Supercharge all of Pylon's AI features by connecting training
+  data for the AI to consume.
+'twitter:image': https://docs.usepylon.com/pylon-docs/~gitbook/ogimage/55LMxjNj9P2tA9DYAwaa
+'twitter:title': Training Data | Pylon
+---
+
+Pylon's many AI features, like [Ask AI](/pylon-docs/platform/ask-ai), [Copilot](/pylon-docs/support-workflows/issues/copilot) for Issues, [AI Agents](/pylon-docs/ai-agents/overview), [Federated Search](/pylon-docs/knowledge-base/search#federated-search), and more all benefit from having access to more information.
+
+By default, they have access to your past Pylon issues and your Pylon Knowledge base. Make them even more powerful by connecting any external sources of knowledge you have!
+
+## Setup
+
+1. Go to the [Training Data](https://app.usepylon.com/settings/ai-controls/training-data) page in Settings and click "Import Data".
+2. Select a method of importing data and follow the instructions.
+
+
+
+
+
+* Your Pylon knowledge base and past issues will be enabled by default and reindexed **live**
+* Public URLs to individual webpages or a base url to crawl that will be reindexed every 7 or 30 days depending on your Pylon plan. Examples may include:
+  + Public-facing documentation not hosted by Pylon
+  + Public Google docs
+  + Public GitHub repos
+  + Your marketing website or pricing page
+* Give Pylon a few moments to automatically scrape and index all your content
+
+This data is now available to power all your AI features!
+
+## API Integration
+
+
+For programmatic data ingestion, Pylon provides Training Data APIs that allow you to upload documents directly without publishing them to the public internet.
+
+
+### Upload Files
+
+**Endpoint:** `POST /training-data/upload`
+
+Upload files to a new or existing training data source via multipart form upload.
+
+| Parameter | Required | Description |
+|-----------|----------|-------------|
+| `training_data_id` | No | ID of an existing data source to append to |
+| `training_data_name` | Yes (for new sources) | Name for the new training data source |
+| `visibility` | No | Access control: `everyone`, `ai_agent_only`, or `user_only` |
+| `files` | Yes | One or more files to upload |
+
+
+**Limits:** Maximum 100MB per file, 500MB total per request.
+
+
+### Upload Text Content
+
+**Endpoint:** `POST /training-data/upload-content`
+
+Upload text content directly as a file to a new or existing training data source.
+
+| Parameter | Required | Description |
+|-----------|----------|-------------|
+| `training_data_id` | No | ID of an existing data source to append to |
+| `training_data_name` | Yes (for new sources) | Name for the new training data source |
+| `content` | Yes | Text content to upload (max 100MB) |
+| `file_name` | Yes | Name for the uploaded file |
+| `visibility` | No | Access control setting |
+| `external_id` | No | For idempotent updates—matches existing documents |
+
+
+Supports form-encoded, JSON, and Zapier POST requests.
+
+
+### Key Behaviors
+
+* **New sources**: When no `training_data_id` is provided, a new training data source is created
+* **Appending**: When an ID is supplied, content is appended to the existing source
+* **Idempotent updates**: Use `external_id` to update existing documents without creating duplicates
+
+### Automation
+
+Zapier integration is available for automated document routing as your content updates. This allows you to automatically sync documents to Pylon as they change in your source systems.
+
+For full API details, see the [Training Data APIs](https://support.usepylon.com/articles/4399577046-open-preview-training-data-apis) support article.
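
A possible illustration for the new "Upload Text Content" and "Key Behaviors" sections: the Python sketch below assembles a JSON body for `POST /training-data/upload-content` using only the parameters listed in the table. The helper name `build_upload_content_payload` is hypothetical. The sketch also makes two assumptions that the page does not confirm: that `visibility` accepts the same three values listed for `/training-data/upload`, and that `training_data_name` can be omitted when `training_data_id` is supplied.

```python
from typing import Optional

# Assumption: upload-content accepts the same visibility values that the
# docs list for /training-data/upload.
ALLOWED_VISIBILITY = {"everyone", "ai_agent_only", "user_only"}


def build_upload_content_payload(
    content: str,
    file_name: str,
    training_data_id: Optional[str] = None,
    training_data_name: Optional[str] = None,
    visibility: Optional[str] = None,
    external_id: Optional[str] = None,
) -> dict:
    """Assemble a body for POST /training-data/upload-content (hypothetical helper)."""
    if not content or not file_name:
        raise ValueError("content and file_name are required")
    if training_data_id is None and not training_data_name:
        # No ID means a new source is created, which needs a name.
        raise ValueError("training_data_name is required for a new source")
    if visibility is not None and visibility not in ALLOWED_VISIBILITY:
        raise ValueError(f"unsupported visibility: {visibility!r}")

    payload = {"content": content, "file_name": file_name}
    if training_data_id is not None:
        # Appending: content is added to the existing source.
        payload["training_data_id"] = training_data_id
    else:
        # New source: the name is only sent when no ID is given (assumption).
        payload["training_data_name"] = training_data_name
    if visibility is not None:
        payload["visibility"] = visibility
    if external_id is not None:
        # Reusing the same external_id is documented to update the existing
        # document instead of creating a duplicate.
        payload["external_id"] = external_id
    return payload
```

The resulting dictionary could be sent as JSON or form-encoded data, both of which the page says are supported. The base URL and authentication scheme are not stated in this diff, so they are left out here; the linked support article is the place to confirm them.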