From 2bbaa5c4a1825bda79ade2e152fccf1f0bfe3767 Mon Sep 17 00:00:00 2001 From: nscuro Date: Tue, 2 Jun 2026 22:18:14 +0200 Subject: [PATCH] Add init tasks docs Signed-off-by: nscuro --- docs/concepts/architecture/deployment.md | 5 +- .../administration/configuring-database.md | 32 ++----- .../configuring-observability.md | 5 +- .../administration/upgrading-instances.md | 7 +- docs/guides/upgrading/v0.7.0-alpha.3.md | 8 +- docs/guides/upgrading/v5.0.0-rc.2.md | 2 +- docs/reference/configuration/.pages | 1 + docs/reference/configuration/database.md | 37 ++++---- docs/reference/configuration/datasources.md | 19 ++++ docs/reference/configuration/init-tasks.md | 86 +++++++++++++++++ docs/reference/index.md | 95 ++----------------- 11 files changed, 149 insertions(+), 148 deletions(-) create mode 100644 docs/reference/configuration/init-tasks.md diff --git a/docs/concepts/architecture/deployment.md b/docs/concepts/architecture/deployment.md index 03ee456..a34ce0b 100644 --- a/docs/concepts/architecture/deployment.md +++ b/docs/concepts/architecture/deployment.md @@ -38,8 +38,9 @@ flowchart LR - **Frontend.** Static Vue.js single-page app, typically served by the same load balancer or a CDN. Stateless. -Each instance exposes a separate management server for health and metrics that starts before init -tasks such as schema migration, so probes stay reachable while the main server initializes. +Each instance exposes a separate management server for health and metrics that starts before +[init tasks](../../reference/configuration/init-tasks.md) such as schema migration, so probes +stay reachable while the main server initializes. 
## Coordination diff --git a/docs/guides/administration/configuring-database.md b/docs/guides/administration/configuring-database.md index 31fe3c6..f1be0be 100644 --- a/docs/guides/administration/configuring-database.md +++ b/docs/guides/administration/configuring-database.md @@ -26,7 +26,7 @@ or extensions of it (for example [Neon] or [Timescale]), will most definitely wo The same is not necessarily true for platforms based on heavily modified PostgreSQL, or even entire re-implementations such as [CockroachDB] or [YugabyteDB]. Such solutions make certain trade-offs to achieve higher levels of scalability, -which might impact functionality that Dependency-Track relies on. If you'd like to see support for those, please [let us know]. +which might impact functionality that Dependency-Track relies on. ### Self-hosting @@ -172,12 +172,12 @@ centralised connection pools such as [PgBouncer]. Before proceeding, take note of the following constraints: * Only `session` and `transaction` pooling modes are supported. `transaction` is recommended. -* Initialisation tasks, which include database migrations, **must** connect to the +* [Init tasks](../../reference/configuration/init-tasks.md) **must** connect to the database directly, bypassing the connection pooler, when using pooling mode `transaction`. - * To prevent concurrent initialisation, session-level PostgreSQL advisory locks are used, - which are not supported with the `transaction` pooling mode. - * To facilitate this, initialisation tasks can be executed in dedicated containers, - and / or using separate data sources. + Init tasks coordinate across instances using session-level PostgreSQL advisory locks, + which are not supported with the `transaction` pooling mode. Route them through a + separate data source that connects to PostgreSQL directly, run them in a + dedicated container, or configure both. 
### Example @@ -224,25 +224,6 @@ services: DT_INIT_TASKS_DATASOURCE_NAME: "init" ``` -## Schema migration credentials - -It is possible to use different credentials for migrations than for the application itself. -This can be achieved by configuring a separate data source, and instructing the init tasks to use it: - -```ini linenums="1" -# Configure a data source named "init-tasks". -dt.datasource.init-tasks.url=jdbc:postgresql://localhost:5432/dtrack -dt.datasource.init-tasks.username=my-init-task-user -dt.datasource.init-tasks.password=my-init-task-pass - -# Configure init tasks to use the data source. -init.tasks.datasource.name=init-tasks - -# If the data source is only meant for init tasks, there is no need to -# keep it around after the tasks completed. -init.tasks.datasource.close-after-use=true -``` - [Autovacuum]: https://www.postgresql.org/docs/current/routine-vacuuming.html [CockroachDB]: https://www.cockroachlabs.com/ [Neon]: https://neon.tech/ @@ -251,6 +232,5 @@ init.tasks.datasource.close-after-use=true [Postgres Operator]: https://github.com/zalando/postgres-operator [Timescale]: https://www.timescale.com/ [YugabyteDB]: https://www.yugabyte.com/ -[let us know]: https://github.com/DependencyTrack/hyades/issues/new?assignees=&labels=enhancement&projects=&template=enhancement-request.yml [list of well-known commercial hosting providers]: https://www.postgresql.org/support/professional_hosting/ [official upgrading guide]: https://www.postgresql.org/docs/current/upgrading.html diff --git a/docs/guides/administration/configuring-observability.md b/docs/guides/administration/configuring-observability.md index b5a1e2d..ab15e41 100644 --- a/docs/guides/administration/configuring-observability.md +++ b/docs/guides/administration/configuring-observability.md @@ -41,7 +41,10 @@ containers: periodSeconds: 5 ``` -The aggregate endpoint `/health` returns the combined status of all checks. 
+The startup probe at `/health/started` reports per-task progress while +[init tasks](../../reference/configuration/init-tasks.md) run, then turns +healthy once the main server is ready. The aggregate endpoint `/health` +returns the combined status of all checks. ## Enabling Prometheus metrics scraping diff --git a/docs/guides/administration/upgrading-instances.md b/docs/guides/administration/upgrading-instances.md index 820f6fb..414952f 100644 --- a/docs/guides/administration/upgrading-instances.md +++ b/docs/guides/administration/upgrading-instances.md @@ -11,7 +11,7 @@ The durable execution engine, leader election, and to replace one API server instance at a time as long as: - The release notes describe no breaking schema changes that require all instances to stop. -- The destination version's Liquibase migrations are additive against the running schema. +- The destination version's Flyway migrations are additive against the running schema. - More than one instance is running. A single-instance deployment cannot upgrade without downtime. If any of these conditions fail, plan a full-stop upgrade window instead. @@ -23,7 +23,7 @@ If any of these conditions fail, plan a full-stop upgrade window instead. full-stop upgrade if any release calls for one. - If you run schema migrations in a dedicated container (recommended for large deployments and PgBouncer in transaction mode), plan to run it before any new-version API server starts. See - [Schema migration credentials](configuring-database.md#schema-migration-credentials). + [init-only containers](../../reference/configuration/init-tasks.md#init-only-containers). ## Rolling upgrade @@ -38,9 +38,6 @@ condition listed earlier is what makes that safe: the old version must keep work migrated schema for the duration of the roll-out. Cap the roll-out to a single deploy window. Do not leave the cluster running mixed versions indefinitely. 
-If a migration fails partway and leaves the Liquibase change-log lock held, clear the lock in -PostgreSQL before retrying the upgrade. - ## Web/worker split When the cluster [splits API and worker traffic](scaling.md#separate-api-traffic-from-background-work), diff --git a/docs/guides/upgrading/v0.7.0-alpha.3.md b/docs/guides/upgrading/v0.7.0-alpha.3.md index 0fe27ae..c824fee 100644 --- a/docs/guides/upgrading/v0.7.0-alpha.3.md +++ b/docs/guides/upgrading/v0.7.0-alpha.3.md @@ -140,7 +140,8 @@ Update your configurations at your earliest convenience. * **Health and metrics endpoints have moved to a dedicated management server** that listens - on a separate port (default `9000`). The management server starts before init tasks, making + on a separate port (default `9000`). The management server starts before + [init tasks](../../reference/configuration/init-tasks.md), making health probes available during the entire startup sequence. | Endpoint | Before | After | @@ -160,7 +161,8 @@ * `init.tasks.database.url` * `init.tasks.database.username` * `init.tasks.database.password` -* Refer to [schema migration credentials](../administration/configuring-database.md#schema-migration-credentials) - for an example of how to run init tasks with separate database credentials. +* For current init task data source configuration details, see + [Init tasks](../../reference/configuration/init-tasks.md#data-source). + (This release uses `init.tasks.datasource.name`.) 
[hyades/#1910]: https://github.com/DependencyTrack/hyades/issues/1910 diff --git a/docs/guides/upgrading/v5.0.0-rc.2.md b/docs/guides/upgrading/v5.0.0-rc.2.md index 335a8ab..4dffba6 100644 --- a/docs/guides/upgrading/v5.0.0-rc.2.md +++ b/docs/guides/upgrading/v5.0.0-rc.2.md @@ -78,7 +78,7 @@ | `dt.http.timeout.connection` (seconds) | `dt.http.connect-timeout-ms` (milliseconds, default `30000`) | | `dt.no.proxy` | `dt.http.proxy.exclusions` | - **Init tasks** + **[Init tasks](../../reference/configuration/init-tasks.md)** | Old | New | | --- | --- | diff --git a/docs/reference/configuration/.pages b/docs/reference/configuration/.pages index 3917c38..df43778 100644 --- a/docs/reference/configuration/.pages +++ b/docs/reference/configuration/.pages @@ -5,6 +5,7 @@ nav: - Database: database.md - Durable execution engine: dex-engine.md - File storage: file-storage.md + - Init tasks: init-tasks.md - Task scheduler: task-scheduler.md - Telemetry: telemetry.md - All properties: properties.md diff --git a/docs/reference/configuration/database.md b/docs/reference/configuration/database.md index 10789e6..50adff1 100644 --- a/docs/reference/configuration/database.md +++ b/docs/reference/configuration/database.md @@ -2,32 +2,32 @@ Dependency-Track requires a [PostgreSQL], or PostgreSQL-compatible database to operate. -The lowest supported version is 14. You are encouraged to use the [newest available version]. +The lowest supported version is 14. Prefer the [newest available version]. For guidance on choosing a hosting solution, deploying, and tuning PostgreSQL, see the [database configuration guide](../../guides/administration/configuring-database.md). ## Extensions -The following PostgreSQL [extensions](https://www.postgresql.org/docs/current/external-extensions.html) -are **required** by Dependency-Track. When choosing a hosting solution, verify that the extensions listed -here are supported. 
+Dependency-Track **requires** the following PostgreSQL +[extensions](https://www.postgresql.org/docs/current/external-extensions.html). +When choosing a hosting solution, verify it supports them. * [`pg_trgm`](https://www.postgresql.org/docs/current/pgtrgm.html): *Support for similarity of text using trigram matching* !!! note - Dependency-Track will execute the necessary `CREATE EXTENSION IF NOT EXISTS` statements - during [schema migration](#schema-migrations). Enabling extensions manually is not necessary. + Dependency-Track executes the necessary `CREATE EXTENSION IF NOT EXISTS` statements + during [schema migration](#schema-migrations). You do not need to enable extensions manually. -Generally, usage of extensions is limited to those that: +Dependency-Track limits extension usage to those that: 1. Ship with PostgreSQL [out-of-the-box](https://www.postgresql.org/docs/current/contrib.html) 2. Are [trusted](https://www.postgresql.org/about/featurematrix/detail/347/) by default -## Tuning Parameters +## Tuning parameters -The following PostgreSQL parameters are recommended for Dependency-Track deployments. -For context on when and why to apply these, see the +Dependency-Track recommends the following PostgreSQL parameters for production +deployments. For context on when and why to apply these, see the [advanced tuning guide](../../guides/administration/configuring-database.md#advanced-tuning). ### `autovacuum_vacuum_scale_factor` @@ -107,18 +107,13 @@ For context on when and why to apply these, see the -## Schema Migrations +## Schema migrations -Schema migrations are performed automatically by the API server upon startup using [Liquibase]. -Usually no manual action is required when upgrading from an older Dependency-Track version, unless explicitly -stated otherwise in the release notes. +By default, schema migrations run on startup as an [init task](init-tasks.md), using [Flyway]. 
+Upgrading from an older Dependency-Track version requires no manual action, +unless the [upgrade guides](../../guides/upgrading/index.md) explicitly state +otherwise. -This behaviour can be turned off by setting [`init.tasks.enabled`](properties.md#dtinittasksenabled) -on the API server container to `false`. - -For configuring separate migration credentials, see the -[schema migration credentials guide](../../guides/administration/configuring-database.md#schema-migration-credentials). - -[Liquibase]: https://www.liquibase.com/ +[Flyway]: https://www.red-gate.com/products/flyway/ [PostgreSQL]: https://www.postgresql.org/ [newest available version]: https://www.postgresql.org/support/versioning/ diff --git a/docs/reference/configuration/datasources.md b/docs/reference/configuration/datasources.md index f9998c6..417c64e 100644 --- a/docs/reference/configuration/datasources.md +++ b/docs/reference/configuration/datasources.md @@ -51,6 +51,24 @@ via [`dt.secret-management.database.datasource.name`](properties.md#dtsecret-man data sources as you like, but unless they're being used by a feature, they will not be created. +## Privileges + +The user configured for the `default` data source must hold privileges to perform +DDL against the Dependency-Track schema. The API server issues DDL during normal +operation: + +* On startup, by running [init tasks](init-tasks.md) when configured to use the `default` + data source (schema migrations, extension creation, partition setup, seeding). +* At runtime, by creating and dropping [table partitions] to manage + [time series metrics retention](../../concepts/time-series-metrics.md#daily-partitions-and-bounded-retention). + +!!! warning + Configuring an unprivileged user for the `default` data source is not supported. + A user restricted to `SELECT`, `INSERT`, `UPDATE`, and `DELETE` will cause runtime + failures even after a successful startup. + +Grant the user ownership of the database, or of the Dependency-Track schema within it. 
+ ## Connection Pool Properties The following properties control local connection pooling per data source: @@ -73,3 +91,4 @@ For centralised connection pooling with PgBouncer, see the [database configuration guide](../../guides/administration/configuring-database.md#centralised-connection-pooling). [configuration reference]: database.md +[table partitions]: https://www.postgresql.org/docs/current/ddl-partitioning.html diff --git a/docs/reference/configuration/init-tasks.md b/docs/reference/configuration/init-tasks.md new file mode 100644 index 0000000..fd9631d --- /dev/null +++ b/docs/reference/configuration/init-tasks.md @@ -0,0 +1,86 @@ +# Init tasks + +Init tasks are one-time operations the API server performs at startup, before +the main server begins serving traffic. They prepare the database schema, seed +default objects, and bring the durable execution engine to a consistent state. + +## Lifecycle + +Init tasks run before the HTTP listener starts and before background workers +initialize. They block startup. If any task fails, the JVM exits and the +container does not begin serving traffic. + +The management server starts before init tasks run and exposes the +[startup health endpoint](../../guides/administration/configuring-observability.md#configuring-kubernetes-health-probes) +at `/health/started`, which reports per-task status (`STARTED`, `COMPLETED`, +`FAILED`) during execution. + +By default, init tasks run in every API server container. To coordinate across +instances, the executor acquires a session-level PostgreSQL [advisory lock], +so only one container performs each task at a time. Remaining containers wait +for the lock, then confirm there is nothing left to do. + +!!! warning + Session-level advisory locks are incompatible with PgBouncer in + `transaction` pooling mode. When using a transaction-mode connection + pooler, configure a separate [init data source](#data-source) that + connects directly to PostgreSQL. 
+ +## Tasks + +| Task | Purpose | Enable property | +|------|---------|-----------------| +| Database migration | Runs [Flyway](database.md#schema-migrations) migrations against the Dependency-Track schema. Creates required [extensions](database.md#extensions). | [`dt.init-task.database-migration.enabled`](properties.md#dtinit-taskdatabase-migrationenabled) | +| Database partition maintenance | Creates [table partitions] required for [time series metrics](../../concepts/time-series-metrics.md). | [`dt.init-task.database-partition-maintenance.enabled`](properties.md#dtinit-taskdatabase-partition-maintenanceenabled) | +| Database seeding | Populates default permissions, teams, users, licenses, license groups, repositories, and configuration properties. Idempotent. Skips on later startups when the build identifier has not changed. | [`dt.init-task.database-seeding.enabled`](properties.md#dtinit-taskdatabase-seedingenabled) | +| Dex engine database migration | Runs schema migrations for the [durable execution engine](dex-engine.md). | [`dt.init-task.dex-engine-database-migration.enabled`](properties.md#dtinit-taskdex-engine-database-migrationenabled) | + +Per-task enable properties take effect only when +[`dt.init-tasks.enabled`](properties.md#dtinit-tasksenabled) is `true`. + +## Data source + +By default, init tasks use the `default` [data source](datasources.md). +Override this with +[`dt.init-tasks.datasource.name`](properties.md#dtinit-tasksdatasourcename) +to route them through a separate connection pool. + +The most common reason: centralized connection pooling with PgBouncer in +`transaction` mode, which requires init tasks to bypass the pooler and +connect to PostgreSQL directly. The default data source connects through +PgBouncer, while the init data source connects to PostgreSQL directly so +session-level advisory locks remain usable. 
See +[centralized connection pooling](../../guides/administration/configuring-database.md#centralised-connection-pooling) +for an example. + +When the init data source exists solely to serve init tasks, set +[`dt.init-tasks.datasource.close-after-completion`](properties.md#dtinit-tasksdatasourceclose-after-completion) +to `true`. The API server closes the connection pool once tasks finish, +freeing its connections. + +## Init-only containers + +Setting +[`dt.init-tasks.exit-after-completion`](properties.md#dtinit-tasksexit-after-completion) +to `true` causes the JVM to exit with status `0` once init tasks succeed, +without starting the main server. This supports a dedicated init container +pattern, where a short-lived container runs init tasks and exits before +long-lived API server containers start. + +In this pattern, the long-lived containers set +[`dt.init-tasks.enabled`](properties.md#dtinit-tasksenabled) to `false` to +skip init tasks on startup. + +## Configuration + +* [`dt.init-tasks.enabled`](properties.md#dtinit-tasksenabled) +* [`dt.init-tasks.datasource.name`](properties.md#dtinit-tasksdatasourcename) +* [`dt.init-tasks.datasource.close-after-completion`](properties.md#dtinit-tasksdatasourceclose-after-completion) +* [`dt.init-tasks.exit-after-completion`](properties.md#dtinit-tasksexit-after-completion) +* [`dt.init-task.database-migration.enabled`](properties.md#dtinit-taskdatabase-migrationenabled) +* [`dt.init-task.database-partition-maintenance.enabled`](properties.md#dtinit-taskdatabase-partition-maintenanceenabled) +* [`dt.init-task.database-seeding.enabled`](properties.md#dtinit-taskdatabase-seedingenabled) +* [`dt.init-task.dex-engine-database-migration.enabled`](properties.md#dtinit-taskdex-engine-database-migrationenabled) + +[advisory lock]: https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS +[table partitions]: https://www.postgresql.org/docs/current/ddl-partitioning.html diff --git a/docs/reference/index.md 
b/docs/reference/index.md index a15640b..aeef58a 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -1,92 +1,9 @@ # Reference -Reference documentation describes Dependency-Track's technical interfaces, -configuration properties, and data schemas. It is designed to be consulted -rather than read from start to finish. +Technical descriptions of Dependency-Track's interfaces, configuration +properties, and data schemas. Look up what you need rather than reading top +to bottom. -For step-by-step instructions, see [Guides](../guides/index.md). -For background and explanations, see [Concepts](../concepts/index.md). - -## API - -- [REST API v1](api/v1.md) -- [REST API v2](api/v2.md) - -## Configuration - -- [Application](configuration/application.md) -- - general application settings and MicroProfile Config sources -- [Data Sources](configuration/datasources.md) -- - database connection and pool configuration -- [File Storage](configuration/file-storage.md) -- - local and S3-compatible storage providers -- [Database](configuration/database.md) -- - PostgreSQL requirements, extensions, and tuning parameters -- [All Properties](configuration/properties.md) -- - complete generated registry of all application properties - -## Datasources - -For background on what data sources contribute and how to enable them, see -[About vulnerability data sources](../concepts/about-vulnerability-data-sources.md) and -[Configuring vulnerability sources](../guides/administration/configuring-vulnerability-sources.md). 
- -- [Datasources overview](datasources/index.md) -- - mirrored sources (NVD, GitHub Advisories, OSV) and the other sources Dependency-Track integrates with -- [Private Vulnerability Repository](datasources/private-vulnerability-repository.md) -- - internally managed vulnerabilities for proprietary components -- [Repositories](datasources/repositories.md) -- - package registries for outdated component detection -- [Internal Components](datasources/internal-components.md) -- - excluding first-party components from external analysis - -## Notifications - -- [Publishers](notifications/publishers.md) -- - email, Jira, Kafka, Webhook, and other publisher options -- [Groups](notifications/groups.md) -- - the catalog of events Dependency-Track emits notifications for -- [Filter Expressions](notifications/filter-expressions.md) -- - CEL-based notification filtering - -## Vulnerability Analysis - -- [Vulnerability Analyzers](analyzers.md) -- - internal and external analyzers used to identify vulnerabilities - -## Policies - -- [Policies](policies/index.md) -- - overview of component and vulnerability policies -- [Component Policies](policies/component-policies.md) -- - field definitions, condition subjects, operators, and assignment -- [Vulnerability Policies](policies/vulnerability-policies.md) -- - field definitions, bundle YAML schema, and sync configuration -- [Condition Expressions](policies/condition-expressions.md) -- - inputs and custom functions for policy conditions - -## CEL Expressions - -- [CEL Expressions](cel-expressions.md) -- - shared CEL syntax primer used by both policies and notification filters - -## Access Control - -- [Permissions](permissions.md) -- - users, teams, API keys, and the full permissions table - -## Integrations - -- [Badges](badges.md) -- - SVG badges for embedding vulnerability and policy metrics -- [File Formats](file-formats.md) -- - CycloneDX BOM/VEX/VDR and Finding Packaging Format (FPF) -- [Community 
Integrations](community-integrations.md) -- - third-party tools and libraries built on the Dependency-Track API - -## Schemas - -- [Notification Schema](schemas/notification.md) -- - Protobuf definitions for notification subjects -- [Policy Schema](schemas/policy.md) -- - Protobuf definitions for the policy CEL evaluation context +Use the navigation to browse by topic. For step-by-step instructions, see +[Guides](../guides/index.md). For background and explanations, see +[Concepts](../concepts/index.md).
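
A minimal sketch of the init-only container pattern that the new `init-tasks.md` page describes, assuming properties-style configuration. The `dt.init-tasks.*` names are taken from that page. The `dt.datasource.init.*` keys are an assumption: they follow the `dt.datasource.<name>.*` pattern of the example removed from `configuring-database.md`, and the data source name `init`, the host, and the credentials are placeholders. Check each key against `properties.md` before relying on it.

Configuration for the short-lived init container:

```ini
# Data source named "init" (placeholder name) that connects to PostgreSQL
# directly, bypassing PgBouncer, so session-level advisory locks work.
dt.datasource.init.url=jdbc:postgresql://postgres:5432/dtrack
dt.datasource.init.username=my-init-task-user
dt.datasource.init.password=my-init-task-pass

# Route init tasks through that data source.
dt.init-tasks.datasource.name=init

# Exit with status 0 once init tasks succeed, without starting the main server.
dt.init-tasks.exit-after-completion=true
```

Configuration for the long-lived API server containers:

```ini
# Skip init tasks on startup; the init container has already run them.
dt.init-tasks.enabled=false
```

Keeping the two configurations separate matters: setting `dt.init-tasks.enabled=false` in the init container would turn it into a no-op, and the long-lived containers should only start after the init container has exited successfully.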