From 9cacf0e8f806fb21ce98f42ff82ee621a82ae713 Mon Sep 17 00:00:00 2001 From: Casper Nielsen Date: Mon, 11 May 2026 13:14:29 +0200 Subject: [PATCH 1/3] feat(configuration): add docs for git as config store Signed-off-by: Casper Nielsen --- .../git-configuration-store.md | 315 ++++++++++++++++++ 1 file changed, 315 insertions(+) create mode 100644 daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md diff --git a/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md b/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md new file mode 100644 index 00000000000..663059faf1d --- /dev/null +++ b/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md @@ -0,0 +1,315 @@ +--- +type: docs +title: "Git" +linkTitle: "Git" +description: Detailed information on the Git configuration store component +--- + +## Component format + +To set up a Git configuration store, create a component of type `configuration.git`. See [this guide]({{% ref "howto-manage-configuration.md#configure-a-dapr-configuration-store" %}}) on how to create and apply a configuration store configuration. + +```yaml +apiVersion: dapr.io/v1alpha1 +kind: Component +metadata: + name: +spec: + type: configuration.git + version: v1 + metadata: + - name: url + value: "https://github.com/example/agent-config.git" + # Optional: branch to track + - name: branch + value: "main" + # Optional: subdirectory inside the repo to scope + - name: path + value: "." 
+ # Optional: how often to poll the upstream for new commits + - name: pollInterval + value: "30s" + # Optional: file/agentYaml/prompty + - name: mappingMode + value: "file" + # Optional: none/pat/ssh/githubApp — auto-detected when empty + - name: authMode + value: "none" +``` + +{{% alert title="Warning" color="warning" %}} +The above example uses plaintext metadata values. When using an authenticated `authMode` (`pat`, `ssh`, `githubApp`), reference credentials from a [secret store]({{% ref component-secrets.md %}}) instead. See [Authentication](#authentication) below. +{{% /alert %}} + +## Spec metadata fields + +### General + +| Field | Required | Details | Example | +|-------|:--------:|---------|---------| +| `url` | Y | Git URL of the upstream repository. Supports `https://`, `ssh://`, `git@host:org/repo` (SCP-style), and `file://` schemes. URLs with the `git@` or `ssh://` prefix auto-select the SSH auth profile when `authMode` is empty. `http://` is rejected for any authenticated mode to prevent cleartext credential transmission. Embedding credentials inline (`https://user:tok@host/`) is also rejected — use the appropriate auth profile with a Dapr secret reference. | `"https://github.com/example/agent-config.git"` | +| `branch` | N | Branch to track. | `"main"` (default) | +| `path` | N | Subdirectory inside the repository to scope. Must be repo-relative (no leading `/` and no `..` components). | `"agents/weather"`, `"."` (default) | +| `depth` | N | Clone depth. `0` (default) performs a full clone. `go-git`'s shallow incremental fetch has known limitations; full clones are the safe choice for anything but trivial config repos. | `"0"` (default) | +| `pollInterval` | N | How often to poll the upstream for changes. Hard floor is `1s` for remote URLs; `file://` URLs may go down to `100ms`. Intervals below `5s` log a warning at startup. 
At the default `30s`, a single instance issues ~120 requests/h — well below GitHub's 5000/h PAT and 15000/h GitHub App limits. Multi-replica deployments multiply that rate (`effective_rate = 3600/pollInterval × replicas`). | `"30s"` (default) | +| `fetchTimeout` | N | Per-fetch timeout applied to fetch and ls-remote operations. | `"30s"` (default) | +| `includeHidden` | N | When `false` (default), files whose name begins with `.` are skipped during the worktree walk. The `.git` directory is **always** excluded regardless of this flag — credentials in `.git/config` (e.g. from an inline-credential URL) can never leak into the configuration items. | `"false"` (default) | +| `maxFileSize` | N | Maximum per-file size in bytes that the walker will read into memory. Files larger than this are skipped with a warning. Protects the sidecar from OOM if a large blob is accidentally committed. | `"1048576"` (1 MiB default) | +| `snapshotCacheSize` | N | Number of past snapshots to retain in the LRU cache used as diff bases when computing per-subscriber update events. Higher values reduce over-emit churn when many subscribers are at slightly different commit positions. | `"4"` (default) | +| `emitInitialState` | N | When `true` (default), `Subscribe` synchronously delivers the current snapshot to the handler before returning — callers don't need a separate `Get` + `Subscribe` pair. Set to `false` if the caller already has fresh state and would receive a duplicate. | `"true"` (default) | +| `mappingMode` | N | Strategy for mapping repository files to configuration items. Matching is case-insensitive. See [Mapping modes](#mapping-modes). | `"file"` (default), `"agentYaml"`, `"prompty"` | +| `authMode` | N | Authentication profile to use. Matching is case-insensitive. When empty, the component auto-detects: SSH-scheme URLs → `ssh`; `githubAppId` set → `githubApp`; `token` set → `pat`; otherwise `none`. See [Authentication](#authentication). 
| `"pat"` | + +### Personal Access Token (`authMode: pat`) + +| Field | Required | Details | Example | +|-------|:--------:|---------|---------| +| `token` | Y | Personal access token used to authenticate. For GitHub, use a fine-grained PAT with repository read access. Sent as the password in HTTP basic auth. | `"ghp_xxxxxxxxxxxx"` | +| `username` | N | Username sent with the PAT. Defaults to `"x-access-token"`, which is the GitHub-recommended placeholder. Other providers may require a real username. | `"x-access-token"` (default) | + +### SSH (`authMode: ssh`) + +| Field | Required | Details | Example | +|-------|:--------:|---------|---------| +| `sshUser` | N | SSH user used when connecting. | `"git"` (default) | +| `sshPrivateKey` | Y* | PEM-encoded SSH private key. Either `sshPrivateKey` or `sshPrivateKeyPath` must be set. | `"-----BEGIN OPENSSH PRIVATE KEY-----\n..."` | +| `sshPrivateKeyPath` | Y* | Path to a PEM-encoded SSH private key on disk. Mutually exclusive with `sshPrivateKey`. | `"/var/run/secrets/git-ssh-key"` | +| `sshPassphrase` | N | Passphrase for the SSH private key, if encrypted. | | +| `sshKnownHosts` | Y** | Inline OpenSSH known_hosts entries used to verify the remote host key. Either `sshKnownHosts` or `sshKnownHostsPath` must be set unless `sshInsecureIgnoreHostKey` is `true`. Hostname is bound to key — a key registered for one host will not match another. | `"github.com ssh-rsa AAAA..."` | +| `sshKnownHostsPath` | Y** | Path to an OpenSSH `known_hosts` file on disk. | `"/etc/ssh/ssh_known_hosts"` | +| `sshInsecureIgnoreHostKey` | N | **DANGEROUS.** Disable SSH host key verification. Only use for local development or testing. When enabled, the component logs a warning at startup. Never enable in production; a MITM attacker can intercept configuration values. | `"false"` (default) | + +`*` Exactly one of `sshPrivateKey` / `sshPrivateKeyPath` is required. 
+`**` Exactly one of `sshKnownHosts` / `sshKnownHostsPath` is required unless `sshInsecureIgnoreHostKey: true`. + +### GitHub App (`authMode: githubApp`) + +The component mints an RS256 JWT, exchanges it for a 1-hour installation token, and refreshes the token before expiry. + +| Field | Required | Details | Example | +|-------|:--------:|---------|---------| +| `githubAppId` | Y | Numeric GitHub App ID. | `"123456"` | +| `githubAppInstallationId` | Y | Numeric GitHub App installation ID for the target organisation or repository. | `"78901234"` | +| `githubAppPrivateKey` | Y* | PEM-encoded RSA private key for the GitHub App. Accepts both PKCS#1 (`RSA PRIVATE KEY`) and PKCS#8 (`PRIVATE KEY`) encodings — GitHub Apps may be downloaded in either form. | `"-----BEGIN RSA PRIVATE KEY-----\n..."` | +| `githubAppPrivateKeyPath` | Y* | Path to a PEM-encoded RSA private key on disk. Mutually exclusive with `githubAppPrivateKey`. | `"/var/run/secrets/github-app-key.pem"` | +| `githubAppApiBase` | N | Base URL of the GitHub API. Override for GitHub Enterprise Server. Must use `https://`. | `"https://api.github.com"` (default) | +| `githubAppRefreshSkew` | N | Refresh the installation token when it has less than this much time left before expiry. | `"5m"` (default) | + +`*` Exactly one of `githubAppPrivateKey` / `githubAppPrivateKeyPath` is required. + +## Authentication + +The `authMode` metadata field selects an authentication profile. When `authMode` is empty, the component auto-detects in this order: + +1. URL begins with `git@` or `ssh://` → `ssh` +2. `githubAppId` is set → `githubApp` +3. `token` is set → `pat` +4. Otherwise → `none` + +Sensitive fields (`token`, `sshPrivateKey`, `sshPassphrase`, `githubAppPrivateKey`) should be sourced from a [Dapr secret store]({{% ref component-secrets.md %}}). Embedding credentials directly in the URL (e.g. `https://user:tok@host/repo`) is rejected at component init — operators must use a structured auth profile. 
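
The detection order listed above, as documented in this revision of the page, can be sketched as follows. This is an illustrative Python sketch, not the component's Go implementation: the function name `detect_auth_mode`, the `CANONICAL_MODES` table, and the choice to raise `ValueError` for an unknown mode are all assumptions made for the example.

```python
# Hypothetical helper mirroring the documented auto-detection order.
# Not part of Dapr; names and error handling are assumptions.

CANONICAL_MODES = {
    "none": "none",
    "pat": "pat",
    "ssh": "ssh",
    "githubapp": "githubApp",
}


def detect_auth_mode(url: str, metadata: dict[str, str]) -> str:
    """Return the auth profile for a component's metadata."""
    explicit = metadata.get("authMode", "").strip()
    if explicit:
        # The docs state that authMode matching is case-insensitive.
        try:
            return CANONICAL_MODES[explicit.lower()]
        except KeyError:
            raise ValueError(f"unknown authMode: {explicit!r}") from None

    # 1. SSH-scheme URLs select the SSH profile.
    if url.startswith(("git@", "ssh://")):
        return "ssh"
    # 2. A GitHub App ID selects the GitHub App profile.
    if metadata.get("githubAppId"):
        return "githubApp"
    # 3. A token selects the PAT profile.
    if metadata.get("token"):
        return "pat"
    # 4. Nothing set: unauthenticated access.
    return "none"
```

Under these assumptions, an SSH-style URL wins over a `token` that happens to be set, because the URL check comes first in the documented order.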
+ +The `auth.secretStore` field at the bottom of each example below names the [configured secret store component]({{% ref supported-secret-stores %}}) Dapr should use to resolve the `secretKeyRef` entries in `metadata`. When running in Kubernetes with a Kubernetes secret store, this field defaults to `kubernetes` and can be omitted. See [How-To: Reference secrets in components]({{% ref component-secrets.md %}}) for details. + +### Example: PAT with secret reference + +```yaml +apiVersion: dapr.io/v1alpha1 +kind: Component +metadata: + name: configstore +spec: + type: configuration.git + version: v1 + metadata: + - name: url + value: "https://github.com/example/private-config.git" + - name: authMode + value: "pat" + - name: token + secretKeyRef: + name: github-pat + key: token +auth: + # Name of the configured secret store component that holds the secrets + # referenced above. Defaults to "kubernetes" in K8s deployments. + secretStore: +``` + +### Example: SSH with deploy key + +```yaml +apiVersion: dapr.io/v1alpha1 +kind: Component +metadata: + name: configstore +spec: + type: configuration.git + version: v1 + metadata: + - name: url + value: "git@github.com:example/private-config.git" + - name: authMode + value: "ssh" + - name: sshPrivateKey + secretKeyRef: + name: git-ssh-deploy-key + key: privateKey + - name: sshKnownHosts + secretKeyRef: + name: git-ssh-known-hosts + key: knownHosts +auth: + # Name of the configured secret store component that holds the secrets + # referenced above. Defaults to "kubernetes" in K8s deployments. 
+ secretStore: +``` + +### Example: GitHub App + +```yaml +apiVersion: dapr.io/v1alpha1 +kind: Component +metadata: + name: configstore +spec: + type: configuration.git + version: v1 + metadata: + - name: url + value: "https://github.com/example/private-config.git" + - name: authMode + value: "githubApp" + - name: githubAppId + value: "123456" + - name: githubAppInstallationId + value: "78901234" + - name: githubAppPrivateKey + secretKeyRef: + name: github-app-key + key: privateKey +auth: + # Name of the configured secret store component that holds the secrets + # referenced above. Defaults to "kubernetes" in K8s deployments. + secretStore: +``` + +## Mapping modes + +The `mappingMode` field selects how files in the repository become configuration items. Matching is case-insensitive. + +### `file` (default) + +Each file becomes one configuration item. The relative POSIX path is the key, the file contents are the value. + +```text +repo/ +├── agents/weather/agent_role.txt → key "agents/weather/agent_role.txt" +└── agents/weather/agent_goal.txt → key "agents/weather/agent_goal.txt" +``` + +This mode is the recommended choice when the consumer expects scalar configuration keys. + +### `agentYaml` + +Each `*.yaml`, `*.yml`, or `*.json` file is parsed as a flat top-level map. Each top-level field becomes a key prefixed by the filename stem with directory separators replaced by `_`. Non-YAML/JSON files are silently skipped. + +```yaml +# repo/agents/weather.yaml +agent_role: Weather expert +agent_goal: Help users plan trips +agent_instructions: + - be concise + - cite sources +``` + +Produces: + +```text +agents_weather/agent_role = "Weather expert" +agents_weather/agent_goal = "Help users plan trips" +agents_weather/agent_instructions = "- be concise\n- cite sources" (YAML-serialised) +``` + +Non-scalar field values round-trip via YAML re-serialisation — consumers can re-parse them with any YAML decoder. 
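
The key derivation described for `agentYaml` can be illustrated with a short sketch. It covers only how key names are built from a repo-relative path and a list of already-parsed top-level field names; parsing and YAML re-serialisation of values are left out. The function name `agent_yaml_keys` and the exact suffix comparison are assumptions for illustration, not the component's actual code.

```python
from pathlib import PurePosixPath

# File suffixes that agentYaml parses, per the description above.
PARSED_SUFFIXES = {".yaml", ".yml", ".json"}


def agent_yaml_keys(rel_path: str, field_names: list[str]) -> list[str]:
    """Build configuration keys for one file (hypothetical helper).

    rel_path is the repo-relative POSIX path; field_names are the
    top-level fields found in the parsed file.
    """
    path = PurePosixPath(rel_path)
    if path.suffix not in PARSED_SUFFIXES:
        # Non-YAML/JSON files are skipped in this revision of the docs.
        return []
    # Drop the extension, then replace directory separators with "_".
    stem = path.with_suffix("").as_posix().replace("/", "_")
    return [f"{stem}/{name}" for name in field_names]
```

For the `repo/agents/weather.yaml` example above, calling `agent_yaml_keys("agents/weather.yaml", ["agent_role", "agent_goal"])` yields the `agents_weather/...` keys shown in the "Produces" block.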
+ +### `prompty` + +Each `*.prompty` file's YAML frontmatter and body are split. Frontmatter fields produce `<stem>/<field>` keys (using the same directory-aware stem rules as `agentYaml`); the body is emitted as `<stem>/agent_system_prompt`. Non-`.prompty` files are skipped. + +```text +--- +name: Weather Agent +agent_role: Weather expert +agent_goal: Help users plan trips +--- +You are a friendly weather assistant. +``` + +Produces: + +```text +weather/name = "Weather Agent" +weather/agent_role = "Weather expert" +weather/agent_goal = "Help users plan trips" +weather/agent_system_prompt = "You are a friendly weather assistant." +``` + +## How it works + +### Polling + +On `Init`, the component clones the upstream repository into a temporary working directory and builds an initial snapshot from the worktree. A single polling goroutine then runs every `pollInterval`: + +1. Fetch the configured branch from the upstream. +2. If the remote tracking ref hasn't moved, do nothing. +3. Otherwise, hard-reset the worktree to the new tip, walk the files under `path`, run the configured mapping strategy, and install the new snapshot. +4. For each active subscriber, compute the diff against the snapshot the subscriber last saw and dispatch a notification. + +`Get` returns the most recently polled snapshot, which may be up to `pollInterval` old. It does not contact the upstream; use `Subscribe` to receive change notifications in near real-time. + +### Subscriptions + +When `emitInitialState` is `true` (the default), `Subscribe` synchronously delivers the current snapshot to the handler before returning. This means callers can issue `Subscribe` without a preceding `Get`. If the initial delivery fails, the subscription is rolled back and the error is returned. + +Per-subscriber diffs are computed against an LRU cache of the last `snapshotCacheSize` snapshots keyed by commit SHA. 
On an LRU miss (subscriber sat through more commits than the cache holds without delivery), the diff degrades to a one-shot over-emit — every key is emitted as added or changed, which is idempotent on the receiver. + +### Deletion semantics + +When a key is removed in the upstream repo, the notification includes: + +```json +{ + "value": "", + "version": "", + "metadata": {"deleted": "true"} +} +``` + +The `deleted: true` sentinel distinguishes a removed key from a key set to the empty string. This is the same shape used by the [Kubernetes ConfigMap configuration store]({{% ref kubernetes-configmap-configuration-store.md %}}). + +### Versioning + +The version on every emitted item is the short (7-character) commit SHA of the upstream tip at the time of the snapshot. + +### Security considerations + +- `http://` URLs are rejected when any authenticated mode is in use to prevent cleartext credential transmission. Use `https://`, `ssh://`, or `file://`. +- Inline credentials in the URL (`https://user:token@host/repo`) are rejected. Always use a structured auth profile sourced from a Dapr secret store. +- The `.git` directory is always excluded from the worktree walk, regardless of `includeHidden`. This prevents the remote URL and any credentials stored in `.git/config` from leaking into configuration items. +- `sshInsecureIgnoreHostKey: true` is supported for development but loud-logged at startup. Production deployments must always provide `sshKnownHosts` or `sshKnownHostsPath`. +- Polling rate cumulatively counts against the git provider's rate limit. Multi-replica deployments multiply request volume — see the `pollInterval` row in the metadata table for the calculation. + +{{% alert title="Note" color="primary" %}} +The component is **read-only**. It never writes to the upstream repository. Configuration changes must be made by committing to the repo through your normal git workflow (PR review, branch protection, etc.). 
+{{% /alert %}} + +## Limitations + +- **Single GitHub App installation per component.** Multi-tenant routing (different repos via different installations on the same component) is not supported. + +## Related links + +- [Basic schema for a Dapr component]({{% ref component-schema.md %}}) +- [Configuration building block]({{% ref configuration-api-overview.md %}}) +- Read [How-To: Manage configuration from a store]({{% ref "howto-manage-configuration.md" %}}) for instructions on how to use a configuration store. +- [GitHub: dapr/components-contrib `configuration/git`](https://github.com/dapr/components-contrib/tree/main/configuration/git) From 89a8d3f9161093c2aa090a42ac7faa04d87d854a Mon Sep 17 00:00:00 2001 From: Casper Nielsen Date: Tue, 12 May 2026 12:21:05 +0200 Subject: [PATCH 2/3] chore: update post contrib review comments Signed-off-by: Casper Nielsen --- .../git-configuration-store.md | 135 +++++++++--------- 1 file changed, 68 insertions(+), 67 deletions(-) diff --git a/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md b/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md index 663059faf1d..4f19b3220d0 100644 --- a/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md +++ b/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md @@ -18,7 +18,7 @@ spec: type: configuration.git version: v1 metadata: - - name: url + - name: remoteUrl value: "https://github.com/example/agent-config.git" # Optional: branch to track - name: branch @@ -26,19 +26,18 @@ spec: # Optional: subdirectory inside the repo to scope - name: path value: "." 
- # Optional: how often to poll the upstream for new commits - - name: pollInterval - value: "30s" # Optional: file/agentYaml/prompty - name: mappingMode value: "file" - # Optional: none/pat/ssh/githubApp — auto-detected when empty - - name: authMode - value: "none" + # Optional: how often to poll the upstream for new commits + - name: pollInterval + value: "5m" ``` +The authentication profile is **auto-detected** from which fields are set — there is no explicit `authMode` selector. See [Authentication](#authentication) for details. + {{% alert title="Warning" color="warning" %}} -The above example uses plaintext metadata values. When using an authenticated `authMode` (`pat`, `ssh`, `githubApp`), reference credentials from a [secret store]({{% ref component-secrets.md %}}) instead. See [Authentication](#authentication) below. +The above example has no credentials (suitable for a public repo or a `file://` URL). When using PAT, SSH, or GitHub App authentication, reference credentials from a [secret store]({{% ref component-secrets.md %}}) instead of embedding them inline. See [Authentication](#authentication) below. {{% /alert %}} ## Spec metadata fields @@ -47,66 +46,70 @@ The above example uses plaintext metadata values. When using an authenticated `a | Field | Required | Details | Example | |-------|:--------:|---------|---------| -| `url` | Y | Git URL of the upstream repository. Supports `https://`, `ssh://`, `git@host:org/repo` (SCP-style), and `file://` schemes. URLs with the `git@` or `ssh://` prefix auto-select the SSH auth profile when `authMode` is empty. `http://` is rejected for any authenticated mode to prevent cleartext credential transmission. Embedding credentials inline (`https://user:tok@host/`) is also rejected — use the appropriate auth profile with a Dapr secret reference. | `"https://github.com/example/agent-config.git"` | +| `remoteUrl` | Y | Git URL of the upstream remote (the same value `git remote get-url origin` would return for a clone). 
Supports `https://`, `ssh://`, `git@host:org/repo` (SCP-style), and `file://` schemes. URLs with the `git@` or `ssh://` prefix auto-select the SSH auth profile. `http://` is rejected for any authenticated profile to prevent cleartext credential transmission. Embedding credentials inline (`https://user:tok@host/`) is also rejected — supply them via the appropriate auth profile field backed by a configured secret store. | `"https://github.com/example/agent-config.git"` | | `branch` | N | Branch to track. | `"main"` (default) | -| `path` | N | Subdirectory inside the repository to scope. Must be repo-relative (no leading `/` and no `..` components). | `"agents/weather"`, `"."` (default) | -| `depth` | N | Clone depth. `0` (default) performs a full clone. `go-git`'s shallow incremental fetch has known limitations; full clones are the safe choice for anything but trivial config repos. | `"0"` (default) | -| `pollInterval` | N | How often to poll the upstream for changes. Hard floor is `1s` for remote URLs; `file://` URLs may go down to `100ms`. Intervals below `5s` log a warning at startup. At the default `30s`, a single instance issues ~120 requests/h — well below GitHub's 5000/h PAT and 15000/h GitHub App limits. Multi-replica deployments multiply that rate (`effective_rate = 3600/pollInterval × replicas`). | `"30s"` (default) | -| `fetchTimeout` | N | Per-fetch timeout applied to fetch and ls-remote operations. | `"30s"` (default) | +| `path` | N | Subdirectory inside the repository to treat as the configuration root. Files outside this directory are not surfaced as configuration items. Must be repo-relative (no leading `/`, no `..` components, no segment may be `.git`). | `"agents/weather"`, `"."` (default) | +| `depth` | N | Clone depth. `0` (default) performs a full clone, matching git's default behaviour when no `--depth` is passed. `go-git`'s shallow incremental fetch has known limitations; full clones are the safe choice for anything but trivial config repos. 
| `"0"` (default) | +| `pollInterval` | N | How often to poll the upstream for changes. Hard floor is `1s` for remote URLs; `file://` URLs may go down to `100ms`. Intervals below `5s` log a warning at startup. Default `5m` gives plenty of head-room against provider rate limits even on multi-replica deployments — at `5m` × 10 replicas you issue 120 requests/h, well below GitHub's 5000/h PAT and 15000/h GitHub App limits. If the upstream responds with HTTP 429 or a transport-level rate-limit error, the poll loop pauses for `rateLimitRetryAfter` (or the server-supplied `Retry-After` when available) before its next tick. | `"5m"` (default) | +| `rateLimitRetryAfter` | N | How long the poll loop waits before its next tick when the upstream responds with a rate-limit error and no `Retry-After` header was supplied. | `"5m"` (default) | +| `fetchTimeout` | N | Per-fetch timeout applied to fetch operations. | `"30s"` (default) | | `includeHidden` | N | When `false` (default), files whose name begins with `.` are skipped during the worktree walk. The `.git` directory is **always** excluded regardless of this flag — credentials in `.git/config` (e.g. from an inline-credential URL) can never leak into the configuration items. | `"false"` (default) | | `maxFileSize` | N | Maximum per-file size in bytes that the walker will read into memory. Files larger than this are skipped with a warning. Protects the sidecar from OOM if a large blob is accidentally committed. | `"1048576"` (1 MiB default) | -| `snapshotCacheSize` | N | Number of past snapshots to retain in the LRU cache used as diff bases when computing per-subscriber update events. Higher values reduce over-emit churn when many subscribers are at slightly different commit positions. | `"4"` (default) | +| `snapshotCacheSize` | N | Number of past snapshots to retain in the LRU cache used as diff bases when computing per-subscriber update events. 
Higher values reduce over-emit churn when many subscribers are at slightly different delivered HEAD values. With the default `pollInterval` of `5m`, the default of `4` covers ~20 minutes of commit history for the most-stale subscriber. | `"4"` (default) | | `emitInitialState` | N | When `true` (default), `Subscribe` synchronously delivers the current snapshot to the handler before returning — callers don't need a separate `Get` + `Subscribe` pair. Set to `false` if the caller already has fresh state and would receive a duplicate. | `"true"` (default) | -| `mappingMode` | N | Strategy for mapping repository files to configuration items. Matching is case-insensitive. See [Mapping modes](#mapping-modes). | `"file"` (default), `"agentYaml"`, `"prompty"` | -| `authMode` | N | Authentication profile to use. Matching is case-insensitive. When empty, the component auto-detects: SSH-scheme URLs → `ssh`; `githubAppId` set → `githubApp`; `token` set → `pat`; otherwise `none`. See [Authentication](#authentication). | `"pat"` | +| `mappingMode` | N | Strategy for mapping repository files to configuration items. Matching is case-insensitive. Non-matching files in scope are a **hard error**. See [Mapping modes](#mapping-modes). | `"file"` (default), `"agentYaml"`, `"prompty"` | -### Personal Access Token (`authMode: pat`) +### Personal Access Token + +Use the PAT profile by setting `token`. Works with both GitHub classic PATs (`ghp_…`) and fine-grained PATs (`github_pat_…`). | Field | Required | Details | Example | |-------|:--------:|---------|---------| -| `token` | Y | Personal access token used to authenticate. For GitHub, use a fine-grained PAT with repository read access. Sent as the password in HTTP basic auth. | `"ghp_xxxxxxxxxxxx"` | -| `username` | N | Username sent with the PAT. Defaults to `"x-access-token"`, which is the GitHub-recommended placeholder. Other providers may require a real username. 
| `"x-access-token"` (default) | +| `token` | Y | Personal access token used to authenticate. Sent as the password in HTTP basic auth. Source from a configured secret store. | `"ghp_xxxxxxxxxxxx"` | +| `username` | N | Username sent with the token. Defaults to `"x-access-token"`, which is the GitHub-recommended placeholder. Other providers may require a real username. | `"x-access-token"` (default) | + +### SSH -### SSH (`authMode: ssh`) +Use the SSH profile by setting an SSH-scheme `remoteUrl` (`git@…` or `ssh://…`) plus a private key. | Field | Required | Details | Example | |-------|:--------:|---------|---------| -| `sshUser` | N | SSH user used when connecting. | `"git"` (default) | -| `sshPrivateKey` | Y* | PEM-encoded SSH private key. Either `sshPrivateKey` or `sshPrivateKeyPath` must be set. | `"-----BEGIN OPENSSH PRIVATE KEY-----\n..."` | -| `sshPrivateKeyPath` | Y* | Path to a PEM-encoded SSH private key on disk. Mutually exclusive with `sshPrivateKey`. | `"/var/run/secrets/git-ssh-key"` | -| `sshPassphrase` | N | Passphrase for the SSH private key, if encrypted. | | -| `sshKnownHosts` | Y** | Inline OpenSSH known_hosts entries used to verify the remote host key. Either `sshKnownHosts` or `sshKnownHostsPath` must be set unless `sshInsecureIgnoreHostKey` is `true`. Hostname is bound to key — a key registered for one host will not match another. | `"github.com ssh-rsa AAAA..."` | -| `sshKnownHostsPath` | Y** | Path to an OpenSSH `known_hosts` file on disk. | `"/etc/ssh/ssh_known_hosts"` | -| `sshInsecureIgnoreHostKey` | N | **DANGEROUS.** Disable SSH host key verification. Only use for local development or testing. When enabled, the component logs a warning at startup. Never enable in production; a MITM attacker can intercept configuration values. | `"false"` (default) | +| `user` | N | SSH user used when connecting. Defaults to `"git"`, matching the convention used by GitHub, GitLab, Bitbucket, and most self-hosted providers. 
| `"git"` (default) | +| `privateKey` | Y* | PEM-encoded SSH private key. Either `privateKey` or `privateKeyPath` must be set. | `"-----BEGIN OPENSSH PRIVATE KEY-----\n..."` | +| `privateKeyPath` | Y* | Path to a PEM-encoded SSH private key on disk. Mutually exclusive with `privateKey`. | `"/var/run/secrets/git-ssh-key"` | +| `passphrase` | N | Passphrase for the SSH private key, if encrypted. | | +| `knownHosts` | Y** | Inline OpenSSH known_hosts entries used to verify the remote host key. Either `knownHosts` or `knownHostsPath` must be set unless `insecureIgnoreHostKey` is `true`. Hostname is bound to key — a key registered for one host will not match another. | `"github.com ssh-rsa AAAA..."` | +| `knownHostsPath` | Y** | Path to an OpenSSH `known_hosts` file on disk. | `"/etc/ssh/ssh_known_hosts"` | +| `insecureIgnoreHostKey` | N | **DANGEROUS.** Disable SSH host key verification. Only use for local development or testing. When enabled, the component logs a warning at startup. Never enable in production; a MITM attacker can intercept configuration values. | `"false"` (default) | -`*` Exactly one of `sshPrivateKey` / `sshPrivateKeyPath` is required. -`**` Exactly one of `sshKnownHosts` / `sshKnownHostsPath` is required unless `sshInsecureIgnoreHostKey: true`. +`*` Exactly one of `privateKey` / `privateKeyPath` is required. +`**` Exactly one of `knownHosts` / `knownHostsPath` is required unless `insecureIgnoreHostKey: true`. -### GitHub App (`authMode: githubApp`) +### GitHub App -The component mints an RS256 JWT, exchanges it for a 1-hour installation token, and refreshes the token before expiry. +Use the GitHub App profile by setting `appId`. The component mints an RS256 JWT, exchanges it for a 1-hour installation token, and refreshes the token before expiry. On HTTP 429 (or 403 with rate-limit headers) the component honours `Retry-After` and retries once; persistent rate-limit responses trigger the poll-loop back-off. 
| Field | Required | Details | Example | |-------|:--------:|---------|---------| -| `githubAppId` | Y | Numeric GitHub App ID. | `"123456"` | -| `githubAppInstallationId` | Y | Numeric GitHub App installation ID for the target organisation or repository. | `"78901234"` | -| `githubAppPrivateKey` | Y* | PEM-encoded RSA private key for the GitHub App. Accepts both PKCS#1 (`RSA PRIVATE KEY`) and PKCS#8 (`PRIVATE KEY`) encodings — GitHub Apps may be downloaded in either form. | `"-----BEGIN RSA PRIVATE KEY-----\n..."` | -| `githubAppPrivateKeyPath` | Y* | Path to a PEM-encoded RSA private key on disk. Mutually exclusive with `githubAppPrivateKey`. | `"/var/run/secrets/github-app-key.pem"` | -| `githubAppApiBase` | N | Base URL of the GitHub API. Override for GitHub Enterprise Server. Must use `https://`. | `"https://api.github.com"` (default) | -| `githubAppRefreshSkew` | N | Refresh the installation token when it has less than this much time left before expiry. | `"5m"` (default) | +| `appId` | Y | Numeric GitHub App ID. | `"123456"` | +| `installationId` | Y | Numeric GitHub App installation ID for the target organisation or repository. | `"78901234"` | +| `privateKey` | Y* | PEM-encoded RSA private key for the GitHub App. Accepts both PKCS#1 (`RSA PRIVATE KEY`) and PKCS#8 (`PRIVATE KEY`) encodings — GitHub Apps may be downloaded in either form. | `"-----BEGIN RSA PRIVATE KEY-----\n..."` | +| `privateKeyPath` | Y* | Path to a PEM-encoded RSA private key on disk. Mutually exclusive with `privateKey`. | `"/var/run/secrets/github-app-key.pem"` | +| `apiBase` | N | Base URL of the GitHub API. Override for GitHub Enterprise Server. Must use `https://`. | `"https://api.github.com"` (default) | +| `refreshSkew` | N | Refresh the installation token when it has less than this much time left before expiry. | `"5m"` (default) | -`*` Exactly one of `githubAppPrivateKey` / `githubAppPrivateKeyPath` is required. +`*` Exactly one of `privateKey` / `privateKeyPath` is required. 
## Authentication -The `authMode` metadata field selects an authentication profile. When `authMode` is empty, the component auto-detects in this order: +There is **no explicit `authMode` selector** — the active profile is inferred from which fields are set: -1. URL begins with `git@` or `ssh://` → `ssh` -2. `githubAppId` is set → `githubApp` -3. `token` is set → `pat` -4. Otherwise → `none` +1. `appId` set → **GitHub App** profile. +2. URL begins with `git@` or `ssh://` → **SSH** profile. +3. `token` set → **PAT** profile. +4. Otherwise → no auth (public HTTPS or local `file://`). -Sensitive fields (`token`, `sshPrivateKey`, `sshPassphrase`, `githubAppPrivateKey`) should be sourced from a [Dapr secret store]({{% ref component-secrets.md %}}). Embedding credentials directly in the URL (e.g. `https://user:tok@host/repo`) is rejected at component init — operators must use a structured auth profile. +Sensitive fields (`token`, `privateKey`, `passphrase`) should be sourced from a configured secret store via `secretKeyRef`. Embedding credentials directly in the URL (e.g. `https://user:tok@host/repo`) is rejected at component init — operators must use a structured auth profile. The `auth.secretStore` field at the bottom of each example below names the [configured secret store component]({{% ref supported-secret-stores %}}) Dapr should use to resolve the `secretKeyRef` entries in `metadata`. When running in Kubernetes with a Kubernetes secret store, this field defaults to `kubernetes` and can be omitted. See [How-To: Reference secrets in components]({{% ref component-secrets.md %}}) for details. 
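The profile inference order listed above can be sketched in Go as a precedence switch. This is a minimal illustration under stated assumptions: the function name `inferProfile` and the returned labels are hypothetical, and only the ordering (GitHub App, then SSH URL, then token, then no auth) comes from the list above.

```go
package main

import (
	"fmt"
	"strings"
)

// inferProfile mirrors the documented precedence for picking an auth profile.
// The name and the returned labels are illustrative assumptions, not the
// component's real identifiers.
func inferProfile(url, appID, token string) string {
	switch {
	case appID != "":
		return "githubApp" // 1. appId set
	case strings.HasPrefix(url, "git@") || strings.HasPrefix(url, "ssh://"):
		return "ssh" // 2. SSH-scheme URL
	case token != "":
		return "pat" // 3. token set
	default:
		return "none" // 4. public HTTPS or local file://
	}
}

func main() {
	fmt.Println(inferProfile("https://github.com/example/private-config.git", "123456", ""))
	fmt.Println(inferProfile("git@github.com:example/private-config.git", "", ""))
	fmt.Println(inferProfile("https://github.com/example/private-config.git", "", "ghp_xxxxxxxxxxxx"))
	fmt.Println(inferProfile("file:///srv/config", "", ""))
}
```

Because `appId` is checked first, a component that sets both `appId` and `token` would resolve to the GitHub App profile under this reading.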
@@ -121,10 +124,8 @@ spec: type: configuration.git version: v1 metadata: - - name: url + - name: remoteUrl value: "https://github.com/example/private-config.git" - - name: authMode - value: "pat" - name: token secretKeyRef: name: github-pat @@ -146,15 +147,13 @@ spec: type: configuration.git version: v1 metadata: - - name: url + - name: remoteUrl value: "git@github.com:example/private-config.git" - - name: authMode - value: "ssh" - - name: sshPrivateKey + - name: privateKey secretKeyRef: name: git-ssh-deploy-key key: privateKey - - name: sshKnownHosts + - name: knownHosts secretKeyRef: name: git-ssh-known-hosts key: knownHosts @@ -175,15 +174,13 @@ spec: type: configuration.git version: v1 metadata: - - name: url + - name: remoteUrl value: "https://github.com/example/private-config.git" - - name: authMode - value: "githubApp" - - name: githubAppId + - name: appId value: "123456" - - name: githubAppInstallationId + - name: installationId value: "78901234" - - name: githubAppPrivateKey + - name: privateKey secretKeyRef: name: github-app-key key: privateKey @@ -195,7 +192,7 @@ auth: ## Mapping modes -The `mappingMode` field selects how files in the repository become configuration items. Matching is case-insensitive. +The `mappingMode` field selects how files in the repository become configuration items. Matching is case-insensitive. **Non-matching files in the configured scope cause `Init` to fail** — if your scope contains a mix of file types, either narrow `path` to the homogeneous subset or use `mappingMode: file`. ### `file` (default) @@ -207,11 +204,11 @@ repo/ └── agents/weather/agent_goal.txt → key "agents/weather/agent_goal.txt" ``` -This mode is the recommended choice when the consumer expects scalar configuration keys. +Recommended when the consumer expects scalar configuration keys. ### `agentYaml` -Each `*.yaml`, `*.yml`, or `*.json` file is parsed as a flat top-level map. 
Each top-level field becomes a key prefixed by the filename stem with directory separators replaced by `_`. Non-YAML/JSON files are silently skipped. +Each `*.yaml`, `*.yml`, or `*.json` file is parsed as a flat top-level map. Each top-level field becomes a key prefixed by the filename stem with directory separators replaced by `_`. **Non-YAML/JSON files in scope cause `Init` to fail.** ```yaml # repo/agents/weather.yaml @@ -234,7 +231,7 @@ Non-scalar field values round-trip via YAML re-serialisation — consumers can r ### `prompty` -Each `*.prompty` file's YAML frontmatter and body are split. Frontmatter fields produce `/` keys (using the same directory-aware stem rules as `agentYaml`); the body is emitted as `/agent_system_prompt`. Non-`.prompty` files are skipped. +Each `*.prompty` file's YAML frontmatter and body are split. Frontmatter fields produce `/` keys (using the same directory-aware stem rules as `agentYaml`); the body is emitted as `/agent_system_prompt`. See the [Prompty spec](https://github.com/microsoft/prompty) for the file format. **Non-`.prompty` files in scope cause `Init` to fail.** ```text --- @@ -265,7 +262,7 @@ On `Init`, the component clones the upstream repository into a temporary working 3. Otherwise, hard-reset the worktree to the new tip, walk the files under `path`, run the configured mapping strategy, and install the new snapshot. 4. For each active subscriber, compute the diff against the snapshot the subscriber last saw and dispatch a notification. -`Get` returns the most-recently-polled snapshot and may be up to `pollInterval` old. It does not contact the upstream — use `Subscribe` to receive change notifications in near real-time. +`Get` returns the most-recently-polled snapshot and may be up to `pollInterval` old. It does not contact the upstream — use `Subscribe` to receive change notifications. 
### Subscriptions @@ -291,12 +288,16 @@ The `deleted: true` sentinel distinguishes a removed key from a key set to the e The version on every emitted item is the short (7-character) commit SHA of the upstream tip at the time of the snapshot. +### Rate limiting + +On HTTP 429 from the GitHub API (used by the GitHub App installation-token exchange), or a transport-level rate-limit error from `go-git`, the poll loop pauses for `rateLimitRetryAfter` (default `5m`) — or the server-supplied `Retry-After` when present — before its next tick. The component never retries a rate-limited response in a tight loop. + ### Security considerations -- `http://` URLs are rejected when any authenticated mode is in use to prevent cleartext credential transmission. Use `https://`, `ssh://`, or `file://`. -- Inline credentials in the URL (`https://user:token@host/repo`) are rejected. Always use a structured auth profile sourced from a Dapr secret store. -- The `.git` directory is always excluded from the worktree walk, regardless of `includeHidden`. This prevents the remote URL and any credentials stored in `.git/config` from leaking into configuration items. -- `sshInsecureIgnoreHostKey: true` is supported for development but loud-logged at startup. Production deployments must always provide `sshKnownHosts` or `sshKnownHostsPath`. +- `http://` URLs are rejected when any authenticated profile is in use to prevent cleartext credential transmission. Use `https://`, `ssh://`, or `file://`. +- Inline credentials in the URL (`https://user:token@host/repo`) are rejected. Always use a structured auth profile sourced from a configured secret store. +- The `.git` directory is always excluded from the worktree walk, regardless of `includeHidden`. This prevents the remote URL and any credentials stored in `.git/config` from leaking into configuration items. A `path` containing a `.git` segment is rejected at `Init`. 
+- `insecureIgnoreHostKey: true` is supported for development but loud-logged at startup. Production deployments must always provide `knownHosts` or `knownHostsPath`. - Polling rate cumulatively counts against the git provider's rate limit. Multi-replica deployments multiply request volume — see the `pollInterval` row in the metadata table for the calculation. {{% alert title="Note" color="primary" %}} From 29b9e036018538e1e16b04dc6094d2c30a9b4f7f Mon Sep 17 00:00:00 2001 From: Casper Nielsen Date: Tue, 12 May 2026 15:08:33 +0200 Subject: [PATCH 3/3] fix: address review comments Signed-off-by: Casper Nielsen --- .../git-configuration-store.md | 102 ++++++++++-------- 1 file changed, 56 insertions(+), 46 deletions(-) diff --git a/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md b/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md index 4f19b3220d0..efdbe5c8514 100644 --- a/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md +++ b/daprdocs/content/en/reference/components-reference/supported-configuration-stores/git-configuration-store.md @@ -7,6 +7,8 @@ description: Detailed information on the Git configuration store component ## Component format +The Git configuration store backs Dapr's configuration API with the contents of a git repository: each `Get`/`Subscribe` resolves against the most-recently polled snapshot of the configured branch. The three operator-facing knobs are the upstream location (`remoteUrl`), how often to poll for new commits (`pollInterval`), and how repository files are mapped to configuration items (`mappingMode`). + To set up a Git configuration store, create a component of type `configuration.git`. See [this guide]({{% ref "howto-manage-configuration.md#configure-a-dapr-configuration-store" %}}) on how to create and apply a configuration store configuration. 
```yaml @@ -26,12 +28,12 @@ spec: # Optional: subdirectory inside the repo to scope - name: path value: "." - # Optional: file/agentYaml/prompty - - name: mappingMode - value: "file" # Optional: how often to poll the upstream for new commits - name: pollInterval value: "5m" + # Optional: how repo files become config items — file | agentYaml | prompty + - name: mappingMode + value: "file" ``` The authentication profile is **auto-detected** from which fields are set — there is no explicit `authMode` selector. See [Authentication](#authentication) for details. @@ -42,52 +44,37 @@ The above example has no credentials (suitable for a public repo or a `file://` ## Spec metadata fields -### General - -| Field | Required | Details | Example | -|-------|:--------:|---------|---------| -| `remoteUrl` | Y | Git URL of the upstream remote (the same value `git remote get-url origin` would return for a clone). Supports `https://`, `ssh://`, `git@host:org/repo` (SCP-style), and `file://` schemes. URLs with the `git@` or `ssh://` prefix auto-select the SSH auth profile. `http://` is rejected for any authenticated profile to prevent cleartext credential transmission. Embedding credentials inline (`https://user:tok@host/`) is also rejected — supply them via the appropriate auth profile field backed by a configured secret store. | `"https://github.com/example/agent-config.git"` | -| `branch` | N | Branch to track. | `"main"` (default) | -| `path` | N | Subdirectory inside the repository to treat as the configuration root. Files outside this directory are not surfaced as configuration items. Must be repo-relative (no leading `/`, no `..` components, no segment may be `.git`). | `"agents/weather"`, `"."` (default) | -| `depth` | N | Clone depth. `0` (default) performs a full clone, matching git's default behaviour when no `--depth` is passed. `go-git`'s shallow incremental fetch has known limitations; full clones are the safe choice for anything but trivial config repos. 
| `"0"` (default) | -| `pollInterval` | N | How often to poll the upstream for changes. Hard floor is `1s` for remote URLs; `file://` URLs may go down to `100ms`. Intervals below `5s` log a warning at startup. Default `5m` gives plenty of head-room against provider rate limits even on multi-replica deployments — at `5m` × 10 replicas you issue 120 requests/h, well below GitHub's 5000/h PAT and 15000/h GitHub App limits. If the upstream responds with HTTP 429 or a transport-level rate-limit error, the poll loop pauses for `rateLimitRetryAfter` (or the server-supplied `Retry-After` when available) before its next tick. | `"5m"` (default) | -| `rateLimitRetryAfter` | N | How long the poll loop waits before its next tick when the upstream responds with a rate-limit error and no `Retry-After` header was supplied. | `"5m"` (default) | -| `fetchTimeout` | N | Per-fetch timeout applied to fetch operations. | `"30s"` (default) | -| `includeHidden` | N | When `false` (default), files whose name begins with `.` are skipped during the worktree walk. The `.git` directory is **always** excluded regardless of this flag — credentials in `.git/config` (e.g. from an inline-credential URL) can never leak into the configuration items. | `"false"` (default) | -| `maxFileSize` | N | Maximum per-file size in bytes that the walker will read into memory. Files larger than this are skipped with a warning. Protects the sidecar from OOM if a large blob is accidentally committed. | `"1048576"` (1 MiB default) | -| `snapshotCacheSize` | N | Number of past snapshots to retain in the LRU cache used as diff bases when computing per-subscriber update events. Higher values reduce over-emit churn when many subscribers are at slightly different delivered HEAD values. With the default `pollInterval` of `5m`, the default of `4` covers ~20 minutes of commit history for the most-stale subscriber. 
| `"4"` (default) | -| `emitInitialState` | N | When `true` (default), `Subscribe` synchronously delivers the current snapshot to the handler before returning — callers don't need a separate `Get` + `Subscribe` pair. Set to `false` if the caller already has fresh state and would receive a duplicate. | `"true"` (default) | -| `mappingMode` | N | Strategy for mapping repository files to configuration items. Matching is case-insensitive. Non-matching files in scope are a **hard error**. See [Mapping modes](#mapping-modes). | `"file"` (default), `"agentYaml"`, `"prompty"` | +The component infers which authentication profile is active from the fields you set (see [Authentication](#authentication)). The auth-profile tables come first; the general metadata fields apply to every profile. ### Personal Access Token -Use the PAT profile by setting `token`. Works with both GitHub classic PATs (`ghp_…`) and fine-grained PATs (`github_pat_…`). +Selected when `token` is set and no SSH-scheme URL or `appId` is present. Works with both GitHub classic PATs (`ghp_…`) and fine-grained PATs (`github_pat_…`); the token is sent as the password in HTTP basic auth. | Field | Required | Details | Example | |-------|:--------:|---------|---------| -| `token` | Y | Personal access token used to authenticate. Sent as the password in HTTP basic auth. Source from a configured secret store. | `"ghp_xxxxxxxxxxxx"` | -| `username` | N | Username sent with the token. Defaults to `"x-access-token"`, which is the GitHub-recommended placeholder. Other providers may require a real username. | `"x-access-token"` (default) | +| `token` | Y | Personal access token used to authenticate. | `"ghp_xxxxxxxxxxxx"` | +| `username` | N | Username sent with the token. Defaults to `"x-access-token"`, which GitHub recommends when using a PAT. Other providers may require a real username. 
| `"x-access-token"` (default) | ### SSH -Use the SSH profile by setting an SSH-scheme `remoteUrl` (`git@…` or `ssh://…`) plus a private key. +Selected when `remoteUrl` begins with `git@` or `ssh://`. | Field | Required | Details | Example | |-------|:--------:|---------|---------| -| `user` | N | SSH user used when connecting. Defaults to `"git"`, matching the convention used by GitHub, GitLab, Bitbucket, and most self-hosted providers. | `"git"` (default) | -| `privateKey` | Y* | PEM-encoded SSH private key. Either `privateKey` or `privateKeyPath` must be set. | `"-----BEGIN OPENSSH PRIVATE KEY-----\n..."` | +| `privateKey` | Y* | PEM-encoded SSH private key. | `"-----BEGIN OPENSSH PRIVATE KEY-----\n..."` | | `privateKeyPath` | Y* | Path to a PEM-encoded SSH private key on disk. Mutually exclusive with `privateKey`. | `"/var/run/secrets/git-ssh-key"` | | `passphrase` | N | Passphrase for the SSH private key, if encrypted. | | -| `knownHosts` | Y** | Inline OpenSSH known_hosts entries used to verify the remote host key. Either `knownHosts` or `knownHostsPath` must be set unless `insecureIgnoreHostKey` is `true`. Hostname is bound to key — a key registered for one host will not match another. | `"github.com ssh-rsa AAAA..."` | -| `knownHostsPath` | Y** | Path to an OpenSSH `known_hosts` file on disk. | `"/etc/ssh/ssh_known_hosts"` | -| `insecureIgnoreHostKey` | N | **DANGEROUS.** Disable SSH host key verification. Only use for local development or testing. When enabled, the component logs a warning at startup. Never enable in production; a MITM attacker can intercept configuration values. | `"false"` (default) | +| `user` | N | SSH user used when connecting. | `"git"` (default) | +| `knownHosts` | Y** | Inline OpenSSH `known_hosts` entries used to verify the remote host key. Hostname is bound to key — a key registered for one host will not match another. | `"github.com ssh-rsa AAAA..."` | +| `knownHostsPath` | Y** | Path to an OpenSSH `known_hosts` file on disk. 
Mutually exclusive with `knownHosts`. | `"/etc/ssh/ssh_known_hosts"` | +| `insecureIgnoreHostKey` | N | **DANGEROUS.** Disable SSH host-key verification. A loud warning is logged at startup when enabled; never use in production — a MITM attacker can intercept configuration values. | `"false"` (default) | `*` Exactly one of `privateKey` / `privateKeyPath` is required. `**` Exactly one of `knownHosts` / `knownHostsPath` is required unless `insecureIgnoreHostKey: true`. ### GitHub App -Use the GitHub App profile by setting `appId`. The component mints an RS256 JWT, exchanges it for a 1-hour installation token, and refreshes the token before expiry. On HTTP 429 (or 403 with rate-limit headers) the component honours `Retry-After` and retries once; persistent rate-limit responses trigger the poll-loop back-off. +Selected when `appId` is set. The component mints an RS256 JWT, exchanges it for a 1-hour installation token, and refreshes the token before expiry. | Field | Required | Details | Example | |-------|:--------:|---------|---------| @@ -100,16 +87,33 @@ Use the GitHub App profile by setting `appId`. The component mints an RS256 JWT, `*` Exactly one of `privateKey` / `privateKeyPath` is required. +### General + +| Field | Required | Details | Example | +|-------|:--------:|---------|---------| +| `remoteUrl` | Y | Git URL of the upstream repository — the same value `git remote get-url origin` would return for a clone. Supports `https://`, `ssh://`, `git@host:org/repo` (SCP-style), and `file://` schemes. `http://` is rejected when an authenticated profile is in use to prevent cleartext credential transmission. Embedding credentials inline (`https://user:tok@host/`) is rejected — supply them via the appropriate auth profile field backed by a Dapr secret reference. | `"https://github.com/example/agent-config.git"` | +| `branch` | N | Branch to track. | `"main"` (default) | +| `path` | N | Subdirectory inside the repository to treat as the configuration root. 
Files outside this directory are not surfaced. Must be repo-relative (no leading `/`, no `..` components, no segment equal to `.git`). | `"agents/weather"`, `"."` (default) | +| `depth` | N | Clone depth. `0` (default) performs a full clone. `go-git`'s shallow incremental fetch has known limitations; full clones are the safe choice for anything but trivial config repos. | `"0"` (default) | +| `pollInterval` | N | How often to poll the upstream for changes. Hard floor is `1s` for remote URLs; `file://` URLs may go down to `100ms`. Intervals below `5s` log a warning at startup. At the default `5m`, a single instance issues 12 requests/h — well below GitHub's 5000/h PAT and 15000/h GitHub App limits, with plenty of headroom for multi-replica deployments. | `"5m"` (default) | +| `rateLimitRetryAfter` | N | How long the poll loop pauses before its next tick after the upstream responds with a rate-limit error and no `Retry-After` header was supplied. Tune this if you're hitting secondary rate limits on a busy multi-replica deployment. | `"5m"` (default) | +| `fetchTimeout` | N | Per-fetch timeout applied to fetch operations. | `"30s"` (default) | +| `includeHidden` | N | When `false` (default), files whose name begins with `.` are skipped during the worktree walk. The `.git` directory is **always** excluded regardless of this flag — credentials in `.git/config` (e.g. from an inline-credential URL) can never leak into configuration items. | `"false"` (default) | +| `maxFileSize` | N | Maximum per-file size in bytes that the walker will read into memory. Files larger than this are skipped with a warning. Protects the sidecar from OOM if a large blob is accidentally committed. | `"1048576"` (1 MiB default) | +| `snapshotCacheSize` | N | Number of past snapshots to retain in the LRU cache used as diff bases when computing per-subscriber update events. Higher values reduce over-emit churn when many subscribers are at slightly different commit positions. 
| `"4"` (default) | +| `emitInitialState` | N | When `true` (default), `Subscribe` synchronously delivers the current snapshot to the handler before returning — callers don't need a separate `Get` + `Subscribe` pair. Set to `false` if the caller already has fresh state and would receive a duplicate. | `"true"` (default) | +| `mappingMode` | N | Strategy for mapping repository files to configuration items. Matching is case-insensitive. See [Mapping modes](#mapping-modes). | `"file"` (default), `"agentYaml"`, `"prompty"` | + ## Authentication -There is **no explicit `authMode` selector** — the active profile is inferred from which fields are set: +There is no explicit auth-mode selector — the active profile is inferred from which fields are set: -1. `appId` set → **GitHub App** profile. -2. URL begins with `git@` or `ssh://` → **SSH** profile. -3. `token` set → **PAT** profile. +1. `appId` is set → **GitHub App**. +2. `remoteUrl` begins with `git@` or `ssh://` → **SSH**. +3. `token` is set → **Personal Access Token**. 4. Otherwise → no auth (public HTTPS or local `file://`). -Sensitive fields (`token`, `privateKey`, `passphrase`) should be sourced from a configured secret store via `secretKeyRef`. Embedding credentials directly in the URL (e.g. `https://user:tok@host/repo`) is rejected at component init — operators must use a structured auth profile. +Fields marked as sensitive in the [component metadata schema](https://github.com/dapr/components-contrib/blob/main/configuration/git/metadata.yaml) (private keys, tokens, passphrases) should be sourced from a [Dapr secret store]({{% ref component-secrets.md %}}). Embedding credentials directly in the URL (e.g. `https://user:tok@host/repo`) is rejected at component init — operators must use a structured auth profile. 
The `auth.secretStore` field at the bottom of each example below names the [configured secret store component]({{% ref supported-secret-stores %}}) Dapr should use to resolve the `secretKeyRef` entries in `metadata`. When running in Kubernetes with a Kubernetes secret store, this field defaults to `kubernetes` and can be omitted. See [How-To: Reference secrets in components]({{% ref component-secrets.md %}}) for details. @@ -192,7 +196,7 @@ auth: ## Mapping modes -The `mappingMode` field selects how files in the repository become configuration items. Matching is case-insensitive. **Non-matching files in the configured scope cause `Init` to fail** — if your scope contains a mix of file types, either narrow `path` to the homogeneous subset or use `mappingMode: file`. +The `mappingMode` field selects how files in the repository become configuration items. Matching is case-insensitive. Under `agentYaml` and `prompty`, **any file in scope with an unrecognised extension causes `Init` to fail** — narrow `path` to a homogeneous subdirectory or use `mappingMode: file` for mixed content. ### `file` (default) @@ -204,11 +208,15 @@ repo/ └── agents/weather/agent_goal.txt → key "agents/weather/agent_goal.txt" ``` -Recommended when the consumer expects scalar configuration keys. +This mode is the recommended choice when the consumer expects scalar configuration keys. + +Keys are not length-limited by the component; very long repository paths produce equivalently long keys. If your consumer (or the Configuration API transport) enforces a key-length limit, narrow `path` or use a flatter directory layout. ### `agentYaml` -Each `*.yaml`, `*.yml`, or `*.json` file is parsed as a flat top-level map. Each top-level field becomes a key prefixed by the filename stem with directory separators replaced by `_`. **Non-YAML/JSON files in scope cause `Init` to fail.** +Accepted file extensions: `*.yaml`, `*.yml`, `*.json`. Any other file in scope (including `*.toml`) causes `Init` to fail. 
+ +Each accepted file is parsed as a flat top-level map. Each top-level field becomes a key prefixed by the filename stem with directory separators replaced by `_`. ```yaml # repo/agents/weather.yaml @@ -231,7 +239,9 @@ Non-scalar field values round-trip via YAML re-serialisation — consumers can r ### `prompty` -Each `*.prompty` file's YAML frontmatter and body are split. Frontmatter fields produce `/` keys (using the same directory-aware stem rules as `agentYaml`); the body is emitted as `/agent_system_prompt`. See the [Prompty spec](https://github.com/microsoft/prompty) for the file format. **Non-`.prompty` files in scope cause `Init` to fail.** +Accepted file extensions: `*.prompty`. Any other file in scope causes `Init` to fail. See the [Prompty spec](https://github.com/microsoft/prompty) for the file format. + +Each `*.prompty` file's YAML frontmatter and body are split. Frontmatter fields produce `/` keys (same directory-aware stem rules as `agentYaml`); the body is emitted as `/agent_system_prompt`. ```text --- @@ -259,10 +269,10 @@ On `Init`, the component clones the upstream repository into a temporary working 1. Fetch the configured branch from the upstream. 2. If the remote tracking ref hasn't moved, do nothing. -3. Otherwise, hard-reset the worktree to the new tip, walk the files under `path`, run the configured mapping strategy, and install the new snapshot. +3. Otherwise, hard-reset the worktree to the new tip — files that were removed upstream are dropped from the snapshot and emit deletion notifications to subscribers (see [Deletion semantics](#deletion-semantics)). No partial / additive update path exists. Walk the files under `path`, run the configured mapping strategy, and install the new snapshot. 4. For each active subscriber, compute the diff against the snapshot the subscriber last saw and dispatch a notification. -`Get` returns the most-recently-polled snapshot and may be up to `pollInterval` old. 
It does not contact the upstream — use `Subscribe` to receive change notifications. +`Get` returns the most-recently-polled snapshot and may be up to `pollInterval` old. It does not contact the upstream — use `Subscribe` to receive change notifications in near real-time. ### Subscriptions @@ -288,17 +298,17 @@ The `deleted: true` sentinel distinguishes a removed key from a key set to the e The version on every emitted item is the short (7-character) commit SHA of the upstream tip at the time of the snapshot. -### Rate limiting +### Rate-limit handling -On HTTP 429 from the GitHub API (used by the GitHub App installation-token exchange), or a transport-level rate-limit error from `go-git`, the poll loop pauses for `rateLimitRetryAfter` (default `5m`) — or the server-supplied `Retry-After` when present — before its next tick. The component never retries a rate-limited response in a tight loop. +On HTTP 429 from the GitHub API (used by the GitHub App installation-token exchange), or a transport-level rate-limit error from `go-git`, the poll loop pauses for `rateLimitRetryAfter` — or the server-supplied `Retry-After` value when present — before the next tick. The default of `5m` leaves headroom against secondary rate limits even on multi-replica deployments. ### Security considerations -- `http://` URLs are rejected when any authenticated profile is in use to prevent cleartext credential transmission. Use `https://`, `ssh://`, or `file://`. -- Inline credentials in the URL (`https://user:token@host/repo`) are rejected. Always use a structured auth profile sourced from a configured secret store. +- `http://` URLs are rejected when an authenticated profile is in use to prevent cleartext credential transmission. Use `https://`, `ssh://`, or `file://`. +- Inline credentials in the URL (`https://user:token@host/repo`) are rejected. Always use a structured auth profile sourced from a Dapr secret store. 
- The `.git` directory is always excluded from the worktree walk, regardless of `includeHidden`. This prevents the remote URL and any credentials stored in `.git/config` from leaking into configuration items. A `path` containing a `.git` segment is rejected at `Init`. - `insecureIgnoreHostKey: true` is supported for development but loud-logged at startup. Production deployments must always provide `knownHosts` or `knownHostsPath`. -- Polling rate cumulatively counts against the git provider's rate limit. Multi-replica deployments multiply request volume — see the `pollInterval` row in the metadata table for the calculation. +- Polling rate cumulatively counts against the git provider's rate limit. Multi-replica deployments multiply request volume; the `rateLimitRetryAfter` field controls back-off after a 429. {{% alert title="Note" color="primary" %}} The component is **read-only**. It never writes to the upstream repository. Configuration changes must be made by committing to the repo through your normal git workflow (PR review, branch protection, etc.). @@ -306,7 +316,7 @@ The component is **read-only**. It never writes to the upstream repository. Conf ## Limitations -- **Single GitHub App installation per component.** Multi-tenant routing (different repos via different installations on the same component) is not supported. +- **Single GitHub App installation per component.** The schema exposes one `appId` and one `installationId`; multi-tenant routing (different repos via different installations on the same component) is not supported. ## Related links