feat(worker): export GMS host/port/protocol; bump chart to 0.0.42#70
Merged
Conversation
The executor image's wait_for_gms_ready startup gate reads
DATAHUB_GMS_HOST/PORT/PROTOCOL, but the chart only exported
DATAHUB_GMS_URL. With nothing set, the gate fell back to a hardcoded
"datahub-gms:8080" — which fails DNS in any namespace whose GMS
release has a non-empty name prefix (e.g. "dh-datahub-gms").
Export host/port/protocol from values, defaulting to the existing
fallback so current users aren't affected. Deployments with a
prefixed GMS service set the override:
global:
datahub:
gms:
host: dh-datahub-gms
Removed comments explaining host/port/protocol usage.
skrydal
reviewed
May 18, 2026
PR feedback: rather than requiring customers to specify host/port/protocol explicitly, derive them from gms.url so self-contained installs only need the URL. Behavior: - If none of host/port/protocol are set and gms.url is set, parse the URL to fill all three. - If any of host/port/protocol is explicitly set, the trio falls back to literal internal defaults (datahub-gms/8080/http) for unset fields. This prevents partial-override silent corruption where, e.g., setting host: dh-datahub-gms while leaving port unset would otherwise inherit port 443 from an external LB URL. - If gms.url is also unset, all three fall back to the existing literal defaults — preserving backward compatibility with chart 0.0.41.
skrydal
approved these changes
May 18, 2026
Contributor
skrydal
left a comment
There was a problem hiding this comment.
I wish we had a way to test this
Member
Author
|
I did some testing via scripts with different urls |
rajatoss
added a commit
that referenced
this pull request
May 19, 2026
PR #70 (chart 0.0.42) introduced URL-parsing that always exported DATAHUB_GMS_HOST / PORT / PROTOCOL derived from gms.url. This broke the DataHub Python SDK: `_get_config_from_env()` in datahub.cli.config_utils short-circuits on `if port is not None` and rebuilds the URL from HOST:PORT, dropping any path component. For LB-routed URLs like `https://dev07.acryl.io/gms`, the reconstructed URL became `https://dev07.acryl.io:443`, which hit the frontend and failed ingestion with "you seem to have connected to the frontend service instead of the GMS endpoint". The HOST/PORT/PROTOCOL env vars were only ever a workaround for an image-side bug (the startup script in pr8856 / v1.1.0rc1 hardcoded "datahub-gms" instead of reading DATAHUB_GMS_URL). The proper fix landed in acryldata/datahub-fork#9666 (now backported to releases/v1.1.0 via #9668) — the startup script now reads DATAHUB_GMS_URL directly. With that image-side fix in place, the chart no longer needs to set HOST/PORT/PROTOCOL at all. This chart change reverts to the 0.0.41 behavior (only emit DATAHUB_GMS_URL) and bumps the version to 0.0.43. Customers who deployed 0.0.42 with implicit URL-parsing get fixed ingestion on upgrade. Customers who had set explicit gms.host/port/protocol in 0.0.42 (unlikely — the chart only published briefly) can use the existing extraEnvs: escape hatch if they need those env vars.
rajatoss
added a commit
that referenced
this pull request
May 19, 2026
PR #70 (chart 0.0.42) introduced URL-parsing that always exported DATAHUB_GMS_HOST / PORT / PROTOCOL derived from gms.url. This broke the DataHub Python SDK: `_get_config_from_env()` in datahub.cli.config_utils short-circuits on `if port is not None` and rebuilds the URL from HOST:PORT, dropping any path component. For LB-routed URLs like `https://dev07.acryl.io/gms`, the reconstructed URL became `https://dev07.acryl.io:443`, which hit the frontend and failed ingestion with "you seem to have connected to the frontend service instead of the GMS endpoint". The HOST/PORT/PROTOCOL env vars were only ever a workaround for an image-side bug (the startup script in pr8856 / v1.1.0rc1 hardcoded "datahub-gms" instead of reading DATAHUB_GMS_URL). The proper fix landed in acryldata/datahub-fork#9666 (now backported to releases/v1.1.0 via #9668) — the startup script now reads DATAHUB_GMS_URL directly. With that image-side fix in place, the chart no longer needs to set HOST/PORT/PROTOCOL at all. This chart change reverts to the 0.0.41 behavior (only emit DATAHUB_GMS_URL) and bumps the version to 0.0.43. Customers who deployed 0.0.42 with implicit URL-parsing get fixed ingestion on upgrade. Customers who had set explicit gms.host/port/protocol in 0.0.42 (unlikely — the chart only published briefly) can use the existing extraEnvs: escape hatch if they need those env vars.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Export
DATAHUB_GMS_HOST/PORT/PROTOCOLenv vars from thedatahub-executor-workerchart so the executor image's startup readiness gate (wait_for_gms_readyinstart.sh) can find GMS in any deployment, with sensible defaults derived fromgms.url.Problem
The executor image's
wait_for_gms_readyshell function (the Wolfi-image replacement fordockerize) readsDATAHUB_GMS_HOST/PORT/PROTOCOL. The chart only exportedDATAHUB_GMS_URL. With nothing else set, the gate fell back to a hardcodedhttp://datahub-gms:8080/healthand timed out after 240s in any namespace where the actual Service is something likedh-datahub-gms.Observed on
f7700a3ff0-dev07(pr8856 image, chart 0.0.41): worker pods crashlooping withWaiting for GMS at http://datahub-gms:8080/health (timeout 240s)...while the actual Service isdh-datahub-gms.Behavior
Precedence per field (
host,port,protocol):datahub-gms/8080/http). The trio is treated as "all-or-nothing" — touching one means the user is describing the internal Service, so other defaults are also internal-friendly. This prevents partial-override silent corruption (e.g., settinghost: dh-datahub-gmsagainst an external LB URL inheritingport: 443).gms.urlis set → derive all three by parsing the URL (sprigurlParse). Self-contained installs (gms.url: http://datahub-gms:8080) need no extra config.datahub-gms/8080/http— matches the script's previous hardcoded fallback, preserving backward compatibility with chart 0.0.41.Verification
Rendered against 16 edge cases (
helm templatelocally), covering:Companion change
acryldata/datahub-fork#9666makesstart.shalso derive these fromDATAHUB_GMS_URLas a fallback — defense in depth so future regressions can't recur. That change has merged toacryl-main; backport toreleases/v1.1.0is on branchcherry-pick/9666-to-releases-v1.1.0.