fix(worker): revert URL-parsing from 0.0.42; bump to 0.0.43#71
Merged
Conversation
PR #70 (chart 0.0.42) introduced URL-parsing that always exported DATAHUB_GMS_HOST / PORT / PROTOCOL derived from gms.url. This broke the DataHub Python SDK: `_get_config_from_env()` in datahub.cli.config_utils short-circuits on `if port is not None` and rebuilds the URL from HOST:PORT, dropping any path component. For LB-routed URLs like `https://dev07.acryl.io/gms`, the reconstructed URL became `https://dev07.acryl.io:443`, which hit the frontend and failed ingestion with "you seem to have connected to the frontend service instead of the GMS endpoint". The HOST/PORT/PROTOCOL env vars were only ever a workaround for an image-side bug (the startup script in pr8856 / v1.1.0rc1 hardcoded "datahub-gms" instead of reading DATAHUB_GMS_URL). The proper fix landed in acryldata/datahub-fork#9666 (now backported to releases/v1.1.0 via #9668) — the startup script now reads DATAHUB_GMS_URL directly. With that image-side fix in place, the chart no longer needs to set HOST/PORT/PROTOCOL at all. This chart change reverts to the 0.0.41 behavior (only emit DATAHUB_GMS_URL) and bumps the version to 0.0.43. Customers who deployed 0.0.42 with implicit URL-parsing get fixed ingestion on upgrade. Customers who had set explicit gms.host/port/protocol in 0.0.42 (unlikely — the chart only published briefly) can use the existing extraEnvs: escape hatch if they need those env vars.
6c1211f to
07805d4
Compare
skrydal
approved these changes
May 19, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Revert the URL-parsing logic introduced in #70 (chart 0.0.42). Chart 0.0.43 is functionally identical to 0.0.41 — only emits
DATAHUB_GMS_URL. The proper fix for the original problem lives in the image, not the chart.What 0.0.42 broke
PR #70 made the chart auto-derive and unconditionally export
DATAHUB_GMS_HOST/PORT/PROTOCOLfromgms.url. This broke the DataHub Python SDK in deployments with a path-prefixed URL.The SDK's
_get_config_from_env()has this branch:For
gms.url: https://dev07.acryl.io/gms, the chart exportedPORT=443, triggering this branch. The SDK reconstructedhttps://dev07.acryl.io:443— losing the/gmspath — which hit the frontend, not GMS.Why we don't need the env vars at all
The original motivation for setting
DATAHUB_GMS_HOST/PORT/PROTOCOLin chart 0.0.42 was to feed the image'swait_for_gms_readystartup readiness gate, which (in pr8856 / v1.1.0rc1) hardcodeddatahub-gmsas the host and failed DNS in namespaces with a Helm release prefix (e.g.dh-datahub-gms).That bug has now been fixed at the image level:
start.shnow readsDATAHUB_GMS_URLdirectly. (Merged toacryl-main.)releases/v1.1.0. (Merged.)Once the worker image carries that script change, both consumers in the pod (the startup script and the Python SDK) read
DATAHUB_GMS_URLdirectly and need no other env vars. Exposinghost/port/protocolin the chart adds surface area without functional benefit — and, as we just discovered, can actively break things.What this PR does
templates/workload.yamlto the 0.0.41 env block (onlyDATAHUB_GMS_URLis emitted).0.0.42→0.0.43.Verified
helm templatewith default values → onlyDATAHUB_GMS_URLexported.helm templatewithgms.host=...override → still onlyDATAHUB_GMS_URLexported (the override is ignored).