Skip to content

fix(worker): revert URL-parsing from 0.0.42; bump to 0.0.43#71

Merged
rajatoss merged 1 commit into
mainfrom
fix/conditional-gms-env-vars
May 19, 2026
Merged

fix(worker): revert URL-parsing from 0.0.42; bump to 0.0.43#71
rajatoss merged 1 commit into
mainfrom
fix/conditional-gms-env-vars

Conversation

@rajatoss
Copy link
Copy Markdown
Member

@rajatoss rajatoss commented May 19, 2026

Summary

Revert the URL-parsing logic introduced in #70 (chart 0.0.42). Chart 0.0.43 is functionally identical to 0.0.41 — only emits DATAHUB_GMS_URL. The proper fix for the original problem lives in the image, not the chart.

What 0.0.42 broke

PR #70 made the chart auto-derive and unconditionally export DATAHUB_GMS_HOST / PORT / PROTOCOL from gms.url. This broke the DataHub Python SDK in deployments with a path-prefixed URL.

The SDK's _get_config_from_env() has this branch:

if port is not None:
    url = f"{protocol}://{host}:{port}"   # <-- URL discarded, path lost
    return url, token

For gms.url: https://dev07.acryl.io/gms, the chart exported PORT=443, triggering this branch. The SDK reconstructed https://dev07.acryl.io:443 — losing the /gms path — which hit the frontend, not GMS.

Why we don't need the env vars at all

The original motivation for setting DATAHUB_GMS_HOST/PORT/PROTOCOL in chart 0.0.42 was to feed the image's wait_for_gms_ready startup readiness gate, which (in pr8856 / v1.1.0rc1) hardcoded datahub-gms as the host and failed DNS in namespaces with a Helm release prefix (e.g. dh-datahub-gms).

That bug has now been fixed at the image level:

Once the worker image carries that script change, both consumers in the pod (the startup script and the Python SDK) read DATAHUB_GMS_URL directly and need no other env vars. Exposing host/port/protocol in the chart adds surface area without functional benefit — and, as we just discovered, can actively break things.

What this PR does

  • Reverts templates/workload.yaml to the 0.0.41 env block (only DATAHUB_GMS_URL is emitted).
  • Bumps chart version 0.0.420.0.43.

Verified

helm template with default values → only DATAHUB_GMS_URL exported.
helm template with gms.host=... override → still only DATAHUB_GMS_URL exported (the override is ignored).

@rajatoss rajatoss self-assigned this May 19, 2026
PR #70 (chart 0.0.42) introduced URL-parsing that always exported
DATAHUB_GMS_HOST / PORT / PROTOCOL derived from gms.url. This broke
the DataHub Python SDK: `_get_config_from_env()` in
datahub.cli.config_utils short-circuits on `if port is not None` and
rebuilds the URL from HOST:PORT, dropping any path component. For
LB-routed URLs like `https://dev07.acryl.io/gms`, the reconstructed
URL became `https://dev07.acryl.io:443`, which hit the frontend and
failed ingestion with "you seem to have connected to the frontend
service instead of the GMS endpoint".

The HOST/PORT/PROTOCOL env vars were only ever a workaround for an
image-side bug (the startup script in pr8856 / v1.1.0rc1 hardcoded
"datahub-gms" instead of reading DATAHUB_GMS_URL). The proper fix
landed in acryldata/datahub-fork#9666 (now backported to
releases/v1.1.0 via #9668) — the startup script now reads
DATAHUB_GMS_URL directly. With that image-side fix in place, the
chart no longer needs to set HOST/PORT/PROTOCOL at all.

This chart change reverts to the 0.0.41 behavior (only emit
DATAHUB_GMS_URL) and bumps the version to 0.0.43. Customers who
deployed 0.0.42 with implicit URL-parsing get fixed ingestion on
upgrade. Customers who had set explicit gms.host/port/protocol in
0.0.42 (unlikely — the chart only published briefly) can use the
existing extraEnvs: escape hatch if they need those env vars.
@rajatoss rajatoss force-pushed the fix/conditional-gms-env-vars branch from 6c1211f to 07805d4 Compare May 19, 2026 07:49
@rajatoss rajatoss changed the title fix(worker): only export GMS host/port/protocol when explicitly set; bump to 0.0.43 chart: revert URL-parsing from 0.0.42; bump to 0.0.43 May 19, 2026
@rajatoss rajatoss changed the title chart: revert URL-parsing from 0.0.42; bump to 0.0.43 fix(worker): revert URL-parsing from 0.0.42; bump to 0.0.43 May 19, 2026
@rajatoss rajatoss requested a review from skrydal May 19, 2026 07:55
@rajatoss rajatoss merged commit 57f85b0 into main May 19, 2026
1 check passed
@rajatoss rajatoss deleted the fix/conditional-gms-env-vars branch May 19, 2026 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants