Add WMATA Smart Data Hub open-source implementation #1
Open
chrisyamas wants to merge 1 commit into
…ata/sdh-open-source/
lauriemerrell
approved these changes
Apr 28, 2026
jlstpaul
approved these changes
May 11, 2026
Publishing WMATA's Smart Data Hub code as the first implementer entry in this
repository, contributed under
agencies/wmata/sdh-open-source/.
What's included
The WMATA team's open-source release of the Smart Data Hub, a transit data
lakehouse that ingests data from GTFS, AVL and APC, fare gate, and open-loop
fare payment sources, transforms it into TIDES-compliant tables, and serves
it for analytics and reporting. The release includes:
- pipelines/: Dagster pipelines for ingestion and orchestration
- warehouse/: dbt project with models, tests, macros, and seeds
- tf/: Terraform/OpenTofu infrastructure-as-code
- docs/: documentation and architecture diagrams
- scripts/: setup and deployment convenience scripts
If you want to see TIDES in action, warehouse/ is the place to look.
That's where the WMATA team's vendor-formatted source data (GTFS, AVL
and APC, fare gate, and open-loop fare payments) gets transformed into
TIDES tables, organized in dbt's layered structure: staging,
intermediate, mart, and metrics layers. Everything else in the
repository supports getting the data into the warehouse and running
the transformations reliably.
About this contribution
The contributed code is a redacted version of a production repository,
prepared by the WMATA team. Vendor names and sensitive details were
replaced with placeholders before the code was provided for publication.
Two additional placeholder substitutions were applied prior to this PR to
cover items surfaced in a pre-publication leakage review (a concrete Azure
Container Registry name in a debug-module example, and a user directory
segment in a pyproject.toml comment); both are noted inline in the files
where they appear.
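
For reviewers who want to repeat a similar check locally, here is a minimal sketch of a scan for the two kinds of items described above. The pattern set and the `find_leaks` helper are hypothetical and written for illustration; they are not the checks used in the actual pre-publication review. The only external fact relied on is that Azure Container Registry login servers use the `azurecr.io` domain.

```python
import re

# Hypothetical patterns (assumptions for illustration, not the review's real checks).
LEAK_PATTERNS = {
    # A concrete registry name in front of the azurecr.io domain.
    # A bracketed placeholder such as <REGISTRY>.azurecr.io does not match.
    "acr_registry": re.compile(r"\b[a-z0-9]+\.azurecr\.io\b", re.IGNORECASE),
    # A user directory segment such as /Users/<name> or /home/<name>.
    # Segments that start with "<" are treated as placeholders and skipped.
    "user_directory": re.compile(r"/(?:Users|home)/[^/\s<]+"),
}


def find_leaks(text):
    """Return (line_number, pattern_name, matched_text) for each hit in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in LEAK_PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((line_number, name, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = (
        'image = "exampleregistry.azurecr.io/app:1"\n'
        "# built from /Users/jdoe/src/app\n"
        'image = "<REGISTRY>.azurecr.io/app:1"\n'
    )
    for hit in find_leaks(sample):
        print(hit)
```

The scan reports only candidates; a person still has to decide whether each hit is sensitive, and patterns like these will miss anything that does not look like a registry hostname or a home directory path.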
Primary contact
Chum Chancharadeth, CChancharadeth@wmata.com, per the contributed
README.md.
Review notes
Reviewers should focus on subfolder structure and README clarity. The
technical content of the implementation itself is the WMATA team's work
to publish as they see fit; substantive review of that content sits with
WMATA, not with the TIDES maintainers of this repository.
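
As an aid for the structure part of that review, the sketch below checks a local checkout for the top-level folders named in this description. The `missing_entries` helper is hypothetical, and the assumption that README.md sits at the root of the contributed folder is mine; adjust both to the actual layout.

```python
from pathlib import Path

# Top-level folders named in this PR description.
EXPECTED_DIRS = ["pipelines", "warehouse", "tf", "docs", "scripts"]
# Assumption: the contributed README.md lives at the root of the contribution.
EXPECTED_FILES = ["README.md"]


def missing_entries(root):
    """Return the expected folders and files that are absent under root."""
    root = Path(root)
    missing = [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


if __name__ == "__main__":
    import sys

    # Pass the path to the contributed folder; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    gaps = missing_entries(target)
    print("missing:", gaps if gaps else "nothing")
```

An empty result only says the folders exist; README clarity still needs a human read.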