CI: add FFM bringup workflow #3596
Add a manual GitHub Actions path to build the FFM image, run the aiter GEMM smoke test, and optionally publish the validated image for bringup testing.
🏷️ CI Guide
Runs automatically on every PR:
Extended tests (opt-in via labels):
Pull request overview
This PR adds a manually triggered GitHub Actions workflow to bring up an FFM-based runtime, build the required Docker images (base + FFM), run an aiter GEMM smoke/unit test inside that runtime, and optionally push the resulting FFM image tags to Docker Hub.
Changes:
- Added a `workflow_dispatch` “FFM Bringup” workflow that resolves an internal Artifactory FFM package, builds images, runs aiter tests, and optionally pushes `rocm/ffm-dev` tags.
- Added Dockerfiles for a minimal Ubuntu 24.04 base image and an FFM image that pulls FFM artifacts and builds Triton/aiter.
- Added helper scripts for FFM environment setup and optional LLVM+Triton build.
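
As a rough sketch of the manual trigger and opt-in push described above (not the PR's actual workflow file): the `push_image` input, the `Push FFM image` step name, and the `rocm/ffm-dev:latest` tag come from this PR, while the job name, runner, local tag `ffm-dev:test`, and secret names are assumptions for illustration.

```yaml
name: FFM Bringup

on:
  workflow_dispatch:
    inputs:
      push_image:
        description: Push the validated image after the smoke test passes
        type: boolean
        default: false   # test runs do not publish unless asked to

jobs:
  bringup:
    runs-on: ubuntu-latest   # placeholder runner
    steps:
      - uses: actions/checkout@v4

      - name: Build FFM image
        run: docker build -f .github/ffm/Dockerfile.ffm -t ffm-dev:test .

      - name: Push FFM image
        if: ${{ inputs.push_image }}
        env:
          DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}   # hypothetical secret names
          DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        run: |
          echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin
          docker tag ffm-dev:test rocm/ffm-dev:latest
          docker push rocm/ffm-dev:latest
```

Because `push_image` is declared as a boolean, the `inputs` context exposes it as a real boolean, so the `if:` condition needs no string comparison.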
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| .github/workflows/ffm-bringup.yaml | Manual bringup workflow to build/test (and optionally push) FFM-based images. |
| .github/ffm/Dockerfile.base | Base Ubuntu image with Python, ROCm SDK deps, and build/test tooling. |
| .github/ffm/Dockerfile.ffm | FFM image build that downloads FFM package and builds Triton + aiter. |
| .github/ffm/ffmlite_env.sh | Runtime environment setup script sourced inside the container. |
| .github/ffm/build_llvm_triton.sh | Helper script to build custom LLVM and then build Triton against it. |
    #!/bin/bash
    pkgroot="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

    export ROCM_PATH=$(pip show torch | grep ^Location: | cut -d' ' -f2-)/_rocm_sdk_core

    - name: Push FFM image
      if: ${{ inputs.push_image }}
      env:

        mkdir -p /opt/rocm && \
        ln -sf ${ROCM_PATH}/lib /opt/rocm/lib

    RUN pip install --no-cache-dir --upgrade hip-python -i https://test.pypi.org/simple/

Run Ruff directly with GitHub annotation output so the pre-check no longer depends on downloading the latest reviewdog release at runtime.
This reverts commit bc5d9a5.
Allow the FFM bringup workflow to run from this PR by adding a path-scoped pull_request trigger and PR-safe defaults for the Jenkins parity test case.
Copy the current aiter checkout into the FFM image instead of cloning ROCm/aiter during docker build, removing the need for the ROCm GitHub SSH key.
Clone AMD-Lightning-Internal LLVM and AMD-Triton over HTTPS with GitHub App installation tokens passed as BuildKit secrets instead of requiring an AMD SSH private key.
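
One way the token-as-secret approach in this commit could look is sketched below. It is a sketch under assumptions, not the PR's code: the use of `actions/create-github-app-token`, the step and secret names, the `llvm_token` secret id, the local image tag, and the `<org>/<repo>` placeholders are all hypothetical.

```yaml
steps:
  - uses: actions/checkout@v4

  # Mint a short-lived installation token for the private repositories.
  - name: Create GitHub App token
    id: llvm_token
    uses: actions/create-github-app-token@v1
    with:
      app-id: ${{ secrets.APP_ID }}             # hypothetical secret names
      private-key: ${{ secrets.APP_PRIVATE_KEY }}
      owner: <org>                              # placeholder, not filled in here

  # Pass the token as a BuildKit secret so it is not stored in an image layer.
  - name: Build FFM image
    env:
      LLVM_TOKEN: ${{ steps.llvm_token.outputs.token }}
    run: |
      DOCKER_BUILDKIT=1 docker build \
        --secret id=llvm_token,env=LLVM_TOKEN \
        -f .github/ffm/Dockerfile.ffm -t ffm-dev:test .

# On the Dockerfile side, the secret would be mounted only for the clone step:
#   RUN --mount=type=secret,id=llvm_token \
#       git clone "https://x-access-token:$(cat /run/secrets/llvm_token)@github.com/<org>/<repo>.git"
```

The secret is available only while the `RUN` instruction that mounts it executes, which is what removes the need to copy an SSH private key into the build.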
Use the aiter-1gpu-runner label as the default FFM bringup runner while keeping the workflow_dispatch input override.
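
The trigger and runner changes in the commits above could be combined roughly as follows. This is a sketch under assumptions: the `aiter-1gpu-runner` label and the file paths come from this PR, but the `runner` input name, the exact path filter, and the job name are hypothetical.

```yaml
on:
  workflow_dispatch:
    inputs:
      runner:               # hypothetical input name
        description: Runner label to use
        type: string
        default: aiter-1gpu-runner
  pull_request:
    paths:                  # assumed filter; only FFM bringup files trigger the run
      - '.github/workflows/ffm-bringup.yaml'
      - '.github/ffm/**'

jobs:
  bringup:
    # inputs.runner is empty on pull_request events, so the fallback label applies
    runs-on: ${{ inputs.runner || 'aiter-1gpu-runner' }}
    steps:
      - uses: actions/checkout@v4
```

Repeating the default in the `runs-on` expression matters because input defaults are applied only for manual dispatches, not for `pull_request` runs.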
Summary
`push_image` so test runs do not overwrite `rocm/ffm-dev:latest` by default.
Test plan
`/dev/kfd` and `/dev/dri`.