Summary
We introduced E2E CI for the Tinker API with the SkyRLTrainBackend in #1616, but it currently makes no assertions on the metric values:

    # TODO: tighten thresholds after 3-5 nightly runs (5% allowance from min observed),
    # matching the convention in gsm8k_colocate.sh.
    REWARD_MIN_VALUE=0.0

We need to add assertions on the expected metric values, based on the results of 5 nightly runs.
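
One way the "5% allowance from min observed" convention in the TODO could be turned into an assertion is sketched below. This is only an illustration: the helper names `threshold_from_runs` and `assert_metric_at_least` are hypothetical and are not taken from gsm8k_tinker.sh or gsm8k_colocate.sh, and the actual scripts may extract and check metrics differently.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the SkyRL repo) illustrating the
# "5% allowance from min observed" convention described in the TODO.

# Print a threshold equal to 95% of the smallest value passed in.
# Arguments: the metric values observed across the nightly runs.
threshold_from_runs() {
  printf '%s\n' "$@" | awk '
    NR == 1 || $1 + 0 < min { min = $1 + 0 }
    END { printf "%.4f\n", min * 0.95 }'
}

# Return non-zero (and report on stderr) if a metric is below its threshold.
# Arguments: metric name, observed value, minimum allowed value.
assert_metric_at_least() {
  local name="$1" value="$2" min_value="$3"
  if awk -v v="$value" -v m="$min_value" 'BEGIN { exit !(v + 0 >= m + 0) }'; then
    echo "OK: ${name}=${value} >= ${min_value}"
  else
    echo "FAIL: ${name}=${value} < ${min_value}" >&2
    return 1
  fi
}
```

In a CI script, `REWARD_MIN_VALUE` would be set to the output of `threshold_from_runs` applied to the rewards actually observed in the nightly runs, and the script would exit non-zero when `assert_metric_at_least` fails. No observed values are assumed here.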
The quoted TODO and REWARD_MIN_VALUE lines are from SkyRL/tests/train/gpu_e2e_test/gsm8k_tinker.sh, lines 11 to 13 at 29f0fdf.