At around 6am today the last set of CI tests on a pull request finished and it auto-merged to master. At 7am, I rebased the next pull request, pushed it, triggered its CI tests, and set it to auto-merge. It is now 5pm and there are still 3 jobs running and 9 queued (not even started). 5 failed because of transient Conda HTTP issues and will need to be manually restarted, which cannot be done until the remaining tests complete. At this rate it's going to take two full days to merge a single pull request. And then the next one will need to be rebased.
This seems impractical, or at least undesirable.
A lot of time is spent waiting for the Mac runners. And even if they had many more runners to reduce wall-clock time, it would still be wasteful of resources. Yesterday alone we used 3,569 minutes of Linux runner time and 6,986 minutes of macOS runner time (which would have cost $454.56 were they not offering us a 100% discount for being open source).
I wonder if we could slim down the full matrix of Python and OS versions. E.g., assume that if the tests pass WITH Julia and RMS then they will also pass without them? Or assume that as long as it builds on an OS it will give the same results there? Or assume that we don't need to test every version of Python on both Intel and M-series Macs?
Perhaps at least for the tests that are required for a PR to merge.
We could leave a more complete set of tests for the nightly builds, and/or the pushes to master?
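For concreteness, a slimmed-down required matrix could look something like the GitHub Actions sketch below: the full Python matrix runs on Linux only, with one representative Python version on each macOS architecture. The job name, runner labels, and Python versions here are illustrative assumptions, not our actual workflow.

```yaml
# Hypothetical sketch of a reduced PR-gating matrix.
jobs:
  test:
    strategy:
      matrix:
        # Full Python coverage on the cheap, plentiful Linux runners.
        os: [ubuntu-latest]
        python-version: ["3.9", "3.10", "3.11", "3.12"]
        include:
          # One representative Python version per Mac architecture.
          - os: macos-13       # Intel
            python-version: "3.12"
          - os: macos-latest   # M-series (arm64)
            python-version: "3.12"
    runs-on: ${{ matrix.os }}
```

The complete OS x Python cross-product could then live in a separate workflow triggered by `on: schedule` (nightly) or pushes to the default branch, rather than gating every PR.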
We can usually tell that something is failing quite quickly. But waiting for ALL the currently required tests to pass is slowing things down.
Thoughts?