Skip to content

improve startup reliability#1

Open
spoonincode wants to merge 1 commit intomainfrom
startup_reliablility
Open

improve startup reliability#1
spoonincode wants to merge 1 commit intomainfrom
startup_reliablility

Conversation

@spoonincode
Copy link
Contributor

  • systemd units for network & runner really should be "oneshot" (I'm not a little uncertain how it worked previously with "simple")
  • wait for network interface to indicate "up" before using the network
  • simplify selection of runner release to use (just use "latest")
  • don't pipe the runner download directly to tar: hopefully it can be retried more reliably that way

* systemd units for network & runner really should be "oneshot"
* wait for network interface to indicate "up" before using the network
* simplify selection of runner release to use (just use "latest")
* don't pipe the runner download directly to tar: hopefully it can be retried more reliably that way
Comment on lines +10 to +12
RUNNER_RELEASE_URL=$(${CURL} https://api.github.com/repos/actions/runner/releases/latest | \
jq -e -r '.assets[] | if .name | test("actions-runner-linux-x64-[0-9.]+.tar") then .browser_download_url else empty end')
${CURL} -O "${RUNNER_RELEASE_URL}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The jq -e flag is a new trick for me, nice.

Should we quote these variable assignments to protect against BASH code injection?

RUNNER_RELEASE_URL=$(...)

...would become:

RUNNER_RELEASE_URL="$(...)"

As a point of interest, I believe the jq query if-then snippet...

if .name | test("actions-runner-linux-x64-[0-9.]+.tar") then .browser_download_url else empty end

...could be simplified to this.

.name | test("actions-runner-linux-x64-[0-9.]+.tar") | .browser_download_url // empty

I haven't tried it, not certain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we quote these variable assignments to protect against BASH code injection

Yeah, probably prudent to do so just for best practices purposes.

Not sure if I exactly understand your suggestion. It's easy enough to try -- no token needed or anything. If you find a more concise way I'd certainly be interested.

$ curl -s https://api.github.com/repos/actions/runner/releases/latest | jq -e -r '.assets[] | .name | test("actions-runner-linux-x64-[0-9.]+.tar") | .browser_download_url'
jq: error (at <stdin>:1242): Cannot index boolean with string "browser_download_url"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind on the jq thing, I see the if...else block behaves slightly different than I had thought when I wrote that comment. My mistake.

Comment on lines +22 to +27
tries=100
until eval '(( $(< /sys/class/net/eth0/carrier) ))'; do
(( tries-- > 0 )) || exit 1
sleep 0.1
done

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop only verifies that an Ethernet cable is connected...not that the network interface is actually functional. Here are two suggestions from ChatGPT to check that a network interface is functional in a BASH script, both of which I tried in my terminal.

The first starts a service dependent on network-online.target from the NetworkManager-wait-online.service. It waits until this target is reached, then returns.

systemd-run --wait --quiet --service-type=oneshot --property="After=network-online.target" /bin/true

This second example is simpler, it reads a file like the one that you have.

cat /sys/class/net/eth0/operstate

We would want this command to return up.

From ChatGPT:

/sys/class/net/eth0/operstate is a sysfs file that shows the operational state of the network interface eth0. The possible values for this file are unknown, notpresent, down, lowerlayerdown, testing, dormant, up. The operstate file reflects the current state of the network interface based on various factors, including the physical state of the interface, the state of the underlying protocol stack, and the administrative state of the interface.

/sys/class/net/eth0/carrier is another sysfs file that indicates the physical state of the network interface eth0. This file shows whether the link is up or down and is used to determine if the network cable is connected to the interface. The possible values for this file are 0 and 1, where 0 indicates that the link is down and 1 indicates that the link is up.

In summary, operstate reflects the overall state of the network interface, while carrier indicates the physical state of the link.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants