Open
Labels: cml-runner (Subcommand), discussion (Waiting for team decision), epic (Collection of sub-issues)
Description
Several issues could be mitigated by better management of the GHA (GitHub Actions) runner process.
This would not necessarily solve them outright, but it should help with the following:
- Losing network for a while can end up with the runner running forever (on GitHub at least) #1014 @DavidGOrtega?
- "No space left on device" creates a hung EC2 instance #1006 @dacbd
- OOM reaping / crashes @dacbd
- Oddities in runner log parsing/detecting events #1037
For cloud runners, launch the GHA client as a systemd unit. This gives better control over the process and separates it from the cml
process. cml can then hook into the unit's logs to monitor the runner and trigger shutdown events.
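
To make the idea concrete, here is a minimal sketch of both halves: rendering a systemd unit for the GHA client and classifying its log lines into shutdown triggers. It is only an illustration under assumptions, not how cml does or will do it. The unit name `gha-runner.service`, the runner directory, the `ubuntu` user, the event names, and the log patterns are all hypothetical; `run.sh` is the runner's usual entry script, and `journalctl -u <unit> -f -o cat` is used to follow the unit's output.

```python
import re
import subprocess
from typing import Iterable, Iterator, Optional

# Hypothetical unit name; adjust to the actual deployment.
UNIT_NAME = "gha-runner.service"

# Illustrative patterns only (assumptions); real runner/kernel messages may differ.
SHUTDOWN_PATTERNS = {
    "disk_full": re.compile(r"No space left on device"),
    "oom": re.compile(r"Out of memory|oom-kill", re.IGNORECASE),
}


def render_unit(runner_dir: str, user: str = "ubuntu", memory_max: str = "") -> str:
    """Render a unit file that runs the GHA client in its own control group."""
    lines = [
        "[Unit]",
        "Description=GitHub Actions runner (managed separately from cml)",
        "After=network-online.target",
        "Wants=network-online.target",
        "",
        "[Service]",
        f"User={user}",
        f"WorkingDirectory={runner_dir}",
        f"ExecStart={runner_dir}/run.sh",
        "Restart=no",
        # Stop every process in the unit's cgroup when the unit is stopped.
        "KillMode=control-group",
    ]
    if memory_max:
        # Optional memory cap so the kernel reaps the runner, not the whole host.
        lines.append(f"MemoryMax={memory_max}")
    return "\n".join(lines) + "\n"


def classify(line: str) -> Optional[str]:
    """Return the name of the shutdown event a log line indicates, if any."""
    for event, pattern in SHUTDOWN_PATTERNS.items():
        if pattern.search(line):
            return event
    return None


def first_shutdown_event(lines: Iterable[str]) -> Optional[str]:
    """Scan log lines in order and return the first shutdown event found."""
    for line in lines:
        event = classify(line)
        if event is not None:
            return event
    return None


def follow_journal(unit: str = UNIT_NAME) -> Iterator[str]:
    """Yield log lines for the unit as they arrive (requires systemd)."""
    proc = subprocess.Popen(
        ["journalctl", "-u", unit, "-f", "-o", "cat"],
        stdout=subprocess.PIPE,
        text=True,
    )
    assert proc.stdout is not None
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        proc.terminate()
```

On an instance, something like `first_shutdown_event(follow_journal())` would block until a matching line appears, at which point cml could stop the unit and tear down the instance. Keeping the classifier separate from the journal reader means the patterns can be tested without systemd.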