nspawn and the container child use eventfd to wait and notify each other that they are ready so the container setup can be completed.
However, in its current form the wait/notify mechanism ignores errors, in particular those that affect the child (container).
On errors the child will jump to the "child_fail" label and terminate with _exit(EXIT_FAILURE) without notifying the parent. Since the eventfd is created without the "EFD_NONBLOCK" flag, this leaves the parent blocking on the eventfd_read() call. The container can also be killed at any moment before execv() and the parent will not receive notifications.
We can fix this cheaply by using the new high-level eventfd API and by handling SIGCHLD:
- Keep the cheap eventfd and EFD_NONBLOCK flag.
- Introduce eventfd states for parent and child to sync. The child notifies the parent with EVENTFD_CHILD_SUCCEEDED on success, or with EVENTFD_CHILD_FAILED on failure just before calling _exit(). This prevents the parent from waiting on an event that will never come.
- To cover the case where the child is killed before execv() or before notifying the parent, we install a NOP handler for SIGCHLD, which interrupts blocking calls with EINTR. This gives the parent a chance to call wait() and terminate in main().
- If there are no errors, the parent blocks SIGCHLD, restores the default handler and notifies the child, which then calls execv(). The parent then passes control to process_pty() to do its magic.
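
The scheme above can be sketched as a small standalone C program. This is an illustration under stated assumptions, not the code from the patch: the numeric state values, EVENTFD_CHILD_DIED, the CHILD_* test behaviours and the helpers notify() and run_handshake() are all hypothetical. The sketch also keeps SIGCHLD blocked and sleeps in ppoll() rather than in a plain blocking read, so that a SIGCHLD arriving just before the wait cannot be missed; the patch itself may handle this differently.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <sys/wait.h>
#include <unistd.h>

/* State names follow the commit message; the numeric values are assumptions. */
enum {
        EVENTFD_CHILD_SUCCEEDED = 1,
        EVENTFD_CHILD_FAILED    = 2,
        EVENTFD_CHILD_DIED      = 3, /* hypothetical: child vanished without notifying */
};

/* Hypothetical child behaviours, used only to exercise the parent side. */
enum { CHILD_REPORTS_SUCCESS, CHILD_REPORTS_FAILURE, CHILD_DIES_SILENTLY };

static void nop_handler(int sig) { (void) sig; }

static int notify(int fd, eventfd_t state) {
        assert(state != 0); /* adding 0 to the counter would never wake a reader */
        return eventfd_write(fd, state);
}

/* Returns the state seen by the parent, or a negative errno on failure. */
int run_handshake(int behaviour) {
        struct sigaction sa = { .sa_handler = nop_handler }, old_sa;
        sigset_t block, old_mask, wait_mask;
        eventfd_t state;
        int fd, r, reaped = 0;
        pid_t pid;

        fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
        if (fd < 0)
                return -errno;

        /* Keep SIGCHLD blocked except while sleeping in ppoll(). */
        sigemptyset(&block);
        sigaddset(&block, SIGCHLD);
        sigprocmask(SIG_BLOCK, &block, &old_mask);
        wait_mask = old_mask;
        sigdelset(&wait_mask, SIGCHLD);

        sigemptyset(&sa.sa_mask);
        sigaction(SIGCHLD, &sa, &old_sa); /* no SA_RESTART, so ppoll() fails with EINTR */

        pid = fork();
        if (pid < 0) {
                r = -errno;
                goto finish;
        }
        if (pid == 0) {
                if (behaviour == CHILD_DIES_SILENTLY)
                        _exit(EXIT_FAILURE); /* stands in for being killed early */
                if (behaviour == CHILD_REPORTS_FAILURE) {
                        notify(fd, EVENTFD_CHILD_FAILED); /* notify before _exit() */
                        _exit(EXIT_FAILURE);
                }
                notify(fd, EVENTFD_CHILD_SUCCEEDED);
                _exit(EXIT_SUCCESS); /* real code would wait for the parent, then execv() */
        }

        for (;;) {
                struct pollfd p = { .fd = fd, .events = POLLIN };

                if (eventfd_read(fd, &state) == 0) {
                        r = (int) state;
                        break;
                }
                if (errno != EAGAIN) { r = -errno; break; }
                /* Child is gone and the counter is still empty: nothing will come. */
                if (reaped) { r = EVENTFD_CHILD_DIED; break; }
                if (ppoll(&p, 1, NULL, &wait_mask) < 0) {
                        if (errno != EINTR) { r = -errno; break; }
                        /* Interrupted: reap if the child exited, then re-check the fd. */
                        if (waitpid(pid, NULL, WNOHANG) == pid)
                                reaped = 1;
                }
        }
        if (!reaped)
                waitpid(pid, NULL, 0);

finish:
        sigaction(SIGCHLD, &old_sa, NULL);
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        close(fd);
        return r;
}
```

In this sketch, run_handshake(CHILD_DIES_SILENTLY) returns EVENTFD_CHILD_DIED instead of hanging, because the NOP handler lets SIGCHLD interrupt the wait and the parent re-checks the eventfd once after reaping the child.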
This was exposed in part by: https://bugs.freedesktop.org/show_bug.cgi?id=76193
e866af3 nspawn: make nspawn robust to container failure
Makefile.am | 4 +-
src/nspawn/nspawn.c | 92 +++++++++++++++++-------
src/shared/eventfd-util.c | 169 +++++++++++++++++++++++++++++++++++++++++++++
src/shared/eventfd-util.h | 43 ++++++++++++
4 files changed, 282 insertions(+), 26 deletions(-)