Ubuntu 24.04 runc issues with AppArmor
Mark Elvers
2 min read

Categories

  • Ubuntu,runc,AppArmor

Tags

  • tunbury.org

Patrick reported issues with OCaml-CI running tests on ocaml-ppx.

Fedora seems to be having some issues: https://ocaml.ci.dev/github/ocaml-ppx/ppxlib/commit/0d6886f5bcf22287a66511817e969965c888d2b7/variant/fedora-40-5.3_opam-2.3

sudo: PAM account management error: Authentication service cannot retrieve authentication info
sudo: a password is required
"/usr/bin/env" "bash" "-c" "sudo dnf install -y findutils" failed with exit status 1
2025-05-12 08:55.09: Job failed: Failed: Build failed

I took this problem at face value and replied that the issue would be related to Fedora 40, which is EOL. I created PR#1011 for OCaml-CI and deployed it. However, the problem didn’t go away. We were now testing Fedora 42, but jobs were still failing. I created a minimal obuilder job specification:

((from ocaml/opam:fedora-42-ocaml-4.14@sha256:475a852401de7d578efec2afce4384d87b505f5bc610dc56f6bde3b87ebb7664)
(user (uid 1000) (gid 1000))
(run (shell "sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam")))

Submitting the job to the cluster showed it worked on all machines except for bremusa.

$ ocluster-client submit-obuilder --connect mtelvers.cap  --pool linux-x86_64 --local-file fedora-42.spec
Tailing log:
Building on bremusa.ocamllabs.io

(from ocaml/opam:fedora-42-ocaml-4.14@sha256:475a852401de7d578efec2afce4384d87b505f5bc610dc56f6bde3b87ebb7664)
2025-05-12 16:55.42 ---> using "aefb7551cd0db7b5ebec7e244d5637aef02ab3f94c732650de7ad183465adaa0" from cache

/: (user (uid 1000) (gid 1000))

/: (run (shell "sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam"))
sudo: PAM account management error: Authentication service cannot retrieve authentication info
sudo: a password is required
"/usr/bin/env" "bash" "-c" "sudo ln -f /usr/bin/opam-2.3 /usr/bin/opam" failed with exit status 1
Failed: Build failed.

Changing the image to opam:debian-12-ocaml-4.14 worked, so the issue only affects Fedora images and only on bremusa. I was able to reproduce the issue directly using runc.

# runc run test
sudo: PAM account management error: Authentication service cannot retrieve authentication info
sudo: a password is required

Running ls -l /etc/shadow in the container showed that the permissions on /etc/shadow are 000. If these are changed to 640, then sudo works correctly. Some distributions set the permissions on /etc/shadow to 000 so that access is limited to processes holding the CAP_DAC_OVERRIDE capability.
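The relationship between the file mode and the capability can be checked from inside the container. The following is a minimal sketch rather than part of the original investigation: has_cap_dac_override and report_shadow_access are hypothetical helper names, and the check relies on CAP_DAC_OVERRIDE being capability number 1 in linux/capability.h and on GNU stat being available in the image.

```shell
# Hypothetical helper: succeed if the hex capability mask in $1
# has bit 1 (CAP_DAC_OVERRIDE) set.
has_cap_dac_override() {
    cap_mask=$(( 0x$1 ))
    [ $(( (cap_mask >> 1) & 1 )) -eq 1 ]
}

# Hypothetical helper: print the mode of /etc/shadow and report whether
# the current process has CAP_DAC_OVERRIDE in its effective set.
report_shadow_access() {
    stat -c '%a %U:%G %n' /etc/shadow
    cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
    if has_cap_dac_override "$cap_eff"; then
        echo "CAP_DAC_OVERRIDE is in the effective set"
    else
        echo "CAP_DAC_OVERRIDE is not in the effective set"
    fi
}
```

Calling report_shadow_access as root in the container shows both pieces of information together. A capability that appears in the effective set can still be denied by a confinement layer such as AppArmor, so a positive result here does not by itself prove that a read of a mode 000 file will succeed.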

Having seen a permission issue with runc and libseccomp compatibility before, I went down a rabbit hole investigating that. Ultimately, I compiled runc without libseccomp support, make BUILDTAGS="", and this still had the same issue.

All the machines in the linux-x86_64 pool are running Ubuntu 22.04 except for bremusa. I configured a spare machine with Ubuntu 24.04 and tested. The problem appeared on this machine as well.

Is there a change in Ubuntu 24.04?

I temporarily disabled AppArmor by editing /etc/default/grub to add apparmor=0 to GRUB_CMDLINE_LINUX, running update-grub, and rebooting. Disabling AppArmor entirely like this can create security vulnerabilities, so this isn’t recommended, but it did clear the issue.
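After a reboot like this, it is worth confirming that the kernel parameter actually took effect before drawing conclusions. The sketch below assumes the kernel exposes the AppArmor state as Y or N in /sys/module/apparmor/parameters/enabled; apparmor_state is a hypothetical helper name, and the optional file argument exists only so the function can be exercised without a real system.

```shell
# For reference, the edited line in /etc/default/grub would look like:
#   GRUB_CMDLINE_LINUX="apparmor=0"

# Hypothetical helper: print "enabled", "disabled", or "absent"
# depending on the content of the AppArmor "enabled" parameter file.
apparmor_state() {
    state_file=${1:-/sys/module/apparmor/parameters/enabled}
    if [ ! -r "$state_file" ]; then
        echo absent
    elif [ "$(cat "$state_file")" = "Y" ]; then
        echo enabled
    else
        echo disabled
    fi
}
```

Running apparmor_state with no arguments reads the real parameter file; grep apparmor /proc/cmdline is a complementary check that the boot parameter was passed at all.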

After enabling AppArmor again, I disabled the configuration for runc by running:

ln -s /etc/apparmor.d/runc /etc/apparmor.d/disable/
apparmor_parser -R /etc/apparmor.d/runc

This didn’t help. In fact, it made things worse, as runc could no longer run at all. I restored the configuration and added capability dac_override to it, but this didn’t help either.
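When changes to one profile make no difference, the kernel log can indicate which profile is actually issuing denials. The following is a sketch under assumptions, not something taken from the original investigation: it assumes denials are logged in the usual audit format containing apparmor="DENIED" and a profile="..." field, and denied_profiles is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the
# distinct AppArmor profile names that appear in DENIED records.
denied_profiles() {
    grep 'apparmor="DENIED"' |
        sed -n 's/.*[[:space:]]profile="\([^"]*\)".*/\1/p' |
        sort -u
}
```

It can be fed from either dmesg or journalctl -k on the host, for example dmesg | denied_profiles, immediately after reproducing the failure. An empty result only means no matching records were found in the log that was read, not that AppArmor is uninvolved.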

Looking through the profiles with grep shadow -r /etc/apparmor.d, I noticed unix-chkpwd, which could be the source of the issue. I disabled this profile and the issue was resolved.

ln -s /etc/apparmor.d/unix-chkpwd /etc/apparmor.d/disable/
apparmor_parser -R /etc/apparmor.d/unix-chkpwd
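To confirm that a profile has really been unloaded after commands like these, the list of loaded profiles can be inspected. This is a minimal sketch: profile_mode is a hypothetical helper name, and it assumes the list at /sys/kernel/security/apparmor/profiles uses the "name (mode)" format, one profile per line, and is readable (normally this requires root).

```shell
# Hypothetical helper: print the mode ("enforce", "complain", ...) of the
# profile named in $1, or "not loaded" if it is absent from the list.
# $2 optionally overrides the path of the profile list.
profile_mode() {
    profile_name=$1
    profile_list=${2:-/sys/kernel/security/apparmor/profiles}
    found_mode=$(awk -v n="$profile_name" \
        '$1 == n { gsub(/[()]/, "", $2); print $2; exit }' "$profile_list")
    echo "${found_mode:-not loaded}"
}
```

Running profile_mode unix-chkpwd as root before and after the apparmor_parser -R step should show the change. The aa-status command reports similar information where it is installed.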

Armed with the answer, it’s pretty easy to find other people with related issues:

  • https://github.com/docker/build-push-action/issues/1302
  • https://github.com/moby/moby/issues/48734