
Segmentation Fault when running 'clr-boot-manager update'
Closed, Resolved (Public)

Assigned To
Authored By
zJelly
Oct 14 2017, 7:19 PM
Referenced Files
F257211: Screenshot from 2017-10-18 14-45-12.png
Oct 18 2017, 1:46 PM
F257215: Screenshot from 2017-10-18 14-46-00.png
Oct 18 2017, 1:46 PM

Description

Hello, since the repo sync earlier today (after the first sync, before the second one) I am no longer able to run clr-boot-manager update:

root@solus # clr-boot-manager update
Segmentation fault (core dumped)

journalctl reports the problem as follows:

root@solus # journalctl -r --since "1 minute ago"
-- Logs begin at Mon 2017-09-04 20:50:15 BST, end at Sat 2017-10-14 20:11:20 BST. --
Oct 14 20:11:20 solus-feb2017 kernel: audit: type=1701 audit(1508008280.375:88): auid=1000 uid=0 gid=0 ses=2 pid=9320 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11 res=1
Oct 14 20:11:20 solus-feb2017 kernel: clr-boot-manage[9320]: segfault at 10 ip 00007f505de2d1b1 sp 00007ffd3800e3d0 error 4 in libblkid.so.1.1.0[7f505de15000+44000]
Oct 14 20:11:20 solus-feb2017 audit[9320]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=2 pid=9320 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11 res=1

Apparently the libblkid.so file it refers to is part of the util-linux package, which was also updated in the repo sync:

* util-linux is upgraded from 2.28-17-1-x86_64 to 2.30.2-18-1-x86_64.
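For readers unfamiliar with the kernel's segfault line, here is a minimal sketch that pulls out the library name, the faulting address, and the meaning of the error code. The function name decode_segfault is made up for illustration; the bit meanings are the x86 page-fault error code bits. A fault address as low as 0x10 is typical of reading a struct field through a NULL pointer, though the log line alone does not prove that.

```shell
#!/bin/sh
# Illustrative helper, not part of clr-boot-manager or util-linux.
# Decodes one kernel "segfault at ..." line in the format shown above.
decode_segfault() (
    # Extract "<fault address> <error code> <library>"; empty if no match.
    fields=$(printf '%s\n' "$1" | sed -n \
        's/.*segfault at \([0-9a-f]*\) ip [0-9a-f]* sp [0-9a-f]* error \([0-9]*\) in \([^ []*\)\[.*/\1 \2 \3/p')
    [ -n "$fields" ] || { echo "not a kernel segfault line" >&2; exit 1; }
    set -- $fields
    addr=$1 err=$2 lib=$3

    # x86 page-fault error code: bit 0 = protection violation (else page not
    # present), bit 1 = write (else read), bit 2 = user mode (else kernel mode).
    if [ $(( err & 1 )) -ne 0 ]; then cause='protection violation'; else cause='page not present'; fi
    if [ $(( err & 2 )) -ne 0 ]; then access='write'; else access='read'; fi
    if [ $(( err & 4 )) -ne 0 ]; then mode='user mode'; else mode='kernel mode'; fi

    printf 'library: %s\n' "$lib"
    printf 'fault address: 0x%x\n' "$(( 0x$addr ))"
    printf 'error %d: %s %s, %s\n' "$err" "$mode" "$access" "$cause"
)

decode_segfault 'clr-boot-manage[9320]: segfault at 10 ip 00007f505de2d1b1 sp 00007ffd3800e3d0 error 4 in libblkid.so.1.1.0[7f505de15000+44000]'
```

For the line from this report, the sketch describes error 4 as a user-mode read of a page that is not present, at address 0x10, inside libblkid.so.1.1.0.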

Revisions and Commits

Event Timeline

A sneaky workaround which I wouldn't rely on forever, but it works for now:

Download the old 2.28 version of util-linux from the shannon repo: https://packages.solus-project.com/shannon/u/util-linux/util-linux-2.28-17-1-x86_64.eopkg
Open it in archive manager, browse to install.tar.xz->usr->lib64
Extract libblkid.so.1.1.0 somewhere
Back up the current library somewhere safe: sudo mv /usr/lib64/libblkid.so.1.1.0 /usr/lib64/libblkid.so.1.1.0.disabled
Copy in the library you just extracted: sudo cp /path/to/extracted/libblkid.so.1.1.0 /usr/lib64/
Update your boot menu: sudo clr-boot-manager update
Restore the library from backup: sudo mv /usr/lib64/libblkid.so.1.1.0.disabled /usr/lib64/libblkid.so.1.1.0

This is for a one-time cbm run, I expect the devs will fix the packages soon.
If something goes awfully wrong, run sudo eopkg install --reinstall util-linux to get everything back where it should be.
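The swap, update, and restore steps above can be sketched as a small script. This is only an illustration of the manual procedure, not a supported tool: the helper names run and cbm_with_old_blkid and the DRY_RUN variable are made up here, and downloading and extracting the old library is still assumed to be done by hand as described. By default the sketch only prints the commands; it executes them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Illustrative wrapper around the manual workaround; names are hypothetical.
LIB=/usr/lib64/libblkid.so.1.1.0

# Print each command; execute it only when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; fi
}

# $1: path to the libblkid.so.1.1.0 extracted from the 2.28 package
cbm_with_old_blkid() {
    run sudo mv "$LIB" "$LIB.disabled"   # back up the current library
    run sudo cp "$1" "$LIB"              # drop in the 2.28 copy
    run sudo clr-boot-manager update     # one-time boot menu update
    run sudo mv "$LIB.disabled" "$LIB"   # restore the current library
}

cbm_with_old_blkid /path/to/extracted/libblkid.so.1.1.0
```

The restore step runs even if clr-boot-manager fails, because the sketch deliberately does not use set -e; the eopkg reinstall mentioned above remains the fallback if anything is left in a bad state.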

Stupid question: did you try to execute clr-boot-manager a second time before you replaced the library?

Or better yet

sudo CBM_DEBUG=1 clr-boot-manager update

I tried it about five times before touching util-linux, including once on the linux-lts kernel, with a segfault every time. I only looked at the journalctl for two of them, but both were libblkid.so related errors.

My first util-linux attempt was to completely overwrite the package with the old version, which worked too, but clr-boot-manager complained it couldn't delete some module and kernel files. The second attempt used the steps listed above, and worked without complaining.

My clr-boot-manager still segfaults even on the latest kernel, with the same error 4 in libblkid.so.1.1.0 message.

the output with CBM_DEBUG=1:

root@solus # CBM_DEBUG=1 clr-boot-manager update
[INFO] cbm (src/bootman/bootman.c:L437): Current running kernel: 4.13.7-27.current
Segmentation fault (core dumped)
root@solus # journalctl -r --since "10 seconds ago"
-- Logs begin at Mon 2017-09-04 20:50:15 BST, end at Sat 2017-10-14 22:00:07 BST. --
Oct 14 22:00:07 solus-feb2017 kernel: audit: type=1701 audit(1508014807.402:87): auid=1000 uid=0 gid=0 ses=2 pid=9967 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11 res=1
Oct 14 22:00:07 solus-feb2017 kernel: clr-boot-manage[9967]: segfault at 10 ip 00007f6b62a681b1 sp 00007ffe318d07e0 error 4 in libblkid.so.1.1.0[7f6b62a50000+44000]
Oct 14 22:00:07 solus-feb2017 audit[9967]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=2 pid=9967 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11 res=1

I experience this too after latest updates.

oft  ~  139  sudo CBM_DEBUG=1 clr-boot-manager update
Password: 
[INFO] cbm (src/bootman/bootman.c:L437): Current running kernel: 4.13.5-24.current
[INFO] cbm (src/bootman/sysconfig.c:L98): Discovered UEFI ESP: /dev/disk/by-partuuid/42fb38c4-2a59-4f66-855e-1260f1a50e11
[INFO] cbm (src/bootman/sysconfig.c:L123): Fully resolved boot device: /dev/sda1
Segmentation fault
 oft  ~  sudo journalctl -r --since "10 seconds ago"
Password: 
-- Logs begin at Thu 2017-09-14 18:27:10 EEST, end at Sun 2017-10-15 11:56:42 EEST. --
Oct 15 11:56:42 kobol sudo[2715]: pam_systemd(sudo:session): Cannot create session: Already occupied by a session
Oct 15 11:56:42 kobol sudo[2715]: pam_unix(sudo:session): session opened for user root by (uid=0)
Oct 15 11:56:42 kobol sudo[2715]:      oft : TTY=pts/1 ; PWD=/home/otto ; USER=root ; COMMAND=/usr/bin/journalctl -r --since 10 seconds ago
Oct 15 11:56:35 kobol kernel: audit: type=1701 audit(1508057795.726:56): auid=1000 uid=0 gid=0 ses=2 pid=2689 comm="sudo" exe="/usr/bin/sudo" sig=11 res=1
Oct 15 11:56:35 kobol audit[2689]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=2 pid=2689 comm="sudo" exe="/usr/bin/sudo" sig=11 res=1
Oct 15 11:56:35 kobol sudo[2689]: pam_unix(sudo:session): session closed for user root
Oct 15 11:56:35 kobol kernel: audit: type=1701 audit(1508057795.723:55): auid=1000 uid=0 gid=0 ses=2 pid=2690 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11 res=1
Oct 15 11:56:35 kobol kernel: clr-boot-manage[2690]: segfault at 10 ip 00007f83c519f1b1 sp 00007ffd497c4580 error 4 in libblkid.so.1.1.0[7f83c5187000+44000]
Oct 15 11:56:35 kobol audit[2690]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=2 pid=2690 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11 res=1
Oct 15 11:56:35 kobol sudo[2689]: pam_systemd(sudo:session): Cannot create session: Already occupied by a session
Oct 15 11:56:35 kobol sudo[2689]: pam_unix(sudo:session): session opened for user root by (uid=0)
Oct 15 11:56:35 kobol sudo[2689]:      oft : TTY=pts/0 ; PWD=/home/otto ; USER=root ; ENV=CBM_DEBUG=1 ; COMMAND=/usr/bin/clr-boot-manager update
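The absolute ip values differ between the three kernel lines posted so far, which is expected when the library is loaded at a different base address on each run. Subtracting the mapping base (the first number in the square brackets) from ip gives an offset inside libblkid that can be compared across machines. A minimal sketch, with the address pairs copied from the logs above and a made-up function name:

```shell
#!/bin/sh
# Illustrative only: compare the faulting offset inside libblkid.so.1.1.0
# across the three reports (columns: ip, mapping base).
library_offsets() {
    while read -r ip base; do
        printf '0x%x\n' "$(( 0x$ip - 0x$base ))"
    done <<'EOF'
00007f505de2d1b1 7f505de15000
00007f6b62a681b1 7f6b62a50000
00007f83c519f1b1 7f83c5187000
EOF
}

library_offsets
```

All three pairs give 0x181b1, which suggests every reporter is crashing at the same instruction. With the util-linux debug symbols installed, an offset like this can usually be mapped to a function with addr2line -f -e /usr/lib64/libblkid.so.1.1.0 0x181b1, assuming the library's first segment is mapped at file offset 0.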
Justin triaged this task as High priority. (Oct 15 2017, 9:01 AM)

Also getting the same here.

joebonrichie raised the priority of this task from High to Unbreak Now!. (Oct 15 2017, 2:00 PM)

Desktop installation is borked, and I need to go out so I can't fix it right now. In the meantime, I recommend not rebooting and not installing kernel updates.

I need usable information to help me here. Please install util-linux-dbginfo, glibc-dbginfo and clr-boot-manager-dbginfo, then run it via gdb so I have a stack trace, as I'm unable to reproduce.
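For anyone who has not used gdb before, a backtrace can also be collected without an interactive session. The sketch below only builds and prints the command line; the function name gdb_bt_command is made up for illustration, and it does no quoting, so it is only suitable for simple arguments like the ones here. The gdb options used (-batch, -ex, --args) are standard.

```shell
#!/bin/sh
# Illustrative only. Install the debug symbols first, for example:
#   sudo eopkg install util-linux-dbginfo glibc-dbginfo clr-boot-manager-dbginfo

# Emit a non-interactive gdb command line for the given program and arguments.
# -batch exits when done; each -ex runs one gdb command, in order.
gdb_bt_command() {
    printf 'gdb -batch -ex run -ex backtrace --args'
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

gdb_bt_command /usr/bin/clr-boot-manager update
```

Running the printed command as root (for example via sudo) should end with the same kind of backtrace as an interactive run followed by backtrace at the (gdb) prompt.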

(gdb) run update
Starting program: /usr/bin/clr-boot-manager update

Program received signal SIGSEGV, Segmentation fault.
blkid_partlist_numof_partitions (ls=0x0)
    at libblkid/src/partitions/partitions.c:915
915	libblkid/src/partitions/partitions.c: No such file or directory.
(gdb) backtrace
#0  blkid_partlist_numof_partitions (ls=0x0)
    at libblkid/src/partitions/partitions.c:915
#1  0x0000000000409887 in cbm_probe_is_gpt (path=0x41a9a0 "\003")
    at src/lib/probe.c:164
#2  cbm_probe_path (path=path@entry=0x417860 "/") at src/lib/probe.c:245
#3  0x0000000000405f38 in cbm_inspect_root (path=path@entry=0x4109c7 "/", 
    image_mode=<optimized out>) at src/bootman/sysconfig.c:128
#4  0x0000000000403596 in boot_manager_set_prefix (self=self@entry=0x417800, 
    prefix=prefix@entry=0x4109c7 "/") at src/bootman/bootman.c:153
#5  0x0000000000402c63 in cbm_command_update (argc=<optimized out>, 
    argv=<optimized out>) at src/cli/ops/update.c:62
#6  0x0000000000402786 in main (argc=0, argv=0x7fffffffe628)
    at src/cli/main.c:242

I'm not getting exactly the same trace as the above:

(gdb) run update
Starting program: /usr/bin/clr-boot-manager update

Program received signal SIGSEGV, Segmentation fault.
blkid_partlist_numof_partitions (ls=0x0)
    at libblkid/src/partitions/partitions.c:915
915	libblkid/src/partitions/partitions.c: No such file or directory.

(gdb) backtrace
#0  blkid_partlist_numof_partitions (ls=0x0)
    at libblkid/src/partitions/partitions.c:915
#1  0x000000000040811b in get_legacy_boot_device (path=path@entry=0x417860 "/")
    at src/lib/files.c:218
#2  0x0000000000406005 in cbm_inspect_root (path=path@entry=0x4109c7 "/", 
    image_mode=<optimized out>) at src/bootman/sysconfig.c:81
#3  0x0000000000403596 in boot_manager_set_prefix (self=self@entry=0x417800, 
    prefix=prefix@entry=0x4109c7 "/") at src/bootman/bootman.c:153
#4  0x0000000000402c63 in cbm_command_update (argc=<optimized out>, 
    argv=<optimized out>) at src/cli/ops/update.c:62
#5  0x0000000000402786 in main (argc=0, argv=0x7fffffffec38)
    at src/cli/main.c:242
(gdb) run update
Starting program: /usr/bin/clr-boot-manager update

Program received signal SIGSEGV, Segmentation fault.
blkid_partlist_numof_partitions (ls=0x0) at libblkid/src/partitions/partitions.c:915
915	libblkid/src/partitions/partitions.c: No such file or directory.
(gdb) backtrace
#0  blkid_partlist_numof_partitions (ls=0x0) at libblkid/src/partitions/partitions.c:915
#1  0x000000000040811b in get_legacy_boot_device (path=path@entry=0x417860 "/") at src/lib/files.c:218
#2  0x0000000000406005 in cbm_inspect_root (path=path@entry=0x4109c7 "/", image_mode=<optimized out>)
    at src/bootman/sysconfig.c:81
#3  0x0000000000403596 in boot_manager_set_prefix (self=self@entry=0x417800, prefix=prefix@entry=0x4109c7 "/")
    at src/bootman/bootman.c:153
#4  0x0000000000402c63 in cbm_command_update (argc=<optimized out>, argv=<optimized out>) at src/cli/ops/update.c:62
#5  0x0000000000402786 in main (argc=0, argv=0x7fffffffe028) at src/cli/main.c:242

What's the commonality between your systems? Are you all on BIOS, UEFI? Encrypted, nay..?

I am on UEFI + Full Disk Encryption. My laptop is Thinkpad 13 (first generation, Skylake).

It would appear you're all hitting the same segfault in blkid_partlist_numof_partitions though..

In my case I have BIOS and full disk encryption, set up by Solus' installer

OK so that's 2 for FDE - everyone else FDE?

I have hit this problem several times today as well, after installing with the LVM partitioning option but without encryption

Eventually I reinstalled a fifth time today with LVM creation disabled, and now my kernel updated successfully, so it looks like a problem with LVM-created partitions ;)

I had encryption enabled in neither case, so I cannot verify that

OK that's also curious - FDE relies on LVM btw.

Are you all on current kernel series or lts ?

@ikey I experienced it both with 4.9.22-17.lts (an older installation source I had) and with 4.12.7-11.current using the latest ISO I could download

(btw I checked LVM but not FDE when it was not working, now it is working without any of those options checked during the installation)

No FDE for me, UEFI, running current

I'm using just current (did clean install when Solus 3 was released and I chose in the installer to use all disk)

Before 4.13.6 I had installed every kernel update without any problem (same clean install)

I'm on FDE, but I don't think I'm using UEFI. My kernel is linux-current, but the segfault also occurred on -lts yesterday.

Downgrading util-linux in the ISO makes the installer work again, but it looks like CBM somehow still died there... Unhelpful.

Yeah no FDE here, just LVM whole disk.

Brand new install from Solus 3 iso, once I upgrade I get the same thing. UEFI + Full Disk Encryption.

Kernel:

Linux w541 4.12.7-11.current #1 SMP Sun Aug 13 11:33:35 UTC 2017 x86_64 GNU/Linux

Debug output:

root@w541 ~ # sudo CBM_DEBUG=1 clr-boot-manager update
[INFO] cbm (src/bootman/bootman.c:L437): Current running kernel: 4.12.7-11.current
[INFO] cbm (src/bootman/sysconfig.c:L98): Discovered UEFI ESP: /dev/disk/by-partuuid/80af3b4a-5b9d-41a9-a445-174f8dd893dc
[INFO] cbm (src/bootman/sysconfig.c:L123): Fully resolved boot device: /dev/sda1
Segmentation fault

Steps taken (after each step a full reinstall was performed, then upgraded):

  1. Tried installs in both UEFI and Legacy BIOS modes
  2. Tried installs with LUKS / LVM crypt and without
  3. Tried zeroing partition tables via
dd if=/dev/zero of=/dev/sda bs=1M count=500
  4. Cleared out all previous UEFI entries via
sudo rm -f '/sys/firmware/efi/efivars/'* || sync

GDB Backtrace:

(gdb) run update
Starting program: /usr/bin/clr-boot-manager update

Program received signal SIGSEGV, Segmentation fault.
blkid_partlist_numof_partitions (ls=0x0) at libblkid/src/partitions/partitions.c:915
915	libblkid/src/partitions/partitions.c: No such file or directory.
(gdb) bt
#0  blkid_partlist_numof_partitions (ls=0x0) at libblkid/src/partitions/partitions.c:915
#1  0x0000000000409887 in cbm_probe_is_gpt (path=0x41a9a0 "\003") at src/lib/probe.c:164
#2  cbm_probe_path (path=path@entry=0x417860 "/") at src/lib/probe.c:245
#3  0x0000000000405f38 in cbm_inspect_root (path=path@entry=0x4109c7 "/", image_mode=<optimized out>)
    at src/bootman/sysconfig.c:128
#4  0x0000000000403596 in boot_manager_set_prefix (self=self@entry=0x417800, prefix=prefix@entry=0x4109c7 "/")
    at src/bootman/bootman.c:153
#5  0x0000000000402c63 in cbm_command_update (argc=<optimized out>, argv=<optimized out>) at src/cli/ops/update.c:62
#6  0x0000000000402786 in main (argc=0, argv=0x7fffffffe778) at src/cli/main.c:242

blkid output:

/dev/sda1: UUID="0128-B9FF" TYPE="vfat" PARTUUID="80af3b4a-5b9d-41a9-a445-174f8dd893dc"
/dev/sda2: UUID="d95a3b9d-79c9-4b29-b635-9e4ba9cc88ed" TYPE="crypto_LUKS" PARTUUID="a7635715-b96d-4cab-a64c-e3412ecd07c5"
/dev/mapper/luks-d95a3b9d-79c9-4b29-b635-9e4ba9cc88ed: UUID="IHSQ27-mJzR-60n9-ZcOT-is3I-SGFX-mAcmlB" TYPE="LVM2_member"
/dev/mapper/SolusSystem-Swap: UUID="8872c805-46ac-4b20-9073-b435c7bd6328" TYPE="swap"
/dev/mapper/SolusSystem-Root: UUID="7da142e7-1120-45b2-8f54-b85cd119c54a" TYPE="ext4"

FDE here, experiencing this as well on LTS. UEFI w/ Legacy (grub) boot. I'm also not getting the new kernel option in Grub, trying to sort that out as well.

-- Logs begin at Sun 2017-07-02 05:46:14 PDT, end at Mon 2017-10-16 08:55:40 PDT. --
Oct 16 08:55:40 solus-thinkpad sudo[2886]: pam_systemd(sudo:session): Cannot create session: Already occupied by a session
Oct 16 08:55:40 solus-thinkpad sudo[2886]: pam_unix(sudo:session): session opened for user root by (uid=0)
Oct 16 08:55:40 solus-thinkpad sudo[2886]: mcritchlow : TTY=pts/0 ; PWD=/etc/kernel ; USER=root ; COMMAND=/usr/bin/journalctl -r --since 10 seconds ago
Oct 16 08:55:38 solus-thinkpad kernel: audit: type=1701 audit(1508169338.580:142): auid=1000 uid=0 gid=0 ses=2 pid=2882 comm="sudo" exe="/usr/bin/sudo" sig=11
Oct 16 08:55:38 solus-thinkpad audit[2882]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=2 pid=2882 comm="sudo" exe="/usr/bin/sudo" sig=11
Oct 16 08:55:38 solus-thinkpad sudo[2882]: pam_unix(sudo:session): session closed for user root
Oct 16 08:55:38 solus-thinkpad kernel: audit: type=1701 audit(1508169338.577:141): auid=1000 uid=0 gid=0 ses=2 pid=2883 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11
Oct 16 08:55:38 solus-thinkpad kernel: clr-boot-manage[2883]: segfault at 10 ip 00007effac5f71b1 sp 00007ffd1fa5eee0 error 4 in libblkid.so.1.1.0[7effac5df000+44000]
Oct 16 08:55:38 solus-thinkpad audit[2883]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=2 pid=2883 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11
Oct 16 08:55:38 solus-thinkpad sudo[2882]: pam_systemd(sudo:session): Cannot create session: Already occupied by a session
Oct 16 08:55:38 solus-thinkpad sudo[2882]: pam_unix(sudo:session): session opened for user root by (uid=0)
Oct 16 08:55:38 solus-thinkpad sudo[2882]: mcritchlow : TTY=pts/0 ; PWD=/etc/kernel ; USER=root ; ENV=CBM_DEBUG=1 ; COMMAND=/usr/bin/clr-boot-manager update
Oct 16 08:55:33 solus-thinkpad budgie-wm.desktop[1331]: Window manager warning: Buggy client sent a _NET_ACTIVE_WINDOW message with a timestamp of 0 for 0x3a00064 (test.out ()
Oct 16 08:55:31 solus-thinkpad kernel: audit: type=1701 audit(1508169331.499:140): auid=1000 uid=0 gid=0 ses=2 pid=2870 comm="sudo" exe="/usr/bin/sudo" sig=11
Oct 16 08:55:31 solus-thinkpad audit[2870]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=2 pid=2870 comm="sudo" exe="/usr/bin/sudo" sig=11
Oct 16 08:55:31 solus-thinkpad sudo[2870]: pam_unix(sudo:session): session closed for user root
Oct 16 08:55:31 solus-thinkpad kernel: audit: type=1701 audit(1508169331.496:139): auid=1000 uid=0 gid=0 ses=2 pid=2871 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11
Oct 16 08:55:31 solus-thinkpad kernel: clr-boot-manage[2871]: segfault at 10 ip 00007f9d885b51b1 sp 00007fff403721a0 error 4 in libblkid.so.1.1.0[7f9d8859d000+44000]
Oct 16 08:55:31 solus-thinkpad audit[2871]: ANOM_ABEND auid=1000 uid=0 gid=0 ses=2 pid=2871 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11
Oct 16 08:55:31 solus-thinkpad sudo[2870]: pam_systemd(sudo:session): Cannot create session: Already occupied by a session
Oct 16 08:55:31 solus-thinkpad sudo[2870]: pam_unix(sudo:session): session opened for user root by (uid=0)
Oct 16 08:55:31 solus-thinkpad sudo[2870]: mcritchlow : TTY=pts/0 ; PWD=/etc/kernel ; USER=root ; ENV=CBM_DEBUG=1 ; COMMAND=/usr/bin/clr-boot-manager update

FDE and UEFI, affects me too.

Corresponding dmesg output:

[  231.121430] clr-boot-manage[2170]: segfault at 10 ip 00007feb349781b1 sp 00007ffe6cf78a30 error 4 in libblkid.so.1.1.0[7feb34960000+44000]
[  231.121463] audit: type=1701 audit(1508253323.990:59): auid=1000 uid=0 gid=0 ses=2 pid=2170 comm="clr-boot-manage" exe="/usr/bin/clr-boot-manager" sig=11 res=1
[  231.127236] audit: type=1701 audit(1508253323.996:60): auid=1000 uid=0 gid=0 ses=2 pid=2169 comm="sudo" exe="/usr/bin/sudo" sig=11 res=1

The common thread here is LVM - tryna figure out what's going on.

Can repro this in a VM from Solus 3 Budgie, kernel isn't getting promoted on LVM

ikey changed the task status from Open to In Progress. (Oct 18 2017, 1:32 PM)
ikey claimed this task.

Very interesting, even get-timeout causes a segfault.

OK so the problem is cbm_blkid_probe_get_partitions is actually returning NULL..

Gotcha ya slippery bastard

Screenshot from 2017-10-18 14-45-12.png (1×1 px, 1 MB)

Screenshot from 2017-10-18 14-46-00.png (1×1 px, 2 MB)

Merged upstream, awaiting 1.5.5 release (but of course github broke too)

Silly question, but how does one effect the new changes put up on github?

Due to arrive in Friday's sync, which will also be accompanied by the latest kernel builds to automatically restore kernel consistency.

If you want/need this right now:

sudo eopkg it --ignore-dependency https://packages.solus-project.com/unstable/c/clr-boot-manager/clr-boot-manager-1.5.5-17-1-x86_64.eopkg
sudo clr-boot-manager update

Thanks, this worked perfectly for me.

Apologies for the issue, honestly. We're gonna need to build some test harnesses around this stuff..

Thanks again @ikey ! Outstanding work!