
Boot fails with black screen and flashing cursor, every other boot
Closed, Resolved (Public)

Description

*******Workaround*******
Remove /var/log/journal and system boots fine - See Dec 30 posting


Symptom:

After powering up and choosing the default GRUB menu option (the first item), the system appears to boot but then stalls. A black screen with a cursor flashing in the upper left corner remains until I force a shutdown with the power button.

After powering back up, system boots normally. This behavior appears consistent, and agrees with other social media posts. See reddit/r/solusproject.

Summary: boot hangs every other attempt.

Output of

journalctl -b -p 3

shows this:

Logs begin at Sat 2017-10-14 10:41:35 CDT, end at Sun 2017-11-05 10:51:12 CST. --
Nov 05 10:35:24 solus kernel: Error parsing PCC subspaces from PCCT
Nov 05 10:35:24 solus kernel: ACPI Error: Needed type [Reference], found [Integer] ffff88040bddcd38 (20170531/exresop-103)
Nov 05 10:35:24 solus kernel: ACPI Exception: AE_AML_OPERAND_TYPE, While resolving operands for [OpcodeName unavailable] (20170531/dswexec-461)
Nov 05 10:35:24 solus kernel: ACPI Error: Method parse/execution failed \_PR.CPU0._PDC, AE_AML_OPERAND_TYPE (20170531/psparse-550)
Nov 05 10:35:38 solus1 kernel: tpm tpm0: A TPM error (6) occurred attempting to read a pcr value
Nov 05 10:35:55 solus1 lightdm[977]: gkr-pam: couldn't unlock the login keyring.
Nov 05 10:35:57 solus1 pulseaudio[1175]: [pulseaudio] pid.c: Daemon already running.

Note: journalctl does not preserve logs from the failing boots.

No other problems encountered.

System: Solus Budgie fully updated as of this date.
Hardware: Lenovo Thinkpad T440p Intel graphics

Event Timeline

Got this too with AMD APU A10-5750M.

Justin triaged this task as High priority. Nov 6 2017, 11:17 PM
Justin edited projects, added Hardware; removed Lacks Project.

For me the problem is that I have radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1 in my kernel parameters. This forces AMDGPU, which doesn't work with my GPU. If I remove them, it boots fine. However, if I run sudo clr-boot-manager, they come back. I have looked in /etc/kernel/cmdline.d and there are no settings for this there. I'm running an EFI system. Does anyone have any idea here?
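
One way to track down where parameters like these come from is to compare the running kernel's command line with the files the boot manager reads. The following is only a sketch: it assumes clr-boot-manager collects fragments from /etc/kernel/cmdline, /etc/kernel/cmdline.d and /usr/share/kernel/cmdline.d (the last path in particular is an assumption), and list_cmdline_sources is a hypothetical helper name.

```shell
#!/bin/sh
# list_cmdline_sources [PATH...]
# Print every non-empty line found in the given files/directories,
# prefixed with the file it came from.
# The default paths are ASSUMPTIONS about where clr-boot-manager looks.
list_cmdline_sources() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/kernel/cmdline /etc/kernel/cmdline.d /usr/share/kernel/cmdline.d
    fi
    for src in "$@"; do
        [ -e "$src" ] || continue          # skip paths that do not exist
        grep -rH -- . "$src" 2>/dev/null   # "file:content" for each non-empty line
    done
    return 0
}

# Usage:
#   cat /proc/cmdline        # parameters the running kernel was booted with
#   list_cmdline_sources     # files that may contribute parameters
```

If a parameter shows up in /proc/cmdline but in none of the files under /etc/kernel, it is probably being shipped by a package rather than set by hand.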

It appears that the boot hang has been fixed, maybe by the recent GTK upgrade. Four back-to-back power-off/power-on cycles have failed to make the system hang again. Will report back if that changes.

It's not fixed for me.

Still an issue for me too :(

On November 9th, the problem "went away". But, on November 27th, it's back.

My system: Lenovo Thinkpad with Integrated graphics, Solus Budgie fully updated. "journalctl -b -1 -p 3" reports no error, because I think the log is not written yet on the failing boots. System boots fine when it does.
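
To check whether a hung boot left any trace in the journal at all, the number of recorded boots can be compared before and after a failure. This is just a sketch; count_boots is a hypothetical helper that counts the numbered rows printed by journalctl --list-boots.

```shell
#!/bin/sh
# count_boots: read `journalctl --list-boots` output on stdin and print how
# many boots it lists. Only rows whose first column is an integer offset
# (0, -1, -2, ...) are counted, so a header row is ignored.
count_boots() {
    awk '$1 ~ /^-?[0-9]+$/ { n++ } END { print n + 0 }'
}

# Usage:
#   journalctl --list-boots | count_boots
#   journalctl -b -1 -p 3      # errors from the previous *recorded* boot
```

If the count does not go up after a hang, the failed boot was never written to disk, and -b -1 points at the last successful boot instead.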

What's the hardware commonality here? Also are you all on UEFI or BIOS?
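
For anyone unsure which one they are on: a quick check is whether /sys/firmware/efi exists, since the kernel only exposes that directory when booted via UEFI. A small sketch (firmware_mode is a hypothetical helper name; the optional argument exists only so the check can be pointed at another directory):

```shell
#!/bin/sh
# firmware_mode [EFI_DIR]
# Print "UEFI" if the EFI sysfs directory exists, otherwise "BIOS".
firmware_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

firmware_mode
```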

UEFI here.
Laptop with HD530 and GeForce GTX 960M.


UEFI here
Intel HD Graphics

Someone on reddit said booting with vga=current helped them, does it help you guys..?

Gabochuky added a subscriber: Gabochuky. Edited Dec 8 2017, 7:27 PM

I also have this problem, but it is very inconsistent: maybe 1 out of 10 times my PC hangs. I have a Toshiba laptop with an AMD A10 APU.

lebjortvedt added a comment. Edited Dec 8 2017, 7:40 PM
In T4964#93919, @ikey wrote:

Someone on reddit said booting with vga=current helped them, does it help you guys..?

I tried adding it to the end of the line -> black screen.
Added it to the beginning of the line -> black screen.
Added it right before «resume» -> 2 successful boots, then 2x black screen.

It seems so annoyingly random.

stigarn added a subscriber: sunnyflunk. Edited Dec 9 2017, 6:52 AM
In T4964#93900, @ikey wrote:

What's the hardware commonality here? Also are you all on UEFI or BIOS?

I'm using an AMD APU A10-5750M and for me this problem is because of this kernel line: radeon.si_support=0 radeon.cik_support=0 amdgpu.si_support=1 amdgpu.cik_support=1
@sunnyflunk provided a solution in this bug report https://dev.solus-project.com/T4992
And I'm using UEFI.

The following AMD-related options were enabled on my system, and I set them to:
amdgpu.si_support=0
amdgpu.cik_support=0

Out of 5 boots I had 3 successful boots and 2 black screens.

I changed the settings at boot time as I don’t know how to do it permanently.

Yeah those won't affect it, you're still getting blackscreens. Actually think this might be a watchdog issue.

What's a "watchdog issue"?

bwat47 added a subscriber: bwat47. Edited Dec 17 2017, 11:39 PM

I've run into this on one of my machines (an Ivy Bridge Intel NUC).

I've never run into it on my laptop (Dell XPS 13, Kaby Lake).

The main difference I can think of is that the NUC has a slow 5400 rpm HDD, whereas the laptop has an SSD.

Both are UEFI boot.

The NUC is running the MATE edition; the laptop is running Budgie.

waddon1 added a subscriber: waddon1. Edited Dec 18 2017, 12:30 PM

I've also encountered this a few times on my laptop
Intel
UEFI Boot
Happened on Budgie, Gnome and Mate

I believe this is the same issue as https://dev.solus-project.com/T2559, and someone there (as well as on Reddit) says that the issue disappeared after adding a new boot parameter.
I tried adding it myself and updating clr-boot-manager, but it still occurs :(

solus-steve updated the task description. Dec 25 2017, 7:29 AM
solus-steve updated the task description.

It does not appear to be totally consistent with "every other boot". That is usually the case, but I can have multiple successful boots in a row, as well as multiple unsuccessful boots in a row. Please let me know if I can provide log files or anything else that can help resolve this.

After some googling around I tried to add "nolapic" to the boot options yesterday, and after that I have had ZERO black screens.
I tried changing it to "noapic", then I had a black screen - changed back to "nolapic" and everything is ok for the last 15-20 boots.

It CAN be a coincidence, and I'm finding it hard to grasp exactly what this change does - but it could also be a functioning workaround.

cev added a subscriber: cev. Dec 29 2017, 6:27 PM

I fear this may not be the place to ask this, but, is it possible to explain how you add this to the boot options? I'm brand new to linux and I am having the same problem as you.

Thank you very much.

In T4964#97654, @cev wrote:

I fear this may not be the place to ask this, but, is it possible to explain how you add this to the boot options? I'm brand new to linux and I am having the same problem as you.
Thank you very much.

You have to do 3 things:

  1. Edit the file /etc/kernel/cmdline as root (create it if it doesn't exist).
  2. Add the option(s) you need. For me, the only word in this file is "nolapic".
  3. Update the boot manager: sudo clr-boot-manager update.
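
The three steps above could be scripted roughly as follows. This is a sketch, not a tested procedure: add_kernel_param is a hypothetical helper, and it assumes clr-boot-manager accepts one parameter per line in /etc/kernel/cmdline.

```shell
#!/bin/sh
# add_kernel_param PARAM [FILE]
# Append PARAM to the cmdline file unless it is already listed there.
# FILE defaults to /etc/kernel/cmdline; writing one parameter per line is
# an ASSUMPTION about what clr-boot-manager accepts.
add_kernel_param() {
    param="$1"
    file="${2:-/etc/kernel/cmdline}"
    touch "$file"                                  # step 1: create if missing
    # Split existing content into one word per line and look for an exact match.
    if ! tr ' ' '\n' < "$file" | grep -qxF -- "$param"; then
        printf '%s\n' "$param" >> "$file"          # step 2: add the option
    fi
}

# Usage (as root):
#   add_kernel_param nolapic
#   clr-boot-manager update                        # step 3: regenerate boot entries
```

Running it twice leaves the file unchanged the second time, so it is safe to repeat.
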
solus-steve added a comment. Edited Dec 30 2017, 8:53 AM

Regarding consistency of black screen issue: With one exception, the behavior has been "first boot fails, second boot succeeds", which is how it behaves now.

Regarding adding boot parameters: Adding nolapic causes either a kernel crash or a hung black screen with "Disabling IRQ #11" message. Adding "noapic" has no effect.

Regarding Ikey's comment about commonality, I am on BIOS, and Intel graphics.

The consistent workaround for this issue appears to be removing the directory /var/log/journal/. This means either deleting it or moving it to journal.old/, as I did. Once that is done, my experience is that the boot issue disappears. YMMV. Obviously, removing the persistent systemd journal directory means the journal is no longer persistent.
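
A sketch of the "move it aside" variant of this workaround (set_aside_journal is a hypothetical helper; it refuses to run if a journal.old already exists, so nothing gets overwritten or nested):

```shell
#!/bin/sh
# set_aside_journal [DIR]
# Rename the persistent journal directory to DIR.old.
# DIR defaults to /var/log/journal. Returns non-zero if there is nothing
# to move or if DIR.old already exists.
set_aside_journal() {
    dir="${1:-/var/log/journal}"
    if [ ! -d "$dir" ]; then
        echo "no journal directory at $dir" >&2
        return 1
    fi
    if [ -e "$dir.old" ]; then
        echo "$dir.old already exists, not touching anything" >&2
        return 1
    fi
    mv -- "$dir" "$dir.old"
}

# Usage (as root), then reboot:
#   set_aside_journal
```

With journald's default Storage=auto setting, logs are kept only in /run/log/journal while /var/log/journal does not exist, so they are lost at shutdown.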

Update 12/31/2017: Another day of testing confirms my workaround is effective across dozens of boots.

Update 1/4/2018: Another day of use and there have been no boot failures. Workaround seems effective but a real fix is still needed...

Update: 9 Jan 2018: Still boots fine with workaround.

Update: 3 Feb 2018: BUG STILL EXISTS. But boots fine with workaround (deleting journal directory)

cev added a comment. Dec 31 2017, 1:15 AM

Thank you for taking the time to detail your changes, however it doesn't seem to have the same success for me.

What could be causing the issue? I don't know where to start to be honest.

cev added a comment. Dec 31 2017, 4:47 AM

I managed to narrow down the problem to the reboot process. If I shut down and then turn it on, everything is normal regardless of which cmdline parameters I input.

I'm not sure what to do with this information, but perhaps it can help.

solus-steve updated the task description. Jan 2 2018, 7:05 PM
bwat47 added a comment. Jan 6 2018, 5:53 PM

I came across this old fedora issue that sounds really similar to this: https://bugzilla.redhat.com/show_bug.cgi?id=1006386

solus-steve added a comment. Edited Jan 20 2018, 7:32 PM
In T4964#98900, @bwat47 wrote:

I came across this old fedora issue that sounds really similar to this: https://bugzilla.redhat.com/show_bug.cgi?id=1006386

I read that issue and it does seem very similar. Thanks! I went ahead and recreated my systemd journal directory and rebooted. systemd began saving entries there, and after many reboots my ticketed problem had not returned. Update 3 Feb 2018: Problem returned. Had to delete the journal directory to boot every time.

This makes me wonder whether Solus hangs while vacuuming the journal files at some point.
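
If the journal's size or state is the trigger, it may be worth recording how large the persistent journal is when a hang occurs. A sketch (journal_size_kb is a hypothetical helper; the journalctl options in the comments are standard, but the 100M limit is only an example value):

```shell
#!/bin/sh
# journal_size_kb [DIR]
# Print the on-disk size of the journal directory in KiB, or 0 if it is absent.
journal_size_kb() {
    dir="${1:-/var/log/journal}"
    if [ -d "$dir" ]; then
        du -sk "$dir" | cut -f1
    else
        echo 0
    fi
}

journal_size_kb

# The same question, asked through systemd's own tooling:
#   journalctl --disk-usage          # space used by all journal files
#   journalctl --verify              # check journal files for corruption
#   journalctl --vacuum-size=100M    # trim archived files (example limit)
```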

Is this still an issue on 4.18?

To be honest I do not know. It was an issue for so long that I decided to leave Solus, but I am willing to try it again.

Is this still an issue on 4.18?

How can I install 4.18 now and test? I have installed the latest updates and I only got 4.17.

@stigarn 4.18 is in unstable now; either switch the repo or wait until the next sync on Friday to test.

Sorry, got my Fridays mixed up. Let me know after the Friday sync, thanks!

I've installed 4.18 from unstable now on my laptop which has this problem and it still exists.

This still happens frequently on my XPS 13 9360, even with a fresh Solus 3.9999 install.

bwat47 added a comment. Nov 7 2018, 3:29 PM
This comment was removed by bwat47.
alecbcs added a subscriber: alecbcs.

I can confirm that I'm still experiencing this on a fresh Solus 3.9999 install with the latest updates. (Dell XPS 9560, UEFI and Intel Graphics)

@bwat47 Thanks for the suggestion. I actually ended up implementing this fix, and it appears to have fixed the issue! Hopefully we can fix this in the repository soon.

DataDrake closed this task as Resolved. Dec 27 2018, 8:19 PM
DataDrake claimed this task.
DataDrake lowered the priority of this task from High to Needs More Info.

I believe this is resolved via the other fix I made to the nvidia packages. If it's still an issue on Intel, let's track it in a different task.