
Stuck in login loop
Closed, Resolved (Public)

Description

A few weeks ago, I booted into Solus (on a dual-boot machine) only to find that I could no longer log in. I was presented with the greeter as usual, but after I entered my password and pressed Enter, the screen would go black for a few seconds, as if it were about to show the desktop, and then return right back to the greeter.

I cannot remember exactly what happened before this, but I think I installed the weekly updates the last time I used Solus.

Things I have tried from a TTY after reading up, all to no avail:

  • eopkg up
  • eopkg check | grep Broken | awk '{print $4}' | xargs eopkg it --reinstall && eopkg up
  • eopkg rdb && eopkg ur -f && eopkg up --ignore-comar && eopkg cp
  • usysconf run -f
  • eopkg hs -t (this didn't work because the update I am trying to roll back to is the one where youtube-dl was removed from the repos. Every attempt fails because youtube-dl cannot be found. When I try to reinstall the old packages individually to avoid this, I get errors that none of the packages can be found.)

I also checked the lightdm logs, and there seemed to be no errors (though I don't quite know what to look for). Additionally, I checked systemctl status lightdm, and it reported the service as active.

I should add that I'm using the Budgie DE. Somebody mentioned that might be an important detail.

Event Timeline

I had a similar issue a while ago after a kernel update (linux-current package).
In my case I was booting into the wrong kernel, which I discovered by running the following command from https://getsol.us/articles/troubleshooting/general-troubleshooting/en/#display-manager-wont-start

eopkg info linux-current | head -n2; uname -a

Can you check if the above command (or the one for linux-lts) shows two matching kernel versions?
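
For anyone wanting to script that comparison, here is a minimal sketch. It assumes the running release reported by uname -r is the installed package version optionally followed by a stream suffix such as ".current" or ".lts"; that format is an assumption, not something confirmed in this thread. The function name kernel_versions_match is hypothetical, and the installed version still has to be read from the eopkg info output by hand.

```shell
#!/bin/sh
# Compare an installed kernel package version (e.g. "5.6.19-159") with a
# running kernel release (e.g. the output of `uname -r`).
# Assumption: the running release equals the package version, optionally
# followed by ".<stream>" (for example ".current" or ".lts").
kernel_versions_match() {
    installed="$1"
    running="$2"
    case "$running" in
        "$installed"|"$installed".*) return 0 ;;  # exact match or match plus suffix
        *) return 1 ;;                            # anything else is a mismatch
    esac
}
```

Possible usage, with the installed version typed in manually: kernel_versions_match "5.6.19-159" "$(uname -r)" || echo "booted kernel does not match the installed one"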

Thanks for this! After running this command, it does turn out that there's a mismatch between the kernel versions. I am running 5.6.13-153, but the latest I have installed is 5.6.19-159.

I ran sudo CBM_DEBUG=1 clr-boot-manager update as stated in the article, and everything seemed to be fine. I checked out my ESP and both kernels are there. 5.6.19-159 seems to be set as the default, so I really don't know why it wouldn't be booting into it.

I also checked efibootmgr to make sure the entry in the boot menu for Solus is pointing to the bootloader, which it is.

Any ideas?

Could be a similar issue to T9327. But I would wait until someone with more experience confirms my guess.
Maybe @DataDrake would be interested in this.

Meanwhile, you can check whether your system shows symptoms similar to those described in that issue, such as a non-empty /boot directory.
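
A quick way to test for that symptom from a TTY could look like the sketch below. The helper name dir_is_empty is hypothetical, and it only reports whether the directory has any entries; it says nothing about what those entries are.

```shell
#!/bin/sh
# Succeed when the given path is a directory with no entries
# (hidden files count as entries because of `ls -A`).
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}
```

Possible usage: dir_is_empty /boot || echo "/boot is not empty"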

Thanks! I had exactly the same issues with my /boot directory being filled with the newer kernels, while my EFI partition had the outdated kernels.

Followed the instructions in that issue and it seemed to work well. My EFI partition now looks how it's supposed to, and my /boot directory is now empty. Unfortunately, I am still having the login issue.

Did the above steps completely solve the login issue for you?

liontiger23 added a comment. Edited Nov 25 2020, 4:13 AM

So, the way I initially worked around the issue, before fixing the boot partition, was to manually copy the configs for the new kernel from the /boot directory to the boot partition and then switch to the new kernel via clr-boot-manager.
In your case, manual copying is likely not needed, since you have already fixed the boot partition.

Right now this is what clr-boot-manager list-kernels shows on my system:

$ sudo clr-boot-manager list-kernels
* com.solus-project.current.5.6.19-159
  com.solus-project.current.5.6.19-158

with the current 159 release selected.
Can you check if the new kernel shows up on your system?
If it shows up but is not selected (not marked with the *), then you can try to boot into it by following https://getsol.us/articles/troubleshooting/general-troubleshooting/en/#boot-into-previous-kernel.
Then, if it boots successfully and you can log in to the DE, you can switch permanently to this kernel by running:

sudo clr-boot-manager update

My understanding is that it switches the default kernel to the currently booted one.
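
To pick out the selected entry from the list-kernels output shown above without reading it by eye, something like the following sketch may help. It assumes the selected kernel is the only line that starts with "*", as in the sample output; selected_kernel is a hypothetical helper name.

```shell
#!/bin/sh
# Read `clr-boot-manager list-kernels` style output on stdin and print
# the entry marked with a leading "*", without the marker itself.
# Assumption: the output looks like the sample in this thread.
selected_kernel() {
    sed -n 's/^\* *//p'
}
```

Possible usage: sudo clr-boot-manager list-kernels | selected_kernel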

If the latest kernel doesn't show up in the list, then I'd try reinstalling linux-current using

sudo eopkg it --reinstall linux-current

and then trying to boot into the system.

While troubleshooting, I also set the boot timeout to 5 seconds so that I didn't need to mash any buttons to access the boot menu:

sudo clr-boot-manager set-timeout 5

At first, clr-boot-manager list-kernels listed both kernels I had installed (5.6.13-153 and 5.6.19-159), with the latest one selected. Unfortunately, even so, I wasn't able to log in; I was always returned to the login screen after a few seconds. (As an aside, my ESP also held a third kernel, something like 5.4.6. For some reason, that one wasn't showing up in the output of list-kernels.)

I ran eopkg it --reinstall linux-current and then clr-boot-manager update to see if that would do anything. Upon running list-kernels again, there was now only 5.6.19-159 listed. I checked my ESP and it seems the file for 5.6.13-153 had been removed from there too. Still not able to log in after this step.

Then I tried running eopkg it --reinstall for all my NVIDIA driver packages to see if it would recompile them against the current kernel. Again, no success.

Finally, at the bootloader, I selected the older kernel that was listed (the 5.4.6 one mentioned above) to see if maybe I could boot into that. All I got was a blank screen (I suspect that's the expected result).

Very strange indeed. Should I maybe try uninstalling my NVIDIA drivers and seeing if that does anything?

You can also check if the installed drivers are for the -current kernel and not for -lts -- see the table at https://getsol.us/articles/troubleshooting/boot-management/en/#installing-an-alternative-kernel.

Yeah, all my drivers are for the current stream of kernels. Is it maybe worth installing the lts kernels, uninstalling all my old NVIDIA drivers and installing the latest lts ones?

Is it maybe worth installing the lts kernels, uninstalling all my old NVIDIA drivers and installing the latest lts ones?

You would also need to install linux-lts and boot into that kernel to try.

Removing drivers completely and rebooting might also be worth trying.

However, if you were able to boot into 5.6.19-159 and the GPU drivers are all for that kernel, then this might not be a GPU-related issue at all.
Please double-check if /var/log/lightdm/* logs show any errors.
There might also be issues with the ownership or permissions of .Xauthority or the /tmp directory that prevent the DE from starting up -- https://www.maketecheasier.com/fix-ubuntu-login-loop/
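
As a rough sketch of that ownership and permission check: the helpers below print and compare the numeric owner and octal mode of a path. They assume GNU coreutils stat (the -c option), and the names owner_and_mode and has_owner_and_mode are hypothetical. The expected values in the usage line (root-owned /tmp with mode 1777, user-owned .Xauthority with mode 600) are the usual defaults, not something taken from this thread.

```shell
#!/bin/sh
# Print "<numeric owner uid> <octal mode>" for a path.
# Assumption: GNU coreutils stat, which supports the -c format option.
owner_and_mode() {
    stat -c '%u %a' "$1"
}

# Succeed only when the path has the expected owner uid ($2) and mode ($3).
has_owner_and_mode() {
    [ "$(owner_and_mode "$1")" = "$2 $3" ]
}
```

Possible usage: has_owner_and_mode /tmp 0 1777 || echo "/tmp looks wrong"; has_owner_and_mode "$HOME/.Xauthority" "$(id -u)" 600 || echo ".Xauthority looks wrong"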

I'm not too sure what I'm looking for, but there don't seem to be any errors. The only thing that jumped out at me was an xkbcomp error in the X server logs, but even that one is reported as non-fatal. I've attached all the logs I found in /var/log/lightdm (minus the ones suffixed with .old).

As for the permissions on .Xauthority and /tmp, everything was as it should be. I ran the steps in the article you linked just to be sure, but still no dice, unfortunately.

I'll try to install the lts stream kernel now and change my NVIDIA drivers over for that. I'll report back with how that goes.

Just finished trying to install the latest lts kernel and corresponding NVIDIA drivers to see if it would make a difference, but the exact same problem happens regardless.

From the lightdm.log:

[+51.51s] DEBUG: Session pid=1425: Logging to .xsession-errors
[+51.79s] DEBUG: Activating VT 7
[+51.79s] DEBUG: Activating login1 session 1
[+51.79s] DEBUG: Seat seat0 changes active session to 
[+51.79s] DEBUG: Seat seat0 changes active session to 1
[+51.79s] DEBUG: Session 1 is already active
[+61.38s] DEBUG: Session pid=1425: Exited with return value 2

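The last line here shows that the session process (pid=1425) exited with a non-zero return value about ten seconds after starting, which indicates that something went wrong in it.
Can you check whether the ~/.xsession-errors file has anything in it?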
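
For a longer log, lines like the last one can be filtered out with a small sketch such as this. It assumes session exits are always logged in the "Session pid=N: Exited with return value M" form seen in the excerpt above; failed_sessions is a hypothetical helper name.

```shell
#!/bin/sh
# Read lightdm.log content on stdin and print "<pid> <return value>" for
# every session that exited with a non-zero return value.
# Assumption: exit lines match the format shown in the excerpt above.
failed_sessions() {
    sed -n 's/.*Session pid=\([0-9]*\): Exited with return value \([0-9]*\)$/\1 \2/p' |
        awk '$2 != 0'
}
```

Possible usage: failed_sessions < /var/log/lightdm/lightdm.log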

Thank you so much for all the help, that was it! It turns out the GitHub CLI (gh) completions I was loading in my .bashrc were malformed, which prevented the file from loading.
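
For anyone who lands here with the same symptom, a startup file can be checked from a TTY before attempting to log in again. The sketch below relies on bash -n, which parses a file without executing it. It only catches syntax errors, so a script that parses fine but fails at runtime would still pass; rc_syntax_ok is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed when bash can parse the given file without syntax errors.
# `bash -n` reads the commands but does not execute them.
rc_syntax_ok() {
    bash -n "$1" 2>/dev/null
}
```

Possible usage: rc_syntax_ok "$HOME/.bashrc" || echo ".bashrc has a syntax error"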

jacobprudhomme closed this task as Resolved. Nov 27 2020, 5:02 PM

You are welcome!
Glad I was able to help.