
Nvidia driver not being used after latest nvidia-glx-driver update
Closed, Resolved (Public)

Description

Since the update of nvidia-glx-driver to 410.X my system is using llvmpipe instead of the Nvidia driver. I'm using a GTX 970.

$ sudo glxinfo | grep renderer
    GLX_MESA_multithread_makecurrent, GLX_MESA_query_renderer, 
    GLX_EXT_visual_rating, GLX_MESA_copy_sub_buffer, GLX_MESA_query_renderer, 
Extended renderer info (GLX_MESA_query_renderer):
OpenGL renderer string: llvmpipe (LLVM 6.0, 256 bits)
$ lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
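
The renderer check above can be scripted. This is a minimal sketch, assuming `glxinfo` prints an `OpenGL renderer string:` line as in the output above; the function name `classify_renderer` is made up for illustration.

```shell
# Read `glxinfo` output on stdin and classify the active OpenGL renderer.
# Prints "llvmpipe" if the renderer string mentions llvmpipe (software
# rendering), "other" for any other renderer, and "unknown" if no renderer
# string is present in the input.
classify_renderer() {
    renderer=$(grep -m1 'OpenGL renderer string:' || true)
    case "$renderer" in
        '')         echo unknown ;;
        *llvmpipe*) echo llvmpipe ;;
        *)          echo other ;;
    esac
}
```

Usage would be `glxinfo | classify_renderer`; a result of `llvmpipe` on a machine with an NVIDIA card suggests the proprietary GLX module was not loaded.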

in dmesg:

[    1.937683] nvidia: loading out-of-tree module taints kernel.
[    1.937693] nvidia: module license 'NVIDIA' taints kernel.
[    1.937693] Disabling lock debugging due to kernel taint
[    1.952190] nvidia-nvlink: Nvlink Core is being initialized, major device number 245
[    1.952492] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[    2.980116] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  410.66  Wed Oct 10 12:30:32 CDT 2018
[    2.982312] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[    2.982314] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
[    3.004645] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 241

Xorg.0.log:

Event Timeline

sunnyflunk added a subscriber: sunnyflunk. Edited Oct 26 2018, 8:35 PM

Please provide the output of these:

linux-driver-management status

inxi -G

ls -al /usr/lib64/nvidia/modules/

Gaming4LifeDE added a comment. Edited Oct 26 2018, 8:39 PM
$ linux-driver-management status
 ╒ Hardware Platform
 ╞ Platform Vendor : Gigabyte Technology Co., Ltd.
 ╘ Platform Model  : Z97-HD3

Simple GPU configuration

 ╒ Primary GPU
 ╞ Device Name   : GM204 [GeForce GTX 970]
 ╞ Manufacturer  : NVIDIA Corporation
 ╞ Product ID    : 0x13c2
 ╞ Vendor ID     : 0x10de
 ╞ X.Org PCI ID  : PCI:1:0:0
 ╘ Boot VGA      : yes

LDM Providers for GM204 [GeForce GTX 970]: 2
 -  nvidia-glx-driver
 -  nvidia-390-glx-driver
$ inxi -G
Graphics:  Device-1: NVIDIA GM204 [GeForce GTX 970] driver: nvidia v: 410.66 
           Display: x11 server: X.Org 1.20.3 driver: modesetting FAILED: nvidia 
           resolution: 1680x1050~60Hz, 3840x2160~60Hz 
           OpenGL: renderer: llvmpipe (LLVM 6.0 256 bits) v: 3.3 Mesa 18.2.3 
$ ls -al /usr/lib64/nvidia/modules/
total 13696
drwxr-xr-x 2 root root     4096 26. Okt 21:04 .
drwxr-xr-x 3 root root     4096 26. Okt 21:04 ..
-rw-r--r-- 1 root root 14013096 21. Okt 17:49 libglx.so
lrwxrwxrwx 1 root root        9 26. Okt 21:04 libglx.so.1 -> libglx.so
sunnyflunk triaged this task as Normal priority. Oct 26 2018, 9:42 PM
sunnyflunk edited projects, added Hardware; removed Lacks Project.

That is definitely not the latest repo package of the drivers!

sudo eopkg up
sudo eopkg install --reinstall nvidia-glx-driver-common

You'll likely also want to reinstall nvidia-glx-driver-current if using the current kernel, nvidia-glx-driver if using lts kernel and nvidia-glx-driver-32bit if using the 32bit drivers.

eopkg info nvidia-glx-driver-common

Should show release 266 being installed
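
A quick way to spot a stale module directory, sketched under the assumption (consistent with the later listing in this thread) that the 410 series installs its X server GLX module as `libglxserver_nvidia.so` rather than `libglx.so`. The function name `glx_module_layout` is hypothetical.

```shell
# Report which NVIDIA GLX server module naming scheme a directory contains.
# Prints "new" if libglxserver_nvidia.so* is present, "old" if only
# libglx.so* is present, and "none" if neither is found.
# The default directory is the one listed in this thread.
glx_module_layout() {
    dir="${1:-/usr/lib64/nvidia/modules}"
    if ls "$dir"/libglxserver_nvidia.so* >/dev/null 2>&1; then
        echo new
    elif ls "$dir"/libglx.so* >/dev/null 2>&1; then
        echo old
    else
        echo none
    fi
}
```

If this prints `old` while `inxi -G` reports a 410 kernel driver, the userspace package is probably out of date and a reinstall as described above is worth trying.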

Note: while the above did solve it, I'm leaving this open for a little while for better visibility and to see whether this oddity affects more users.

I have pushed a fresh driver version to shannon, which will force a clean update without any delta.

crom5 added a subscriber: crom5. Oct 27 2018, 8:49 AM

I have pushed a fresh driver version to shannon, which will force a clean update without any delta.

After the small update this morning, it seems to me that the situation is still not ok?

$ sudo glxinfo | grep renderer
 
    GLX_MESA_multithread_makecurrent, GLX_MESA_query_renderer, 
    GLX_EXT_visual_rating, GLX_MESA_copy_sub_buffer, GLX_MESA_query_renderer, 
Extended renderer info (GLX_MESA_query_renderer):
OpenGL renderer string: llvmpipe (LLVM 6.0, 256 bits)

$ lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GP104M [GeForce GTX 1070 Mobile] (rev a1)

$ linux-driver-management status
 ╒ Hardware Platform
 ╞ Platform Vendor : ASUSTeK COMPUTER INC.
 ╘ Platform Model  : G752VS

Simple GPU configuration

 ╒ Primary GPU
 ╞ Device Name   : GP104M [GeForce GTX 1070 Mobile]
 ╞ Manufacturer  : NVIDIA Corporation
 ╞ Product ID    : 0x1be1
 ╞ Vendor ID     : 0x10de
 ╞ X.Org PCI ID  : PCI:1:0:0
 ╘ Boot VGA      : yes

LDM Providers for GP104M [GeForce GTX 1070 Mobile]: 2
 -  nvidia-glx-driver
 -  nvidia-390-glx-driver

$ inxi -G
Graphics:
  Device-1: NVIDIA GP104M [GeForce GTX 1070 Mobile] driver: nvidia v: 410.73 
  Display: x11 server: X.Org 1.20.3 driver: modesetting FAILED: nvidia 
  resolution: 1920x1080~75Hz 
  OpenGL: renderer: llvmpipe (LLVM 6.0 256 bits) v: 3.3 Mesa 18.2.3

$ ls -al /usr/lib64/nvidia/modules/
total 14656
drwxr-xr-x 2 root root     4096 Oct 27 10:30 .
drwxr-xr-x 3 root root     4096 Feb  6  2018 ..
lrwxrwxrwx 1 root root       29 Oct 27 10:30 libglxserver_nvidia.so -> libglxserver_nvidia.so.410.73
lrwxrwxrwx 1 root root       29 Oct 27 10:30 libglxserver_nvidia.so.1 -> libglxserver_nvidia.so.410.73
-rwxr-xr-x 1 root root 14997128 Oct 27 02:53 libglxserver_nvidia.so.410.73
In T7098#133924, @crom5 wrote:

I have pushed a fresh driver version to shannon, which will force a clean update without any delta.

After the small update this morning, it seems to me that the situation is still not ok?

Correct, users who are having trouble will likely need to switch to the nvidia-390-glx-driver-current to restore the previous functionality. It seems to be common with the 1070 card in particular.

crom5 added a comment. Oct 27 2018, 9:38 AM

Thank you. Can you please give a step-by-step guide on how to switch back to nvidia-390-glx-driver-current?

sudo eopkg rm nvidia-glx-driver-common
sudo eopkg install nvidia-390-glx-driver-current

If you want the 32-bit drivers:

sudo eopkg install nvidia-390-glx-driver-32bit

Done. I'm not receiving any system message about available updates. Is my system going to be updated automatically to the next 410 driver release, or will I have to intervene manually (and if so, how)?

GTX 1060 here, I have the same issue (resolved with 390).

This seems to be happening due to a race condition, as discussed with @sunnyflunk on IRC.
It seems to happen randomly when rebooting. Sometimes it loads the NVIDIA driver, sometimes it loads llvmpipe.

In T7098#133937, @crom5 wrote:

Done. I'm not receiving any system message about available updates. Is my system going to be updated automatically to the next 410 driver release, or will I have to intervene manually (and if so, how)?

No, it won't; it will remain on the working 390 drivers, which is a good thing. It's unknown how long the branch will impact various cards.

This seems to be happening due to a race condition, as discussed with @sunnyflunk on IRC.
It seems to happen randomly when rebooting. Sometimes it loads the NVIDIA driver, sometimes it loads llvmpipe.

I certainly wasn't convinced of this, but posited it as a concept as it is random whether it boots or not. However, as it almost always fails to work, I'm less convinced.

It worked for me the last time I booted the PC. Is there any way to get more information to be certain about what's going on?
I tried to reboot several times and it seems to be completely random.

The Xorg.0.log showing a different load order of the modules (when it successfully boots) would be a start.
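
One way to compare load order between a good and a bad boot is to pull the module names out of each log. This is a rough sketch, assuming the usual `(II) LoadModule: "name"` lines in the X server log. The log location varies by display manager, so the path in the usage note is an assumption, and `module_load_order` is a made-up name.

```shell
# Read an Xorg log on stdin and print the names of the modules the server
# asked to load, one per line, in the order they appear in the log.
module_load_order() {
    sed -n 's/.*LoadModule: "\([^"]*\)".*/\1/p'
}
```

For example, `module_load_order < /var/log/Xorg.0.log > good.txt` after a working boot and the same into `bad.txt` after a failing one, followed by `diff good.txt bad.txt`, would show whether the module lists or their order differ.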

Here you go:

$ inxi -G
Graphics:
  Device-1: NVIDIA GM204 [GeForce GTX 970] driver: nvidia v: 410.73 
  Display: x11 server: X.Org 1.20.3 driver: nvidia 
  resolution: 1680x1050~60Hz, 3840x2160~60Hz 
  OpenGL: renderer: GeForce GTX 970/PCIe/SSE2 v: 4.6.0 NVIDIA 410.73 

sunnyflunk added a comment. Edited Oct 28 2018, 10:58 AM

New test. Let it boot and load with llvmpipe, then switch to a TTY with Ctrl+Alt+F2. Log in and run one of these, depending on which display manager you have installed (note that this will close your session).

sudo systemctl restart lightdm
sudo systemctl restart gdm
sudo systemctl restart sddm

I'm almost certain this will boot with nvidia after doing so.

I'd also like to know your CPU and hard drive.
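
If it is unclear which of the three display managers is in use, the choice can be automated. This is a small sketch, assuming the unit names match the three services listed above; `active_display_manager` is a hypothetical helper name.

```shell
# Print the first display manager unit that systemd reports as active.
# The candidate list mirrors the three services named above.
# Returns non-zero (and prints nothing) if none of them is active.
active_display_manager() {
    for dm in lightdm gdm sddm; do
        if systemctl is-active --quiet "$dm" 2>/dev/null; then
            echo "$dm"
            return 0
        fi
    done
    return 1
}
```

From the TTY, `sudo systemctl restart "$(active_display_manager)"` would then restart whichever one is running. As with the commands above, this closes the graphical session.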

And you would be correct, it happened exactly as you said.
I have an Intel Core i7 4770, and Solus is booting off a Samsung 950 Pro 512GB NVME SSD. I also have a second SATA SSD plugged in, which is mounted via gvfs at boot.

So this is triggered by hardware, but the common theme will be NVME SSD, I imagine. It looks like it's starting the X server (at 1.9s) before it's brought up the dri driver (which according to the logs happens somewhere just over 2s).
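
To put a number on the second half of that comparison, the kernel timestamp at which nvidia-drm reports itself initialized can be extracted from dmesg. This sketch assumes the message format shown in the dmesg excerpt in the description; the function name `nvidia_drm_ready_time` is made up.

```shell
# Read dmesg output on stdin and print the timestamp (seconds since boot)
# of the first "[drm] Initialized nvidia-drm" line. Prints nothing if no
# such line is present.
nvidia_drm_ready_time() {
    sed -n 's/^\[ *\([0-9][0-9.]*\)] \[drm] Initialized nvidia-drm.*/\1/p' | head -n 1
}
```

Running `dmesg | nvidia_drm_ready_time` (possibly with sudo) and comparing the result with the first timestamps in Xorg.0.log gives a rough idea of whether the X server started before the driver was ready. The two clocks may not line up exactly, so treat small differences with caution.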

What's the output of dmesg?

@Gaming4LifeDE I'd also be very interested in your Xorg.0.log and dmesg output when booting with the 390 drivers as a comparison if you're up for it!

The 390 output will follow in 30 minutes to an hour.

@sunnyflunk Here are your files for 390

So the main difference seems to be that the nvidia driver loads 0.25s later with the 410 drivers. I am assuming you are using lightdm? If so, can you edit /usr/lib64/systemd/system/lightdm.service and change it to this:

[Unit]
Description=Display Manager
Documentation=man:lightdm(1)
Conflicts=getty@tty7.service
After=systemd-user-sessions.service getty@tty7.service plymouth-quit.service systemd-logind.service

[Service]
ExecStart=/usr/sbin/lightdm
Restart=always
IgnoreSIGPIPE=no
BusName=org.freedesktop.DisplayManager

[Install]
Alias=displaymanager.service
WantedBy=graphical.target

Then test on the 410 drivers. It will likely take a couple of boots to confirm any improvement.

@sunnyflunk That didn't work. See the logs here.

New test. Let it boot and load with llvmpipe, then switch to a TTY with Ctrl+Alt+F2. Log in and run one of these, depending on which display manager you have installed (note that this will close your session).
sudo systemctl restart lightdm
sudo systemctl restart gdm
sudo systemctl restart sddm
I'm almost certain this will boot with nvidia after doing so.
Also want to know CPU and Hard drive.

Did this with lightdm and 410.73 loaded correctly. Using Solus Budgie.

inxi -GCD:

CPU:       Topology: Quad Core model: Intel Core i5-3570K bits: 64 type: MCP L2 cache: 6144 KiB
           Speed: 1602 MHz min/max: 1600/3800 MHz Core speeds (MHz): 1: 1602 2: 1602 3: 1602 4: 1602
Graphics:  Device-1: NVIDIA GK110 [GeForce GTX 780] driver: nvidia v: 410.73
           Display: x11 server: X.Org 1.20.3 driver: nvidia resolution: 1920x1080~60Hz, 1920x1080~60Hz
           OpenGL: renderer: GeForce GTX 780/PCIe/SSE2 v: 4.6.0 NVIDIA 410.73
Drives:    Local Storage: total: 1.25 TiB used: 11.00 GiB (0.9%)
           ID-1: /dev/sda vendor: Samsung model: SSD 860 EVO 250GB size: 232.89 GiB
           ID-2: /dev/sdb vendor: Samsung model: SSD 840 PRO Series size: 119.24 GiB - Solus is here
           ID-3: /dev/sdc vendor: Samsung model: HD103SJ size: 931.51 GiB

@aalhitennf Keep in mind that NVME SSDs boot way faster, which might contribute to the problem.

aalhitennf added a comment. Edited Oct 28 2018, 1:36 PM

@aalhitennf Keep in mind that NVME SSDs boot way faster, which might contribute to the problem.

I know, but I need that space for my Windows games.

So this is triggered by hardware, but the common theme will be NVME SSD, I imagine. It looks like it's starting the X server (at 1.9s) before it's brought up the dri driver (which according to the logs happens somewhere just over 2s).
What's the output of dmesg

I'm running with the following from inxi -GCD (on 390 now but I had the issue with 410):

CPU:       Topology: Quad Core model: Intel Core i7-6700 bits: 64 type: MT MCP L2 cache: 8192 KiB
           Speed: 3762 MHz min/max: 800/4000 MHz Core speeds (MHz): 1: 3757 2: 3796 3: 3797 4: 3759 5: 3767
           6: 3790 7: 3730 8: 3765
Graphics:  Device-1: NVIDIA GM107 [GeForce GTX 745] driver: nvidia v: 390.87
           Display: x11 server: X.Org 1.20.3 driver: nvidia resolution: 1920x1080~60Hz
           OpenGL: renderer: GeForce GTX 745/PCIe/SSE2 v: 4.6.0 NVIDIA 390.87
Drives:    Local Storage: total: 1.48 TiB used: 894.74 GiB (59.0%)
           ID-1: /dev/sda vendor: SanDisk model: SD7SB3Q-128G-1006 size: 119.24 GiB
           ID-2: /dev/sdb vendor: Western Digital model: WD10EZEX-60M2NA0 size: 931.51 GiB

Just saying this to make it clear that this is not exclusive to NVME

Please try adding:

Section	"Files"
    ModulePath "/usr/lib64/nvidia/modules"
EndSection

To the start of /usr/share/X11/xorg.conf.d/10-nvidia.conf

Please try adding:

Section	"Files"
    ModulePath "/usr/lib64/nvidia/modules"
EndSection

To the start of /usr/share/X11/xorg.conf.d/10-nvidia.conf

Yep, this worked for me. Thank you!

Works for me too.
Looks like that's a good fix, or are there any problems with it?

Apparently it also needs to include the normal module path or it will mess with things, but this at least gets around the race condition. I'll submit a full fix to all of our drivers later today.
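
A sketch of what a section with both paths might look like, generated from a small shell helper so the directories are easy to adjust. The stock directory `/usr/lib64/xorg/modules` is an assumption about the default X.Org module location on this system, not something confirmed in this thread, and `print_files_section` is a hypothetical name. X.Org concatenates multiple `ModulePath` entries in order, so the NVIDIA directory is listed first.

```shell
# Print an Xorg "Files" section that lists the NVIDIA module directory
# first, followed by the stock module directory.
#   $1: NVIDIA module directory (default taken from this thread)
#   $2: stock Xorg module directory (ASSUMED default, verify locally)
print_files_section() {
    nvidia_dir="${1:-/usr/lib64/nvidia/modules}"
    stock_dir="${2:-/usr/lib64/xorg/modules}"
    printf 'Section "Files"\n'
    printf '    ModulePath "%s"\n' "$nvidia_dir"
    printf '    ModulePath "%s"\n' "$stock_dir"
    printf 'EndSection\n'
}
```

The output can be reviewed and then placed at the start of `/usr/share/X11/xorg.conf.d/10-nvidia.conf` by hand. The packaged fix mentioned above may differ from this.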

Is there a fix now? Also, bump in case it was forgotten.

That's just for the beta driver though, not for the main driver. The fix would be the same but the patch is just applied to beta.

Those changes have been made to all the drivers

In T7098#133937, @crom5 wrote:

Done. I'm not receiving any system message about available updates. Is my system going to be updated automatically to the next 410 driver release, or will I have to intervene manually (and if so, how)?

No, it won't; it will remain on the working 390 drivers, which is a good thing. It's unknown how long the branch will impact various cards.

Is the problem fixed now? If yes, how can I return to the 410 driver?

@crom5 The nvidia-glx-driver package is using 410. You should be able to install this driver via DoFlicky (if you haven't done so already).

Thank you for your reply. This is the initial description of the problem: "since the update of nvidia-glx-driver to 410.X my system is using llvmpipe instead of the Nvidia driver". Could you please clarify whether this problem is fixed now?
If so, is this the correct procedure to return to the 410 driver?

sudo eopkg rm nvidia-glx-driver-common
sudo eopkg install nvidia-410-glx-driver-current
sudo eopkg install nvidia-410-glx-driver-32bit
DataDrake closed this task as Resolved. Dec 27 2018, 8:26 PM
DataDrake claimed this task.

AFAIK this is resolved now.