
Poor performance after kernel 4.8.x update
Closed, Resolved, Public

Description

I noticed a severe drop in performance after updating to kernel 4.8.1 a couple of weeks ago. I didn't have the time to investigate and so rolled back to 4.7.7. I thought I'd retry with 4.8.4. Same issue unfortunately. /proc/cpuinfo never reports CPU frequency above 800 MHz (except within the first couple minutes of use). I'm running on an i7-4650U with intel_iommu=off (due to boot issues).
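A minimal sketch for checking what the kernel reports, assuming the x86 `/proc/cpuinfo` layout with one `cpu MHz` line per logical CPU. The names `parse_cpu_mhz` and `max_reported_mhz` are hypothetical helpers for illustration, not part of any tool mentioned in this task.

```python
import re

# Matches lines such as "cpu MHz\t\t: 798.046" (x86 /proc/cpuinfo layout).
CPU_MHZ_RE = re.compile(r"^cpu MHz\s*:\s*([0-9]+(?:\.[0-9]+)?)\s*$", re.MULTILINE)


def parse_cpu_mhz(cpuinfo_text):
    """Return the per-CPU frequencies (in MHz) found in /proc/cpuinfo text."""
    return [float(value) for value in CPU_MHZ_RE.findall(cpuinfo_text)]


def max_reported_mhz(path="/proc/cpuinfo"):
    """Read the file once and return the highest reported frequency, or None."""
    with open(path) as f:
        freqs = parse_cpu_mhz(f.read())
    return max(freqs) if freqs else None
```

Calling `max_reported_mhz()` every second or so while the machine is under load should show whether the reported frequency ever rises above the 800 MHz floor.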

Event Timeline

veiyae created this task. Oct 23 2016, 10:18 PM
veiyae updated the task description.

What boot issues? IOMMU for graphics is off by default; "igfx_on" switches it on. Have you tried that?

No, that doesn't help. I don't think it has to do with the GPU. Without intel_iommu=off I see a lot of:

DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Write] Request device [04:00.1] fault addr fffe0000 [fault reason 02] Present bit in context entry is clear

in and amongst output related to ata1. Pretty sure it has something to do with the SATA controller.
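To see which device the faults point at, the lines can be parsed mechanically. This is a rough sketch that assumes the exact message format quoted above (other kernel versions may word it differently); `parse_dmar_fault` and `faults_per_device` are hypothetical helper names.

```python
import re
from collections import Counter

# Pattern for the fault line format quoted in this comment only.
DMAR_FAULT_RE = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\] "
    r"Request device \[(?P<device>[0-9a-fA-F:.]+)\] "
    r"fault addr (?P<addr>[0-9a-fA-F]+) "
    r"\[fault reason (?P<reason>\d+)\] (?P<text>.*)"
)


def parse_dmar_fault(line):
    """Return the fields of one DMAR fault line as a dict, or None if no match."""
    match = DMAR_FAULT_RE.search(line)
    return match.groupdict() if match else None


def faults_per_device(log_text):
    """Count DMAR faults per PCI address in a block of kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        fault = parse_dmar_fault(line)
        if fault:
            counts[fault["device"]] += 1
    return dict(counts)
```

The PCI address that comes out (04:00.1 here) can then be looked up with `lspci -s 04:00.1` to confirm which controller it belongs to.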

It seems to be throttling the CPU (and holding it in this state) after ~1 minute of use under moderate load. Before that performance is as expected. It's really very odd since I was running 4.7.7 without any issues.

I've noticed more throttling lately on my i7-6700HQ. I assumed it was dust or something, but now that I think about it, it has only started happening more often recently.

veiyae added a comment. Edited Oct 23 2016, 11:34 PM

I've just tried intel_pstate=disable with the hope that the ACPI CPUfreq would do better, but I'm having the same issue. It doesn't seem to be the intel_pstate driver.

Food for thought: try removing tlp and thermald and see if you get any improvement.

That did it! After:

  • tlp 0.9-8-1-x86_64 is removed.
  • solus-hardware-config 6-9-1-x86_64 is removed.
  • thermald 1.5.3-2-1-x86_64 is removed.

I have a functioning system again. I'm still not too sure what the issue was, however, and I feel that removing thermald might be questionable. Thoughts?

So now you need to try with each package to verify who is the evil bastid doing this.

Reboot between each step:

  • Install only tlp, reboot, check performance.
  • Remove tlp, install thermald, reboot, check performance.
  • Install solus-hardware-config (which brings in thermald) and tlp as well, reboot, check performance.

My money right now is on tlp, but we'll see. If you discover which one it is before doing the other steps, just stop there and let me know.

JoshStrobl edited projects, added Software; removed Triage Team. Oct 24 2016, 8:20 AM

It's thermald, i.e. the issue occurs with only thermald installed. So is thermald reporting inaccurate temperature readings to P-state?

Possible. I'll remove thermald as a hard dep of solus-hardware-config

DataDrake triaged this task as High priority. Oct 31 2016, 8:01 PM
DataDrake moved this task from Backlog to Package Fixes on the Software board.
Espionage724 added a comment. Edited Nov 14 2016, 1:59 AM

Could this be related to "BD PROCHOT"? From what I've heard, it seems that is the cause of unnecessary throttling in some computers.

On Windows, disabling the feature with ThrottleStop is the usual recommendation for this sort of problem it seems, but I don't know of anything for Linux exactly.

As for the thermald package; masking the systemd service could work too right? At boot after the encryption passphrase is entered, I see a brief mention of CPU throttling and then returning to normal speed even after masking thermald. However, from within the OS, I don't seem to see any throttling messages in dmesg (Guild Wars 2 in Wine would trigger it pretty quickly in the past).

That's an interesting idea. Apparently one can use msr-tools (specifically wrmsr on address 0x1FC) to disable throttling caused by BD PROCHOT. From what I understand, if 'sudo rdmsr --decimal 0x1FC' reports an odd integer, then bit 0 is set, which would mean BD PROCHOT is enabled rather than that the CPU is throttled at that moment (unsure about this). Subtracting 1 from this value and writing it to 0x1FC (i.e. 'sudo wrmsr 0x1FC <value-1>') has been reported to work. ThrottleStop does the same thing (modifies MSR 0x1FC bit[0]). Any thoughts? Worth a go? I'm still running without thermald tbh. I don't do anything intensive on this machine, so heat isn't a huge concern.
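For what it's worth, here is the arithmetic as a small sketch. It only computes values and does not touch any MSR. It assumes, as stated above, that bit 0 of 0x1FC is the BD PROCHOT bit; the function names are made up for illustration.

```python
# Assumption from the comment above: bit 0 of MSR 0x1FC controls BD PROCHOT.
BD_PROCHOT_BIT = 1 << 0


def bd_prochot_enabled(msr_value):
    """True if bit 0 is set, i.e. the value read from the MSR is odd."""
    return bool(msr_value & BD_PROCHOT_BIT)


def clear_bd_prochot(msr_value):
    """Return the value with bit 0 cleared and all other bits unchanged."""
    return msr_value & ~BD_PROCHOT_BIT


def new_value_from_rdmsr(decimal_output):
    """Parse the text printed by 'rdmsr --decimal' and clear bit 0."""
    return clear_bd_prochot(int(decimal_output.strip()))
```

Masking the bit is equivalent to subtracting 1 only when the value is odd. If the value is already even, subtracting 1 would flip other bits in the register, whereas the mask leaves it alone.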

ikey closed this task as Resolved. Feb 4 2017, 11:46 PM
ikey claimed this task.

OK, so TL;DR: if you have any issues with this, please just remove or mask thermald. Problem solved.