OpenCL AMD Driver
Closed, Wontfix (Public)

Description

Please add the OpenCL userspace driver provided in the amdgpu-pro driver stack. It lets the GPU be used fully for OpenCL. The current xorg amdgpu driver does not provide this, and the ocl-icd package does not work correctly with a lot of software. I'm pretty sure the driver is open source, even though it comes from the amdgpu-pro drivers.

https://aur.archlinux.org/packages/opencl-amd/

Event Timeline

DataDrake claimed this task.
DataDrake added a subscriber: DataDrake.

We cannot legally redistribute the amdgpu-pro driver, per section 3.2 of its EULA:

3. RESTRICTIONS
Except for the limited license expressly granted in Section 2 herein, You have no other rights in the Software, whether express, implied, arising by estoppel or otherwise. Further restrictions regarding Your use of the Software are set forth below. You may not:
  1. modify or create derivative works of the Software;
  2. distribute, assign or otherwise transfer the Software;
  3. decompile, reverse engineer, disassemble or otherwise reduce the Software to a human-perceivable form (except as allowed by applicable law);
  4. alter or remove any copyright, trademark or patent notice(s) in the Software; or
  5. use the Software to: (i) develop inventions directly derived from confidential information to seek patent protection; (ii) assist in the analysis of Your patents and patent applications; or (iii) modify existing patents.;
  6. use, modify and/or distribute any of the Software so that any part becomes subject to a Free Software License.

Source: https://support.amd.com/en-us/download/gpu-pro-eula

AMD OpenCL support is provided via the ROCm platform and AMDKFD.

Ok. How do I install ROCm or AMDKFD, given that neither is supported on Solus? Is there a way?

AMDKFD is already available in our linux-current kernel package as of the 4.17.2 update. I will personally be working on ROCm as soon as work quiets down in a week or two.
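For anyone who wants to confirm this on their own machine, here is a minimal sketch. The function names are made up for illustration, and it assumes the kernel exposes the compute device as /dev/kfd and that `sort -V` is available.

```shell
#!/bin/sh
# Returns 0 if version $1 is >= version $2 (dotted numeric versions only).
version_at_least() {
    # sort -V orders version strings; if the minimum sorts first, $1 >= $2
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Strip any local suffix (everything from the first non-numeric,
# non-dot character) from the `uname -r` output.
kernel_version() {
    uname -r | sed 's/[^0-9.].*$//'
}

# Report whether the kernel is new enough and the KFD device node exists.
check_kfd() {
    if version_at_least "$(kernel_version)" "4.17.2" && [ -e /dev/kfd ]; then
        echo "AMDKFD looks available"
    else
        echo "AMDKFD not detected"
    fi
}
```

Source the file and run `check_kfd`. A negative result does not necessarily mean the kernel lacks AMDKFD; the device node may simply be absent because no supported GPU is present.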

From everything I've read in the docs, it is known to work with the Fiji (R9 Fury) and Polaris (400/500 series) GPUs, with Vega10 support coming in kernel 4.18. Older Tonga (2X5) and Hawaii (300 series) cards may be supported, but are listed as experimental. Anything older than that is not supported at all. I believe that to also be the case with the amdgpu-pro drivers.

Hey, did you ever add ROCm? I don't mean to bother you.

Sync got delayed, so I didn't want to land those changes and push it farther back.

EDIT: I will comment here when I finish.

Hi everyone. I do not know how much work you have already put into getting ROCm to run on Solus. I did some tests recently and I want to share my experiences.

  1. The numactl libraries need to be installed: git clone https://github.com/numactl/numactl.git, then run sh autogen.sh, ./configure, make, sudo make install. For some reason, the libraries are not found afterwards, even after writing the path to /etc/ld.so.conf.d/roc.conf and running ldconfig. So I created symbolic links:
    • sudo ln -s /usr/local/lib/libnuma.so /usr/lib/libnuma.so
    • sudo ln -s /usr/local/lib/libnuma.so /usr/lib/libnuma.so.1
    • sudo ln -s /usr/local/lib/libnuma.so /usr/lib/libnuma.so.1.0.0
  2. ROC-Thunk-Interface: For the mainline kernel (>= 4.17) a special branch (fxkamd/drm-next-wip) needs to be used. Details: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/37
  3. ROC-Runtime:
    • Install libelf-devel from the Solus repository.
    • git clone --branch "roc-1.8.x" https://github.com/RadeonOpenCompute/ROCR-Runtime.git Change into the src folder and follow the instructions in the GitHub README file. I only changed CMAKE_PREFIX_PATH. Compile as described and add the path to the libraries (libhsa-runtime64.so) to /etc/ld.so.conf.d/roc.conf
  4. Before testing sample program which comes with ROC-Runtime, get additional libraries:
  5. vector_copy sample program which comes with ROC-Runtime: Manually add the include and library paths for libhsa-runtime64.so to the Makefile. When running vector_copy, some info is printed to the console and then the program crashes with a segmentation fault.
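The library path registration from steps 1 and 3 can be sketched as a small helper. `add_lib_dir` is a hypothetical name, not something from the ROCm or numactl sources, and the sketch does not explain why the libraries were not picked up in the first place. It only makes the registration repeatable and shows a way to check the linker cache.

```shell
#!/bin/sh
# Hypothetical helper: add a directory to an ld.so.conf.d file once,
# so repeated runs do not duplicate the entry.
add_lib_dir() {
    conf_file=$1
    lib_dir=$2
    touch "$conf_file"
    # -x matches whole lines, -F treats the path as a fixed string
    grep -qxF -- "$lib_dir" "$conf_file" || printf '%s\n' "$lib_dir" >> "$conf_file"
}

# Intended use, run as root (paths are the ones mentioned in the steps):
#   add_lib_dir /etc/ld.so.conf.d/roc.conf /usr/local/lib
#   ldconfig
#   ldconfig -p | grep libnuma    # check whether the cache now lists libnuma
```

If `ldconfig -p` lists libnuma but the build still fails to find it, the problem is more likely the build system's search path than the runtime linker.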

I will now post a strace log on the ROCm github forum, maybe they can help.
I hope you had more success on this (-:

I was able to get 1-3 mostly working this morning when I set out to work on it. I spent part of the afternoon trying to get HCC in good standing, but it isn't there yet. I didn't encounter most of the issues you did (ypkg cuts out a lot of hassle), but I ran into others when it came to getting paths right.

I have not experienced any of the issues mentioned in the github task you linked. Best guess is that more of the rocm code made it into AMDKFD in the current releases of the 4.17 series kernels. I haven't tried to build the opencl libs yet, so maybe I'll notice something then. I did manage to get output from roc-smi however, even though I have a Vega FE card.

Needless to say, I'm working on it, but we may be waiting until 4.18 to have the full stack sorted in Solus.

@DataDrake Thank you very much for your response, that is fantastic news!
By the way, if you need some testing, maybe I can help you. I have a RX580 which should be fully supported with Kernel 4.17.

hey chief did you get that OpenCL working yet? I don't see why you can't just add OpenCL itself. Gentoo and Arch did it, look

@owen.geer Like I've already said, we can't redistribute amdgpu-pro. Arch only has it in the AUR (with questionable legality) and Gentoo forces you to download it yourself before it will build a package, which only partly gets around the EULA terms.

Don't ask me again. amdgpu-pro will remain ineligible for inclusion so long as their EULA restricts/prohibits redistribution.

I'm still working on the ROCm stack for AMD support. Currently, HCC doesn't build for us, so it is impossible to build the rocm-opencl-driver. In the meantime, OpenCL should work fine for Nvidia and Intel.
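To see which OpenCL vendors are currently registered with the ICD loader, something like the sketch below may help. `list_icds` is a made-up name, and the default directory is the conventional ICD loader location; it is an assumption that Solus uses the same path.

```shell
#!/bin/sh
# List OpenCL ICD registrations in a vendors directory.
# The default path is an assumption; pass another directory as $1 if needed.
list_icds() {
    vendors_dir=${1:-/etc/OpenCL/vendors}
    for icd in "$vendors_dir"/*.icd; do
        # Skip the unexpanded glob when no .icd files exist
        [ -e "$icd" ] || continue
        # Each .icd file holds the name or path of the vendor library
        printf '%s -> %s\n' "$(basename "$icd")" "$(head -n 1 "$icd")"
    done
}
```

No output means no vendor is registered in that directory. If it is installed, `clinfo` gives a fuller picture of the platforms and devices that actually load.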

Should we expect OpenCL to work on AMD cards in a week or two? Linux 4.18 has received its third point release.

EDIT: my message looks like I'm putting pressure on you. I'm not. What I mean is, IIRC you said you were waiting for 4.18 for better ROCm support.

4.18 is no longer the problem. Waiting for ROCm to actually be buildable.

@DataDrake: I guess you had a lot of work moving the infrastructure, and you did great work on that. I just want to know if there is any news on the AMD ROCm topic. Last Friday AMD released ROCm 1.9, the version that should fully support kernels from version 4.17 onwards.

For some progress on AMD ROCm see my comment here: ROCm tensorflow

No news on ROCm yet. Last I tried, I couldn't get HCC to compile, which is essential for the opencl driver.

There are 2 great projects that could help in a big way with building it on Solus: packages for Fedora and rocm-arch.
So far I have spent a few hours basing my work on the Fedora package, and managed to build a small number of the necessary packages. I have encountered problems with rocm-compilersupport that seem to be related to our LLVM; I will try to work on them when I have time.
Just wanted to point out it might be a good chance to try with rocm once again.
Here is my humble work: rocm-Solus.
Edit: I am getting an error:

[ 69%] Building CXX object CMakeFiles/amd_comgr.dir/src/comgr-symbolizer.cpp.o
/home/build/YPKG/root/rocm-compilersupport/build/ROCm-CompilerSupport-rocm-5.1.0/lib/comgr/src/comgr-compiler.cpp:484:16: error: no member named 'initSections' in 'llvm::MCStreamer'
    Str.get()->initSections(Opts.NoExecStack, *STI);
    ~~~~~~~~~  ^
/home/build/YPKG/root/rocm-compilersupport/build/ROCm-CompilerSupport-rocm-5.1.0/lib/comgr/src/comgr-compiler.cpp:608:60: error: too many arguments to function call, expected 4, have 5
  bool LLDRet = lld::elf::link(ArgRefs, LogS, LogE, false, false);
                ~~~~~~~~~~~~~~                             ^~~~~
/usr/include/lld/Common/Driver.h:41:6: note: 'link' declared here
bool link(llvm::ArrayRef<const char *> args, bool canExitEarly,
     ^
/home/build/YPKG/root/rocm-compilersupport/build/ROCm-CompilerSupport-rocm-5.1.0/lib/comgr/src/comgr-compiler.cpp:609:8: error: no member named 'CommonLinkerContext' in namespace 'lld'
  lld::CommonLinkerContext::destroy();
  ~~~~~^

Arch needs to build rocm-llvm, Fedora does not, so I really hope I can get by with the normal llvm and without rocm-llvm.

Should it be packaged in /opt/rocm as the official and Arch packages do, or in system directories like Fedora does? I tried going for the second approach, but this means getting rid of the bundled ICD and moving headers so they do not conflict with other packages. And I am wondering if any of that will cause headaches in the future.
In my opinion all the packages are mostly done and ready to be reviewed, with the exception of rocm-opencl which I need help with testing.

The packaging definitely needs some cleanup, but I think it should be usable at this point. If anyone could test or provide any feedback, it would be really appreciated. [[ https://github.com/JacekJagosz/rocm-Solus/releases/tag/0.1 | Here are the .eopkgs so you don't need to compile ]], especially as the last 2 packages need a patched ypkg.

So after a lot of research it turns out that using the system's LLVM reduces GPU compatibility. My personal APU is not yet supported in LLVM 13, which is the version we are using. It also didn't work on @davidjharder's rx 480, but that is normal, as Polaris has only partial support.
So the GPUs supported by rocm are desktop Vegas and RDNA 1-2s. Could anyone with such a GPU test it? Otherwise I think the packages are fully functional. Maybe rocm-opencl can be split into more packages, but apart from that it should be done.
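For testers unsure which group their card falls into, here is a rough lookup keyed on the GPU's gfx target name. The mapping of gfx IDs to families is my own assumption and not an official support matrix, the list is not exhaustive, and `rocm_support` is a made-up name. The statuses simply mirror what is described above.

```shell
#!/bin/sh
# Map a gfx target name to the support status described in this thread.
# The gfx-to-family grouping is an assumption and deliberately incomplete.
rocm_support() {
    case "$1" in
        gfx900|gfx906)            echo "supported (Vega)" ;;
        gfx1010|gfx1011|gfx1012)  echo "supported (RDNA 1)" ;;
        gfx1030|gfx1031|gfx1032)  echo "supported (RDNA 2)" ;;
        gfx803)                   echo "partial (Polaris/Fiji)" ;;
        *)                        echo "unknown" ;;
    esac
}

# Example, assuming rocminfo is installed and prints the gfx name:
#   rocm_support "$(rocminfo | grep -o -m 1 'gfx[0-9a-f]*')"
```

APUs and newer cards land in "unknown" here, which only means the sketch has no entry for them.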

I don't mind using the bundled LLVM if it helps with support and can be easily switched to.

In my opinion the build is now done. It should already be working with Vega, and is only waiting for LLVM 14 for full GPU support. I have tested it with the new LLVM and it builds fine, so it should be ready.
Edit: the ypkg and solbuild pull requests allowing sources to be renamed need to be accepted too.

I have updated all of it to 5.1.3, which is the newest version we can use with LLVM 14; 5.2 doesn't build with it and will probably require LLVM 15.
After the changes I made to rocm-hip, and after finally managing to build a package depending on it, I am confident in the whole stack. I hope it can land once LLVM gets updated to 14 and solbuild gets updated to the latest git.
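A quick way to check whether a given system LLVM matches what this stack was built against is sketched below. The helper names are made up, and the rule encoded is only what has been observed in this thread: ROCm 5.1.x builds against LLVM 14. Other LLVM versions are treated as unconfirmed rather than known to fail.

```shell
#!/bin/sh
# Extract the major version from a string such as "14.0.6".
llvm_major() {
    printf '%s\n' "$1" | cut -d. -f1
}

# Returns 0 only for LLVM 14, the version ROCm 5.1.x was built with here.
llvm_ok_for_rocm_5_1() {
    [ "$(llvm_major "$1")" -eq 14 ]
}

# Example, assuming llvm-config is on PATH:
#   llvm_ok_for_rocm_5_1 "$(llvm-config --version)" && echo "LLVM 14 found"
```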