
[RFC] Enable PGO for Clang
Needs Review, Public

Authored by joebonrichie on Sat, Dec 1, 12:56 PM.

Details

Reviewers
sunnyflunk
Group Reviewers
Triage Team
Summary

Revert to the old way of building llvm and enable PGO, as follows:

  • Build a minimal stage1 compiler against system clang
  • Build a stage2 instrumented compiler with LLVM_BUILD_INSTRUMENTED using the stage1 compiler
  • Run check-clang with the stage2 instrumented compiler to generate profile files, then merge them into one file with llvm-profdata
  • Build a stage2 profiled compiler using the stage1 compiler, passing our profile data with LLVM_PROFDATA_FILE
  • Build the 32bit variant using our stage2 profiled compiler, enabling LLVM_BUILD_32_BITS instead of setting up 32bit manually
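
As a rough illustration of how the stages above chain together, here is a minimal Python sketch that only assembles the cmake configure commands and runs nothing. The directory layout, the helper names (`cmake_cmd`, `pgo_plan`) and the `Frontend` value for LLVM_BUILD_INSTRUMENTED are assumptions made for this sketch, not what the package build actually uses; only the LLVM_* option names come from the summary. The check-clang run and the llvm-profdata merge sit between the instrumented and profiled stages and are not modelled here.

```python
from pathlib import Path


def cmake_cmd(source_dir, build_dir, cc, cxx, **defines):
    """Return a cmake configure command as an argument list; nothing is executed."""
    cmd = [
        "cmake", "-G", "Ninja",
        "-S", str(source_dir), "-B", str(build_dir),  # -S/-B need CMake >= 3.13
        "-DCMAKE_BUILD_TYPE=Release",
        f"-DCMAKE_C_COMPILER={cc}",
        f"-DCMAKE_CXX_COMPILER={cxx}",
    ]
    cmd += [f"-D{name}={value}" for name, value in defines.items()]
    return cmd


def pgo_plan(source_dir, work_dir, mode="Frontend"):
    """Ordered (stage name, configure command) pairs for the staged PGO build."""
    work = Path(work_dir)
    # Hypothetical directory layout, chosen for this sketch only.
    stage1 = work / "stage1"
    profiled = work / "stage2-profiled"
    profdata = work / "clang.profdata"

    def compilers(stage):
        # A finished LLVM build tree keeps its binaries under bin/.
        return str(stage / "bin" / "clang"), str(stage / "bin" / "clang++")

    return [
        # Stage 1 is built with whatever clang is on PATH (the system clang).
        ("stage1", cmake_cmd(source_dir, stage1, "clang", "clang++")),
        ("stage2-instrumented",
         cmake_cmd(source_dir, work / "stage2-instrumented", *compilers(stage1),
                   LLVM_BUILD_INSTRUMENTED=mode)),
        ("stage2-profiled",
         cmake_cmd(source_dir, profiled, *compilers(stage1),
                   LLVM_PROFDATA_FILE=profdata)),
        ("stage2-32bit",
         cmake_cmd(source_dir, work / "stage2-32bit", *compilers(profiled),
                   LLVM_BUILD_32_BITS="ON")),
    ]


if __name__ == "__main__":
    for name, cmd in pgo_plan("llvm", "build"):
        print(f"{name}: {' '.join(cmd)}")
```

Note that both stage2 builds are configured with the stage1 compiler; only the 32bit build uses the profiled one.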

Quick n' Dirty Benchmarks (I'm really tired of compiling things)
zstd time to compile
PGO
real 0m24.707s
user 0m48.666s
sys 0m0.766s
default
real 0m36.127s
user 1m25.298s
sys 0m1.456s

libwebkit2gtk time to compile
PGO
real 36m25.341s
user 119m30.699s
sys 8m7.484s
default
real 40m47.430s
user 138m5.131s
sys 8m48.744s
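
To put the timings above in relative terms, the following Python sketch converts the `real` values into seconds and computes the wall-clock saving of the PGO compiler over the default one. The helper names are made up for this example; the numbers are copied from the benchmark output above. By this measure the saving is roughly 32% for zstd and roughly 11% for libwebkit2gtk.

```python
import re


def parse_time(text):
    """Convert a shell `time` value such as '36m25.341s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def saving(pgo, default):
    """Fraction of wall-clock time saved by the PGO build relative to default."""
    return 1 - parse_time(pgo) / parse_time(default)


# `real` times copied from the benchmarks above.
REAL_TIMES = {
    "zstd": {"PGO": "0m24.707s", "default": "0m36.127s"},
    "libwebkit2gtk": {"PGO": "36m25.341s", "default": "40m47.430s"},
}

if __name__ == "__main__":
    for package, times in REAL_TIMES.items():
        print(f"{package}: {saving(times['PGO'], times['default']):.1%} less wall-clock time")
```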

The 32bit build now links against 32bit ncurses and 32bit libedit, and requires 32bit
libstdc to be added to builddeps, which it didn't before.

Signed-off-by: Joey Riches <josephriches@gmail.com>

Test Plan

Compiled zstd and webkit against this

Diff Detail

Repository
R1972 llvm
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
joebonrichie created this revision. Sat, Dec 1, 12:56 PM
joebonrichie requested review of this revision. Sat, Dec 1, 12:56 PM
joebonrichie updated this revision to Diff 11063.
joebonrichie retitled this revision from [RFC] Enable PGO for Clang Revert to the old way of building llvm and enable PGO, as follows: - Build a minimal stage1 compiler against system clang - Build a stage2 instrumented compiler with `LLVM_BUILD_INSTRUMENTED` using stage1 compiler - Run... to [RFC] Enable PGO for Clang.
joebonrichie edited the summary of this revision.

Fixup summary

sunnyflunk requested changes to this revision. Thu, Dec 6, 12:01 AM

I have no idea why the LLVM_7 and openmp version symbols are disappearing.

This seems to be an essential symbol, and my scan showed it is used by every package linking to llvm. I tried a PGO build with emul32, and it seemed rather unsuccessful without the LLVM_INSTRUMENTED flag set.
Requesting changes, as I don't believe this can land without those symbols.

This revision now requires changes to proceed. Thu, Dec 6, 12:01 AM

I'll do a build without 32bit support first to see if the symbols reappear. I was hoping to avoid having to use emul32. Additionally, there is new upstream documentation and a script for building clang with PGO, which give a few pointers.

https://github.com/earl/llvm-mirror/commits/master/utils/collect_and_build_with_pgo.py

I was wondering how well the compiler tests cover the range of languages and compiler flags. It looked like GCC was profiled only while building itself, which may not improve LTO or C++ as much as it could.

joebonrichie updated this revision to Diff 11187. (Edited) Thu, Dec 6, 7:15 PM
joebonrichie edited the summary of this revision.

Implement PGO hints and improvements from
http://llvm.org/docs/HowToBuildWithPGO.html
https://github.com/llvm-mirror/llvm/blob/master/utils/collect_and_build_with_pgo.py

Detailed Changes:

  • Only build clang, llvm-profdata and profile for a very minimal stage1 compiler
  • Use IR profiling, since it apparently gives slightly better results (see here)
  • Set LLVM_BUILD_RUNTIME=NO to stop libcxx from being built during stage2-instrumentation, which speeds up the build (also prevents a bug if LDFLAGS is set)
  • Reenable clang for the 32bit build to see if the disappearing LLVM-$version symbol reappears (it doesn't)
  • No longer pass cmake args which are already the default, to reduce LOC
  • Call check-llvm in addition to check-clang to improve coverage
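
The merge step that turns the raw profiles from check-clang and check-llvm into the single file handed to LLVM_PROFDATA_FILE could look roughly like the sketch below. It only builds the `llvm-profdata merge` command line. The helper name, the `*.profraw` glob and the idea that all raw profiles sit in one directory are assumptions for illustration; the actual location depends on how the instrumented build is configured.

```python
from pathlib import Path


def profdata_merge_cmd(profile_dir, output, llvm_profdata="llvm-profdata"):
    """Build an `llvm-profdata merge` command for every raw profile in a directory.

    `profile_dir` and the *.profraw pattern are assumptions of this sketch.
    """
    raw_profiles = sorted(str(path) for path in Path(profile_dir).glob("*.profraw"))
    if not raw_profiles:
        raise FileNotFoundError(f"no .profraw files found in {profile_dir}")
    return [llvm_profdata, "merge", f"-output={output}", *raw_profiles]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; pointing `llvm_profdata` at the binary from the stage1 build keeps the tool version in line with the compiler that produced the profiles.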

TODO:

  • See if 32bit clang can be disabled again
  • See what the code coverage of the profiling is like and improve it by running benchmarks or possibly compiling clang again as a dummy stage3 step. Want about 40-50% coverage of x86 clang and at least about 20% overall coverage (very rough numbers).
  • We can now look at enabling LTO in the stage2-profiled step (LLVM_ENABLE_LTO).

OTHER:

  • libcxx and libcxxabi are purposely not being targeted for profiling for now, until we enable LLVM_ENABLE_LIBCXX