Difference between revisions of "LLVM"

Revision as of 15:24, 28 November 2020

The current status of Low Level Virtual Machine (LLVM) support is ready for general testing.

  Note: The information in this section has been updated as of October 2020 and might be outdated by the time you read it

Progress

FPC with an LLVM code generator backend is available in svn trunk. It currently supports the following targets:

Darwin/x86-64
Darwin/AArch64 (macOS, untested on iOS)
Linux/x86-64
Linux/AArch64
Linux/ARMHF

Usage

Install Clang

You can use a version included in an LLVM release available from the official LLVM site, or use a version that comes with Xcode (macOS) or your Linux distribution.

FPC can generate LLVM code that can be compiled with LLVM versions 7.0 up to at least 11.0.

Build FPC with LLVM support

Build FPC as usual, but add LLVM=1 to the make command line. In addition:

Specify the LLVM/Clang version you are using by adding the appropriate -Clv command line parameter to OPTNEW. E.g. OPTNEW="-Clv7.0" (Clang 7.0) or OPTNEW="-ClvXcode-9.3" (the Clang that ships with Xcode 9.3). The latest supported versions currently are Clang 11.0 and Xcode 11.0, but it is quite possible that the generated code will be compatible with later versions. If clang accepts the generated code, it should work fine.

Warning: If you specify any custom parameters through OPT when building FPC, then also add those in OPTNEW. The reason is that during the compilation of the packages directory only the contents of OPTNEW will be used (it's hard to change this because during cross-compiling there are special requirements to build the fpmake utility using the bootstrap compiler).

FPC uses clang to "assemble" the generated LLVM IR. If your clang binary has a custom suffix (as is common on many Linux distributions), you can use the -XlS<x> parameter to specify this suffix. E.g. -XlS-7 in case the clang binary is called clang-7.
If you use a custom installed LLVM version, specify the path to its Clang binary using FPC's -FD command line option (add it to the make OPTNEW options). The compiler will also use this path to find the LTO library if needed (see later).
- In this case, also add the path to the custom clang binary to your $HOME/.fpc.cfg file (so it will be found when you compile code with the compiler after the "make" command has finished), e.g. like this:

 #include /etc/fpc.cfg
 #ifdef CPULLVM
 -FD/Users/Me/clang+llvm-8.0.0-x86_64-apple-darwin/bin
 #endif

On Linux, also add the path to libgcc_s to your $HOME/.fpc.cfg file, e.g. on Ubuntu 16.04: -Fl/usr/lib/gcc/x86_64-linux-gnu/5, similar to the above.
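The options above can be combined into a single make invocation. The following is a minimal sketch, not a definitive recipe: the `make all` target, the `/opt/llvm-7/bin` directory, the `-O2` custom option and all variable names are assumptions chosen for illustration. Only `LLVM=1`, `OPT`, `OPTNEW`, `-Clv`, `-XlS` and `-FD` come from the text above. The script prints the command instead of running it, so you can inspect it first.

```shell
#!/bin/sh
# Sketch only: prints the make invocation instead of running it.
# All values below are illustrative assumptions, not required settings.
llvm_version="7.0"            # should match the installed Clang release
clang_suffix="-7"             # the clang binary is called clang-7
clang_dir="/opt/llvm-7/bin"   # hypothetical location of a custom LLVM install
custom_opts="-O2"             # whatever you normally pass through OPT

# OPTNEW has to repeat the custom OPT options, because only OPTNEW is
# used while the packages directory is compiled.
optnew="-Clv${llvm_version} -XlS${clang_suffix} -FD${clang_dir} ${custom_opts}"

build_cmd="make all LLVM=1 OPT=\"${custom_opts}\" OPTNEW=\"${optnew}\""
echo "$build_cmd"
```

If your clang binary has no suffix, or if you use the LLVM version that ships with your system, drop the corresponding `-XlS` or `-FD` part from `optnew`.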

Installing FPC with LLVM support

FPC built with LLVM support does not include the built-in code generator. Additionally, installing it will currently overwrite any FPC with the same version number that's installed in the same prefix (target directory). As the units generated by FPC with the LLVM backend are not compatible with those used by FPC with the built-in code generator, it's better to install such a version in a different prefix (target directory) for now. Use the make parameter INSTALL_PREFIX=/xxx/yyy to specify this prefix. As above, you can use a custom block in your $HOME/.fpc.cfg to specify the alternative unit directories:

 #ifdef CPULLVM
 -Fu/yourLLVMinstallPREFIX/lib/fpc/$fpcversion/units/$fpctarget/*
 -Fu/yourLLVMinstallPREFIX/lib/fpc/$fpcversion/units/$fpctarget/rtl
 #endif
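To avoid typing the prefix twice, the configuration block can be generated from the same value that is passed to INSTALL_PREFIX. This is a minimal sketch under stated assumptions: the prefix `/opt/fpc-llvm` and the variable names are hypothetical, and the script only prints the block so that you can review it before adding it to your $HOME/.fpc.cfg yourself.

```shell
#!/bin/sh
# Sketch only: prints an fpc.cfg block for a given install prefix.
prefix="/opt/fpc-llvm"   # hypothetical value, same as INSTALL_PREFIX

# \$fpcversion and \$fpctarget are escaped so that they stay literal:
# they are expanded by FPC when it reads the file, not by the shell.
cfg_block=$(cat <<EOF
#ifdef CPULLVM
-Fu${prefix}/lib/fpc/\$fpcversion/units/\$fpctarget/*
-Fu${prefix}/lib/fpc/\$fpcversion/units/\$fpctarget/rtl
#endif
EOF
)

printf '%s\n' "$cfg_block"
```

Redirecting the output with `>> "$HOME/.fpc.cfg"` would append the block to your configuration file.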

Using Link-Time Optimisation (LTO)

Link-time optimisation means that potentially the entire program and all units that it uses are all optimised together.

To compile units with LTO support, or to compile a program or library with LTO, add -Clflto on the compiler command line. If you add this to OPT/OPTNEW when building FPC, all standard units and the compiler itself will also be built with LTO.

Note:

If you compile a unit with LTO, it will also be compiled normally. This means you can use it both for LTO and for normal (static or smart) linking afterwards.
The linker (ld) included with at least Xcode 9 up to and including 10.1 contains various bugs that cause errors when the system unit is included in the LTO. You can work around this by specifying the -Clfltonosystem command line option in addition to -Clflto.
On Linux, unless you are using your distribution's default LLVM version, you will also have to build the LLVMgold.so linker plugin and place it in the "lib" directory of your custom LLVM installation (it is not shipped as part of the official LLVM installers, because it needs to be built for the binutils you have on your system). See http://llvm.org/docs/GoldPlugin.html for more information.
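The LTO options described above can be put together as follows. This is a minimal sketch: the source file name `hello.pas`, the `exclude_system_unit` switch and the variable names are assumptions for illustration, while `-Clflto` and `-Clfltonosystem` are the options from the text. The script prints the compiler command line rather than running it.

```shell
#!/bin/sh
# Sketch only: prints the fpc command line instead of running the compiler.
program="hello.pas"          # hypothetical source file
exclude_system_unit="yes"    # set to "yes" if your linker is affected by the bug

lto_flags="-Clflto"
if [ "$exclude_system_unit" = "yes" ]; then
  # Keep the system unit out of the LTO as a workaround.
  lto_flags="${lto_flags} -Clfltonosystem"
fi

compile_cmd="fpc ${lto_flags} ${program}"
echo "$compile_cmd"
```

Setting `exclude_system_unit` to anything other than "yes" leaves only `-Clflto` on the command line.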

Open tasks

Only a few platforms are supported right now, but more can be added. Windows will be harder, because it requires support for generating SEH-style LLVM-based exception handling code. Other platforms should be "reasonably" easy (the parameter handling needs to be generalised more though).
add support for the (experimental) llvm.experimental.constrained.* intrinsics, to properly support floating point rounding modes and exceptions
- This has been partly done, but due to their experimental nature not all of them are supported on all target platforms
add support for automatically outlining try-blocks into "noinline" nested functions, so that hardware exceptions can be safely caught
- right now, you have to manually move the body of a try/except or try/finally block that may catch hardware exceptions to a separate procedure/function declared with the "noinline" modifier
add support for generating debug info
add support for generating more meta-information for optimizations (e.g. range information about subrange types and enums)
pass on more FPC-related code generation options to LLVM (currently, mainly -CfXXX and -Ox get passed on)
add support for TLS-based threadvars
directly generate bitcode (.bc) instead of bitcode assembly (.ll) files. The reason is that the LLVM project attempts to ensure backward compatibility for bitcode files, but not bitcode assembly. FPC currently generates bitcode assembly files anyway because they're much easier to create and debug (in the sense of debugging the compiler's LLVM code generator).

Frequently Asked Questions

Will the FPC team, somewhere in the future, adopt the LLVM as the backend on all platforms?

No, for various reasons:

LLVM will almost certainly never support all targets that FPC supports (Gameboy Advance, OS/2, WinCE, ...), and it may at some point drop support for targets that FPC still supports (as already happened with Mac OS X for PowerPC/PowerPC64).
the native FPC code generators require very little maintenance once written, as they are quite well insulated via abstractions from the rest of the compiler, so there is no reason to drop them
FPC is a volunteer/hobby project, and several developers' main interest is working on native FPC code generators/optimisers
you still need some of the hardest parts of the FPC native code generators anyway for LLVM (entry/exit code handling, parameter manager) to deal with assembler routines, and because LLVM does not fully abstract parameter passing
a hardware architecture seldom changes in backward-compatibility breaking ways once released, while LLVM makes no such promises.
LLVM changes a lot, all the time. That means there is a high chance of introducing regressions.
FPC's native code generators are much faster than LLVM's (even if you would neglect the overhead of FPC generating bitcode and the LLVM tool chain reading it back in), so especially while developing it may be more interesting to use FPC's own code generators

Is it at all likely that an LLVM compiler would produce significantly better/faster optimizations than FPC as it stands currently?

It depends on the kind of code. The more pure maths (floating point or integer, especially in tight loops), the more likely it will be faster.
Artificial benchmarks will also be much faster.
For a typical database program, don't expect much change.
Example 1: the compiler itself on x86-64 is about 10% faster when compiled with LLVM on an Intel Haswell processor, or 18% if you also enable link-time optimization.
Example 2: A Viprinet benchmark compiled for ARMv7, running on an APM Mustang X-Gene board: 18% faster.

@@ Line 1: / Line 1: @@
-The current status of '''Low Level Virtual Machine''' (LLVM) is ''in progress''.
+The current status of '''Low Level Virtual Machine''' (LLVM) support is ''ready for general testing''.
-{{Note|The information in this section has been updated as of September 2016 and might be outdated by the time you read it}}
+{{Note|The information in this section has been updated as of October 2020 and might be outdated by the time you read it}}
 ==Progress==
-LLVM support has not yet fully landed in trunk, so it is not yet possible to use it.
+FPC with an LLVM code generator backend is available in svn trunk. It currently supports the following targets:
+* Darwin/x86-64
+* Darwin/AArch64 (macOS, untested on iOS)
+* Linux/x86-64
+* Linux/AArch64
+* Linux/ARMHF
-The main missing features before the result is usable for real world code are:
+==Usage==
-# LLVM has support for explicit setjmp/longjmp (which FPC uses on most platforms for exception handling), but we need to make use of some LLVM exception handling intrinsics to ensure the correctness of its control flow analysis <br />A future, better, way may be to solely use LLVM's built-in primitives for exception handling.
+===Install Clang===
-# possibly support for debug information generation<br />There are also a few LLVM limitations:
+You can use a version included in an LLVM release available from [http://releases.llvm.org/ the official LLVM site], or use a version that comes with Xcode (macOS) or your Linux distribution.
-#* LLVM has no support for arbitrary instructions throwing exceptions. I.e., segmentation faults, alignment exceptions, bus errors, floating point exceptions etc are not supported in any way by LLVM. If it can prove at compile time that a non-floating point exception will happen (e.g., you store nil in a pointer and immediately dereference it), it will simply interpret the exception-causing instruction as having "undefined behaviour", which generally results in pretty much all code depending on the result of that instruction getting "optimised" away. In case of floating point exceptions, LLVM will replace the result of the instruction with Inf/Nan at compile time. They are aware of this limitation (http://llvm.org/devmtg/2015-10/slides/KlecknerMajnemer-ExceptionHandling.pdf), but there is no one actively working on it right now (https://groups.google.com/forum/#!topic/llvm-dev/7yLycHmeydo )
-#* LLVM has no support for the i386 "register" calling convention, so the support for the i386 target using LLVM will probably never be added.
-As alluded to above, LLVM support needs to be added/tested/maintained separately for each supported architecture and to a lesser extent for each supported OS.
+FPC can generate LLVM code that can be compiled with LLVM 7.0 until at least 11.0.
+===Build FPC with LLVM support===
+Build FPC as usual, but add '''LLVM=1''' to the make command line, and
+* Specify the LLVM/Clang version you are using by adding the appropriate '''-Clv''' command line parameter to '''OPTNEW'''. E.g. '''OPTNEW="-Clv7.0"''' (Clang 7.0) or '''OPTNEW="-ClvXcode-9.3"''' (the Clang that ships with Xcode 9.3). The latest supported versions currently are Clang 11.0 and Xcode 11.0, but it is quite possible that the generated code will be compatible with later versions. If clang accepts the generated code, it should work fine.
+{{Warning|If you specify any custom parameters through '''OPT''' when building FPC, then also add those in '''OPTNEW'''. The reason is that during the compilation of the ''packages'' directory only the contents of '''OPTNEW''' will be used (it's hard to change this because during cross-compiling there are special requirements to build the ''fpmake'' utility using the bootstrap compiler).}}
+* FPC uses clang to "assemble" the generated LLVM IR. If your clang binary has a custom suffix (as is common on many Linux distributions), you can use the '''-XlS<x>''' parameter to specify this suffix. E.g. '''-XlS-7''' in case the clang binary is called '''clang-7'''.
+* If you use a custom installed LLVM version, specify the path to its Clang binary using FPC's '''-FD''' command line option (add it to the make '''OPTNEW''' options). The compiler will also use this path to find the LTO library if needed (see later).
+** In this case, also add the path to the custom clang binary to your $HOME/.fpc.cfg file (so it will be found when you compile code with the compiler after the "make" command has finished), e.g. like this:
+  #include /etc/fpc.cfg
+  #ifdef CPULLVM
+  -FD/Users/Me/clang+llvm-8.0.0-x86_64-apple-darwin/bin
+  #endif
+* On Linux, also add the path to libgcc_s to your '$HOME/.fpc.cfg' file. E.g. on Ubuntu 16.04: '''-Fl/usr/lib/gcc/x86_64-linux-gnu/5''', similar to the above.
+===Installing FPC with LLVM support===
+FPC built with LLVM support does not include the built-in code generator. Additionally, installing it will currently overwrite any FPC with the same version number that's installed in the same prefix (target directory). As the units generated by FPC with the LLVM backend are not compatible with those used by FPC with the built-in code generator, it's better to install such a version in a different prefix (target directory) for now. Use the make parameter '''INSTALL_PREFIX=/xxx/yyy''' to specify this prefix. As above, you can use custom block in your '$HOME/.fpc.cfg' to specify the alternative unit directories:
+  #ifdef CPULLVM
+  -Fu/yourLLVMinstallPREFIX/lib/fpc/$fpcversion/units/$fpctarget/*
+  -Fu/yourLLVMinstallPREFIX/lib/fpc/$fpcversion/units/$fpctarget/rtl
+  #endif
+===Using Link-Time Optimisation (LTO)===
+Link-time optimisation means that potentially the entire program and all units that it uses are all optimised together.
+To compile units with LTO support, or to compile a program or library with LTO, add '''-Clflto''' on the compiler command line. If you add this to '''OPT'''/'''OPTNEW''' when building FPC, all standard units and the compiler itself will also be built with LTO.
+Note:
+* If you compile a unit with LTO, it will also be compiled normally. This means you can use it both for LTO and for normal (static or smart) linking afterwards.
+* The linker (ld) included with least Xcode 9 until and including 10.1 contain various bugs that cause errors when the system unit is included in the LTO. You can work around this by specifying the '''-Clfltonosystem''' command line option in additions to '''-Clflto'''.
+* On Linux, unless you are using your distribution's default LLVM version, you will also have to build the LLVMgold.so linker plugin and place it in the "lib" directory of your custom LLVM installation (it is not shipped as part of the official LLVM installers, because it needs to be built for the binutils you have on your system). See http://llvm.org/docs/GoldPlugin.html for more information.
+==Open tasks==
+* Only a few platforms are supported right now, but more can be added. Windows will be harder, because it requires support for generating SEH-style LLVM-based exception handling code. Other platforms should be "reasonably" easy (the parameter handling needs to be generalised more though).
+* add support for the (experimental) llvm.experimental.constrained.* intrinsics, to properly support floating point rounding modes and exceptions
+** This has been partly done, but due to their experimental nature not all of them supported on all target platforms
+* add support for automatically outlining try-blocks into "noinline" nested functions, so that hardware exceptions can be safely caught
+** right now, you have to manually move the body of a try/except or try/finally block that may catch hardware exceptions to a separate procedure/function declared with the "noinline" modifier
+* add support for generating debug info
+* add support for generating more meta-information for optimizations (e.g. range information about subrange types and enums)
+* pass on more FPC-related code generation options to LLVM (currently, mainly -CfXXX and -Ox get passed on)
+* add support for TLS-based threadvars
+* directly generate bitcode (.bc) instead of bitcode assembly (.ll) files. The reason is that the LLVM project attempts to ensure backward compatibility for bitcode files, but not bitcode assembly. FPC currently generates bitcode assembly files anyway because they're much easier to create and debug (in the sense of debugging the compiler's LLVM code generator).
 ==Frequently Asked Questions==
 ;Will the FPC team, somewhere in the future, adopt the LLVM as the backend on all platforms?:No, for various reasons:
-:* LLVM will almost certainly never support all targets that we support (Gameboy Advance, OS/2, WinCE, ...), or at some point drop support for targets that we still support (as already happened with Mac OS X for PowerPC/PowerPC64).
+:* LLVM will almost certainly never support all targets that FPC supports (Gameboy Advance, OS/2, WinCE, ...), or at some point drop support for targets that FPC still supports (as already happened with Mac OS X for PowerPC/PowerPC64).
-:* the native FPC code generators require very little maintenance once written, as they are quite well insulated via abstractions from the rest of the compiler
+:* the native FPC code generators require very little maintenance once written, as they are quite well insulated via abstractions from the rest of the compiler, so there is no reason to drop them
-:* you still need some of the hardest parts of the FPC native code generators anyway for LLVM (entry/exit code handling, parameter manager), to be able to deal with assembler routines and because LLVM does not fully abstract parameter passing
+:* FPC is a volunteer/hobby project, and several developers' main interest is working on native FPC code generators/optimisers
-:* a hardware architecture seldom changes in backward-compatibility breaking ways once released, while LLVM makes no such promises. They do seem to have finally settled more or less on the binary bitcode format (even there are no guarantees, but maybe I'll add support for that after all)
+:* you still need some of the hardest parts of the FPC native code generators anyway for LLVM (entry/exit code handling, parameter manager) to deal with assembler routines, and because LLVM does not fully abstract parameter passing
-:* LLVM changes a lot, all the time. That means a high chance of introducing regressions. I don't know how likely it would be that FPC-with-LLVM would one day be admissible to be run as part of LLVM's buildbots and automatic regression tests, but if not then it's possible that maintaining the LLVM backend may become more work than the regular code generators and optimizers combined (at least if we want to keep up with the latest LLVM versions, and not stick with a particular version for long times like most out-of-tree "consumers" of LLVM do)
+:* a hardware architecture seldom changes in backward-compatibility breaking ways once released, while LLVM makes no such promises.
-:* most OS-specific support is in the run time library, not in the compiler. As a result, LLVM will not save much time there
+:* LLVM changes a lot, all the time. That means there is a high chance of introducing regressions.
-:* our native code generators are much faster than LLVM's (even if you would neglect the overhead of FPC generating bitcode and the LLVM tool chain reading it back in), so especially while developing it may be more interesting to use our code generators
+:* FPC's native code generators are much faster than LLVM's (even if you would neglect the overhead of FPC generating bitcode and the LLVM tool chain reading it back in), so especially while developing it may be more interesting to use FPC's own code generators
-;Is it at all likely that an LLVM compiler would produce significantly better/faster optimizations than FPC as it stand currently?:It depends on the kind of code. The more pure maths (floating point or integer, especially in tight loops), the more likely it will be faster.
+;Is it at all likely that an LLVM compiler would produce significantly better/faster optimizations than FPC as it stand currently?
-:Artificial benchmarks will also be much faster.
+:* It depends on the kind of code. The more pure maths (floating point or integer, especially in tight loops), the more likely it will be faster.
-:For a typical database program, don't expect much change.
+:* Artificial benchmarks will also be much faster.
-:''actual performance comparison tests to be done''
+:* For a typical database program, don't expect much change.
+:* Example 1:  the compiler itself on x86-64 is about 10% faster when compiled with LLVM on an Intel Haswell processor, or 18% if you also enable link-time optimization.
+:* Example 2: A [https://lists.freepascal.org/pipermail/fpc-devel/2018-November/039865.html Viprinet benchmark] compiled for ARMv7, running on an APM Mustang X-Gene board: [https://lists.freepascal.org/pipermail/fpc-devel/2019-February/040440.html 18% faster].
 ==See Also==
 * [[FPC Roadmap]]
 [[Category:FPC]]
 [[Category:FPC internals]]
