Whole Program Optimization

From Lazarus wiki
Revision as of 21:28, 8 December 2008 by Jonas (talk | contribs) (→‎How to use)

Overview

Traditionally, compilers optimise a program procedure by procedure, or at best compilation unit by compilation unit. Whole program optimisation (wpo) means that the compiler considers all compilation units that make up a program or library and optimises them using the combined knowledge of how they are used together in this particular case.

The way wpo generally works is as follows:

  • you compile the program normally, telling the compiler to store various bits of information into a feedback file
  • you recompile the program (and optionally all units that it uses) with wpo, providing the feedback file as extra input to the compiler

In some implementations, the compiler generates some kind of intermediary code (e.g., byte code) and the linker performs all wpo along with the translation to the target ISA. In the case of FPC, however, the scheme followed is the one described above.

wpo is currently only available in the fpc-wpo svn branch. This functionality will be merged to trunk in the foreseeable future.

General principles

A few general principles have been followed when designing the FPC implementation of wpo:

  • All information necessary to generate a wpo feedback file for a program is always stored in the ppu files. This means that you can e.g. use a generic RTL for wpo (even though the RTL itself will then not be optimised, your program and its units can be correctly optimised because the compiler knows everything it has to know about all RTL units);
  • The generated wpo feedback file is plain text. The idea is that it should be easy to inspect this file by hand, and to add information to it produced by external tools if desired (e.g., profile information);
  • The implementation of the wpo subsystem in the compiler is very modular, so it should be easy to plug in additional wpo information providers, or to choose at run time between different information providers for the same kind of information. At the same time, the interaction with the rest of the compiler is kept to a bare minimum to improve maintainability;
  • It is possible to generate a wpo feedback file while at the same time using another one as input. In some cases, using this second feedback file as input during a third compilation can further improve the results.

How to use

Generate WPO Feedback File

First of all, compile your program (or library) and all of its units as you would normally do, except that when compiling the main program/library you add -FW/path/to/feedbackfile.wpo -OW<selected_wpo_options>. Right after your program has been linked, the compiler will then collect all information necessary to perform the requested wpos during a subsequent compilation run, and store this information in /path/to/feedbackfile.wpo.

Use Generated WPO Feedback File

To actually apply the wpos, recompile the program/library and all or some of the units that it uses, using -Fw/path/to/feedbackfile.wpo -Ow<selected_wpo_options>, thereby pointing the compiler to the feedback file generated in the previous step. The compiler will then read the information collected about the program during the previous compiler run, and use it during the current compilation of units and/or program/library.

Units not recompiled during the second pass will obviously not be optimised, but they will still work correctly when used together with the optimised units and program/library.
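
The two passes described above can be sketched as a small script that builds the corresponding compiler command lines. This is only an illustration: the flags -FW/-OW and -Fw/-Ow and the option name "all" come from this page, while the file names myprogram.pas and myprogram.wpo and the helper function names are made up for the example. The third helper combines both flag pairs, which corresponds to generating a new feedback file while using an existing one as input.

```python
import shlex


def collect_args(source, feedback_out, options="all"):
    """First pass: a normal build that also writes a feedback file (-FW, -OW)."""
    return ["fpc", f"-FW{feedback_out}", f"-OW{options}", source]


def apply_args(source, feedback_in, options="all"):
    """Second pass: rebuild using the feedback file as input (-Fw, -Ow)."""
    return ["fpc", f"-Fw{feedback_in}", f"-Ow{options}", source]


def apply_and_collect_args(source, feedback_in, feedback_out, options="all"):
    """Optional pass: apply one feedback file while generating another."""
    return [
        "fpc",
        f"-Fw{feedback_in}", f"-Ow{options}",
        f"-FW{feedback_out}", f"-OW{options}",
        source,
    ]


if __name__ == "__main__":
    # Hypothetical file names, used for illustration only.
    passes = [
        collect_args("myprogram.pas", "myprogram.wpo"),
        apply_args("myprogram.pas", "myprogram.wpo"),
    ]
    for argv in passes:
        # Print the command instead of running it, so the sketch
        # works without a compiler installed.
        print(shlex.join(argv))
```

To actually run the passes, each argument list could be handed to subprocess.run(argv, check=True) in order; the second pass must only start after the first one has finished writing the feedback file.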

When to use

Since whole program optimisation requires multiple compilations, it is advisable to only use this functionality when compiling a final release version.

Available whole program optimisations

All optimisations

Parameter

-OWall/-Owall

Effect

Enables all whole program optimisations described below.

Limitations

Not applicable.


Whole Program Devirtualization

Parameter

-OWdevirtcalls/-Owdevirtcalls

Effect

Changes virtual method calls into normal (static) method calls when the compiler can determine that a virtual method call will always go to the same static method. This makes such code both smaller and faster. In general, it is mainly an enabling optimisation for other optimisations: by reducing indirect control flow, it makes the program easier to analyse.

Limitations

  • The current implementation is context-insensitive. This means that the compiler only looks at the program as a whole and determines for each class type which methods can be devirtualised, rather than looking at each call statement and the surrounding code to determine whether or not this particular call can be devirtualised;
  • The current implementation does not yet devirtualise interface method calls (neither when calling them via an interface instance nor when calling them via a class instance).
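
As a rough illustration of what a context-insensitive decision looks like, the following toy model decides per (class type, method) pair whether every call must end up in the same implementation. It is not the compiler's algorithm or data structures, only a sketch of the idea; the class names, the dictionary layout and the function names are all invented for the example.

```python
def resolve(classes, cls, method):
    """Return the implementation that cls defines or inherits for method."""
    while cls is not None:
        parent, own_methods = classes[cls]
        if method in own_methods:
            return f"{cls}.{method}"
        cls = parent
    return None


def is_subclass(classes, cls, ancestor):
    """True if cls is ancestor itself or one of its descendants."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = classes[cls][0]
    return False


def devirtualisable(classes, instantiated, static_type, method):
    """Return the single possible target of static_type.method, or None.

    The decision is made once per class type and method, using only the
    set of class types instantiated anywhere in the program. Individual
    call sites and their surrounding code are never inspected.
    """
    targets = {
        resolve(classes, cls, method)
        for cls in instantiated
        if is_subclass(classes, cls, static_type)
    }
    return targets.pop() if len(targets) == 1 else None


# Hypothetical hierarchy: class name -> (parent, methods it defines itself).
classes = {
    "TShape": (None, {"Draw", "Name"}),
    "TCircle": ("TShape", {"Draw"}),
    "TSquare": ("TShape", {"Draw"}),
}

# Only TCircle is ever created: every TShape.Draw call goes to TCircle.Draw.
print(devirtualisable(classes, {"TCircle"}, "TShape", "Draw"))
# Both subclasses are created: Draw stays virtual, Name has a single target.
print(devirtualisable(classes, {"TCircle", "TSquare"}, "TShape", "Draw"))
print(devirtualisable(classes, {"TCircle", "TSquare"}, "TShape", "Name"))
```

A context-sensitive analysis could do better in the second case, for example when a particular call site provably only ever sees a TCircle, but that requires looking at the code around each call.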


Optimise Virtual Method Tables

Parameter

-OWoptvmts/-Owoptvmts

Effect

This optimisation looks at which class types can be instantiated in a program, and based on this information it replaces virtual method table (VMT) entries that can never be called with references to FPC_ABSTRACT error. This means that such methods, unless they are called directly via an inherited call from a child class/object, can be removed by the linker. It has little or no effect on speed, but can help reduce code size.

Limitations

  • None known
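
The effect on code size can be illustrated with a toy model in which the VMT of every class type that is never instantiated has its entries replaced by a stub. This is a simplified sketch under that single assumption, not a description of the compiler's actual analysis; the class names, the table layout and the ABSTRACT_STUB placeholder are invented for the example.

```python
# Placeholder standing in for the abstract-error routine; not a real symbol name.
ABSTRACT_STUB = "<abstract error stub>"


def optimise_vmts(vmts, instantiated):
    """Replace all VMT entries of class types that are never instantiated."""
    optimised = {}
    for cls, entries in vmts.items():
        if cls in instantiated:
            optimised[cls] = dict(entries)
        else:
            optimised[cls] = {method: ABSTRACT_STUB for method in entries}
    return optimised


def referenced_methods(vmts):
    """Collect the implementations that some VMT still refers to."""
    return {
        impl
        for entries in vmts.values()
        for impl in entries.values()
        if impl != ABSTRACT_STUB
    }


# Hypothetical VMTs: class -> {method name: implementation}.
vmts = {
    "TShape": {"Draw": "TShape.Draw", "Name": "TShape.Name"},
    "TCircle": {"Draw": "TCircle.Draw", "Name": "TShape.Name"},
    "TSquare": {"Draw": "TSquare.Draw", "Name": "TShape.Name"},
}

before = referenced_methods(vmts)
after = referenced_methods(optimise_vmts(vmts, instantiated={"TCircle"}))

# Methods no VMT refers to any more; the linker may drop them unless
# something else (such as an inherited call) still references them.
print(sorted(before - after))
```

In this model TShape.Draw and TSquare.Draw lose their last VMT reference, while TShape.Name survives because the VMT of the instantiated TCircle still points to it.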


Symbol liveness

Parameter

-OWsymbolliveness/-Owsymbolliveness

Effect

This parameter does not perform any optimisation by itself. It simply tells the compiler to record which functions/procedures were not removed by the linker in the final program. During a subsequent wpo pass, the compiler can then ignore the removed functions/procedures as far as wpo is concerned (e.g., if a particular class type is only constructed in one unused procedure, then ignoring this procedure can improve the effectiveness of the previous two optimisations).

Limitations

  • This optimisation requires that the nm utility is installed on the system. For Linux binaries, objdump will also work. In the future, this information could also be extracted from the internal linkers on the platforms that they support.
  • Collecting information for this optimisation (using -OWsymbolliveness) requires that smart linking is enabled (-XX) and that symbol stripping is disabled (-Xs-).
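
The second requirement is easy to forget, so a build script might check the command line before starting the collection pass. The helper below is a hypothetical sketch: the flags -OWsymbolliveness, -XX and -Xs- are the ones named above, while the function name and the file names are invented. It only recognises an explicit -OWsymbolliveness argument.

```python
REQUIRED_FOR_LIVENESS = ("-XX", "-Xs-")


def missing_liveness_flags(argv):
    """Return the required flags that are absent from a collection command.

    Returns an empty list when the command does not request
    -OWsymbolliveness at all, or when both required flags are present.
    """
    if "-OWsymbolliveness" not in argv:
        return []
    return [flag for flag in REQUIRED_FOR_LIVENESS if flag not in argv]


# Hypothetical command lines, for illustration only.
incomplete = ["fpc", "-FWmyprogram.wpo", "-OWsymbolliveness", "myprogram.pas"]
complete = ["fpc", "-XX", "-Xs-", "-FWmyprogram.wpo", "-OWsymbolliveness", "myprogram.pas"]

print(missing_liveness_flags(incomplete))
print(missing_liveness_flags(complete))
```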


Format of the wpo feedback file

This information is mainly interesting if you want to add external data to the wpo feedback file, e.g. from a profiling tool. If you are just a user of the wpo functionality, you can ignore what follows.

The file consists of comments and a number of sections. Comments are lines that start with a #. Each section starts with "% " followed by the name of the section (e.g., % contextinsensitive_devirtualization). After that, until either the end of the file or the next line starting with "% ", there first follows a human-readable description of the format of this section (in comments), and then the contents of the section itself.

There are no rules for how the contents of a section should look, except that lines starting with # are reserved for comments and lines starting with % are reserved for section markers.
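
For external tools, the rules above are enough to write a small reader and writer. The following is a minimal sketch based only on those rules, not on the compiler's own parser. The function names, the example_profile_data section and all section contents in the sample are invented; only the section name contextinsensitive_devirtualization appears on this page. Skipping blank lines is a choice made for this sketch.

```python
def parse_wpo_feedback(text):
    """Split feedback file text into {section name: [content lines]}."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("% "):
            # Section marker: everything after "% " is the section name.
            current = line[2:].strip()
            sections.setdefault(current, [])
        elif line.startswith("#") or not line.strip():
            # Comment (including format descriptions) or blank line.
            continue
        elif current is not None:
            sections[current].append(line)
    return sections


def format_section(name, description, lines):
    """Render one section: marker, description as a comment, then contents."""
    for line in lines:
        if line.startswith(("#", "%")):
            raise ValueError(f"content line starts with a reserved character: {line!r}")
    return "\n".join([f"% {name}", f"# {description}", *lines]) + "\n"


# Made-up sample; the real contents of each section are not specified here.
sample = """# hypothetical feedback file
% contextinsensitive_devirtualization
# description of the section format would go here
TCircle
TSquare
"""

# An external tool appending its own (invented) section.
sample += format_section("example_profile_data", "procedure name, call count", ["main 1"])

for name, lines in parse_wpo_feedback(sample).items():
    print(name, lines)
```

Rejecting content lines that begin with # or % in format_section keeps an external tool from accidentally producing something that would be read as a comment or a new section.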