The register allocator

From Lazarus wiki
Revision as of 01:09, 18 December 2013 by DanielS (talk | contribs)
Jump to navigationJump to search

Registry allocation in FPC

Introduction

Main unit for the registry allocation is rgobj.pas
Main class for the registry allocation is TRgObj located in rgobj.pas

Registry allocator provides imaginary registers for the assembler instructions during the code generation. Then it calculates the real registers to replace the imaginary ones.

The fpc registry allocator uses the Registry coloring for determining the real registers. Also uses registry spilling technique - when there is not enough registers it uses the memory.


How to use the registry allocator

This topic describes how to use the registry allocator during the code generation.
Described like a black box with the public methods you can call to get job done.

Creating the Registry allocator

Creating registry allocator is the first step we make before we can use its functionality.

The low level code generator creates several instances of the TRgObj class. Normally this happens in:

tcg.init_register_allocators

And the instances are normally freed in:

tcg.done_register_allocators

Each TRgObj instance allocates registers of a certain type. For example, one register type is Integer registers. Another one is floating point (FPU) registers. That's why we have a few TRgObj instances, one for every type of register that the cpu supports.

They are created when the code generation of specific routine begins. Code generator works on subroutine level. The registry allocator also allocates registers for specific method, procedure, function.

Using registers in the code generation

To allocate a register use the following method:

TRgObj.getRegister

for the appropriate register allocator instance (depending on the register type) to get register for some assembler instruction. This way an imaginary register is allocated and can be used after that in a specific assembler instruction.

The code generator calls it in the following functions:

tcg.GetIntRegister
tcg.GetAddressRegister
tcg.GetFPURegister
tcg.GetMMRegister


Additionally, the following methods:

TRgObj.getCpuRegister
TRgObj.ungetCpuRegister
TRgObj.allocCpuRegisters
TRgObj.deallocCpuRegisters

allocate or free one or multiple real registers (not imaginary ones!) This is usually used for instructions that require a specific register (for example SHL/SHR/SAR on x86, which always uses ECX). This is also used when doing a function call to indicate that certain registers (which depend on the calling convention) may be destroyed by the called function.

After allocating the register we can use it in some assembler instructions. For every instruction that generates, the code generator notifies the registry allocator. It passes the instruction, also the imaginary register as parameters to the following method.

         TRgObj.add_reg_instruction(instr, r, cg.executionweight);

But for MOV instruction there is specific method that is used

         TRgObj.add_move_instruction(instr:Taicpu);

Also weight is provided to determine which register should be spilled and which is better to use real cpu register.

Generating of real registers

At the end when all the assembler instructions are generated we call

       do_register_allocation(list: TAsmList; headerTai: TAi)

It calculates real registers for the imaginary ones.


Tips

You can compile your project by using the -sr switch. This will leave the immaginary register names in the generated .s file

Calls hierarchy for the public methods

Public


constructor

destructor

do_register_allocation - level 1 is ordered by calling
  insert_regalloc_info_all
  generate_interference_graph
    add_edges_used(1, 2)			get_alias						add_edge
      add_edge												ibitmap.s
  prepare_colouring
    make_work_list				ri_coalesced
      sort_simplify_worklist
  colour_registers
    simplify					ri_coalesced
      decrement_degree				ri_coalesced
    coalesce					get_alias		simplifyworklist.add(v);	ibitmap
      conservative				ri_coalesced
      adjacent_ok				ri_coalesced						ibitmap
      add_worklist							simplifyworklist.add(u);
      combine												ibitmap, add_edge
        enable_moves
        decrement_degree			ri_coalesced		simplifyworklist.add(m)
        add_edge											ibitmap.s
    freeze
      freeze_moves(, 2)				get_alias(3)
    select_spill
      freeze_moves(, 2)				get_alias(3)
    assign_colours(, 2)				get_alias
  epilogue_colouring		Destroys the objects used during the coloring - worklist_moves, active_moves, frozen_moves, coalesced_moves, constrained_moves, reginfo.movelist
  spill_registers
    clear_interferences											ibitmap.s
    instr_spill_register			get_alias
      getregisterinline
        add_edges_used(2, 2)			get_alias
	  add_edge											ibitmap.s
      ungetregisterinline
      get_spill_subreg
      do_spill_replace
      do_spill_read
      do_spill_written
  translate_registers
    assign_colours(, 2)				get_alias

getregister

add_move_instruction
  add_to_movelist

combine						ri_coalesced(s)
  enable_moves
  decrement_degree				ri_coalesced

-----------------------------------------
Properties

live_range_direction
  set_live_range_direction

live_start
  get_live_start
  set_live_start

live_end
  get_live_end
  set_live_end

Class TRgObj

Description

Main class for registry allocation in FPC

Public methods

Protected methods

Private methods

Imaginary registers

Storing imaginary registers

TRgObj stores the imaginary registers in list of TRegInfo structure.

Life of imaginary register

Life of imaginary register starts when it is first used by assembler instruction.
When it first appears in an assembler instruction in the assembler list for the specific routine.
And ends when it is last used in an assembler instruction. In the TRegInfo structure for the specific imaginary register there are 2 fields for the life of the register.
live_start: TAi; indicates which is the assembler instruction where the register is first used.
live_end: TAi; indicates which is the assembler instruction where the register is last used.


Backgrounds

Grouping the instructions

More abstract look at the CPU instructions shows us that we have mainly few kinds of instructions. - The first group contains instructions that changes some data. This data remains on the same place but it is changed. Also do not need extra data ot perform the change Some instructions like these are inc, dec, CLC, CLD

- Second group are instructions that move date from one place to another. Such as MOV, LAHF, POP, PUSH, LEA

- Third group of instructions are those to compare data. They do not change the data. just check it. Such instructions are TEST, COMPARE

- Fourth group of instructions are those who change the flow of the execution of the instructions such as: JMP, Jcc, JCXZ,

- Fifth group are instructions that changes data based on other data like ADD,

Another grouping of instructions could be by the cpu version where the instruction first appear.

Data in the instructions

Alsmost all of the instructions operate on some data. Also uses some data to perform operation. The data can be in the cpu registers, memory, cpu flags.

Programming using the instructions

The instructions are independant one from other. However when a program is created there are some logical dependencies that programmer introduce. Programmer wants a few modifications to be performed on some data. Which can be performed by few instructions executed one after another. And at the end the needed data will be produced. But data is not exactly cpu register, or memory address. Data is something the the programmer cares about. For example we have some starting data that equals 1000. we can put it in cpu register we want to add to it 200. we can use asm instruction add but then we can move the value that we reached in another register. And then to increment it by 300 at the end we will have the needed value 1500. But we used 2 registers for one data. Normally we don't have to use 2 registers. We can do it using one. Cause we have to move the data to second register somewhere in the middle. And this is extra instruction. However more than one data is parallel processed. And sometimes we can use this extra move. Just to speed up executions on other data. This is normally needed when to datas are produced that will calculate third one.

This is not very important for the registry allocator. It is more dedicated to the code generator. Code generator could decide where to add extra instruction, to speed other executions. It look like registry allocator should not care about this. But it's good to thing about it.

Object Design guidelines

In the current implementation the getting of the registers and the allocation are mixed togather. Also they share same structures, same class. But the data for each of them is used on the completely different phases. Mixing them togather brings more code on one place. Also its getting difficult to separate them when you work on them. They can be easily devided without lost of speed. Also different types of registry allocations can be used. The getting of the registers and setting the weights, also life ... can be on one place. But at the end different allocators can process this data. It is difficult for further develop the registry allocation cause many phases and logic was brought together on one place.