Difference between revisions of "m68k"

From Lazarus wiki
m (some remarks)
(→‎Unaligned access support: mention PACKRECORDS explicitly, the 68008, reword a few bits.)
(31 intermediate revisions by 3 users not shown)
Line 9: Line 9:
 
The current Motorola 680x0 backend supports the following CPU and FPU types:
 
The current Motorola 680x0 backend supports the following CPU and FPU types:
  
* Motorola 68000 (including code generation for unaligned accesses)
+
* Motorola 68000-68010 (including code generation for [[#Unaligned access support|unaligned accesses]])
* Motorola 68020-68040
+
* Motorola 68020-68060
 
* Motorola 68881 FPU and compatibles (optional)
 
* Motorola 68881 FPU and compatibles (optional)
  
Line 26: Line 26:
 
* [[Amiga]]
 
* [[Amiga]]
 
* [[Portal:Linux|Linux]]
 
* [[Portal:Linux|Linux]]
* [[Atari|Atari TOS]] (initial support only)
+
* [[Atari|Atari TOS]] (initial support only, OS bindings incomplete)
 +
* [[NetBSD]] (ELF based versions only)
 +
* [[PalmOS port|PalmOS]] and compatibles (experimental, cross-compiler target only)
 +
* [[Sinclair QL]] and compatibles (experimental, cross-compiler target only)
  
 
Support is planned for the following targets:
 
Support is planned for the following targets:
  
* [[Embedded]]
 
 
* [[Target MacOS|MacOS]]
 
* [[Target MacOS|MacOS]]
* [[PalmOS Port|PalmOS]] and compatibles
+
* [[Embedded]]
 +
 
 +
==Defaults==
 +
 
 +
The default target for a cross-compiler is [[Portal:Linux|Linux]], when running natively it's the host platform. In either case, any supported target is available, using the -T argument, the cross-compiler binary is only CPU specific, not OS specific. The default CPU target settings are 68020+68881 for [[Portal:Linux|Linux]] and [[NetBSD]], 68000+SoftFPU for [[PalmOS port|PalmOS]] and the [[Sinclair QL]], and 68020+SoftFPU for all other supported OS targets.
  
 
==Performance==
 
==Performance==
  
Most members of the old Motorola 680x0 CPU family are not fast enough to run the compiler natively, although high-end 68060 systems and various fast JIT-based m68k emulators with enough memory are capable to do so. Therefore it is recommended to use a cross-compiler when targeting most m68k systems. Apart from the performance issues the code generator is mature enough to compile the compiler itself, make cycle works. Only [[Amiga]] and [[Portal:Linux|Linux]] are supported as compilation hosts.
+
Most members of the old Motorola 680x0 CPU family are not fast enough to run the compiler natively, although high-end 68060 systems and various fast JIT-based m68k emulators with enough memory are capable to do so. Therefore it is recommended to use a cross-compiler when targeting most m68k systems. Apart from the performance issues the code generator is mature enough to compile the compiler itself, make cycle works. [[Amiga]], [[Portal:Linux|Linux]] and [[NetBSD]] are supported as compilation hosts.
  
 
==Registers==
 
==Registers==
  
When compiling for the Motorola 68k, Free Pascal uses the following conventions with registers. This matches, or closely matches most C compilers and most operating systems on this platform.
+
When compiling for the Motorola 68k, Free Pascal uses the following conventions with registers. This matches, or closely matches most C compilers and most operating systems on this platform. Any additional differences are documented at the specific platform's documentation.
  
 
* '''d0-d1''' : scratch data registers
 
* '''d0-d1''' : scratch data registers
Line 62: Line 68:
 
The Motorola 68k CPU target supports several different calling conventions.
 
The Motorola 68k CPU target supports several different calling conventions.
  
* '''stdcall''': this calling convention is entirely stack based. It passes all parameters on the stack from right to left.
+
* '''stdcall''': this calling convention is entirely stack based. It passes all parameters on the stack from left to right.
* '''register''': this calling convention uses the scratch registers to pass arguments. Registers '''d0''' and '''d1''' are used to pass up to two ordinal values, '''a0''' and '''a1''' are used to pass up to two pointers, addresses, or references. When compiling with FPU enabled, '''fp0''' and '''fp1''' are used to pass up to two floating point values. The values on the registers are passed from left to right. The remaining arguments are passed on the stack from right to left, like with the '''stdcall''' convention.
+
* '''register''': this calling convention uses the scratch registers to pass arguments. Registers '''d0''' and '''d1''' are used to pass up to two ordinal values, '''a0''' and '''a1''' are used to pass up to two pointers, addresses, or references. When compiling with FPU enabled, '''fp0''' and '''fp1''' are used to pass up to two floating point values. The values on the registers are passed from left to right. The remaining arguments are passed on the stack from left to right, like with the '''stdcall''' convention.
* '''syscall''': the syscall convention is operating system specific. It's supported on [[Amiga]] and [[Atari]], and it's documented at their respective pages.
+
* '''syscall''': the syscall convention is operating system specific. It's supported on [[Amiga]], [[Atari]], and [[PalmOS port|PalmOS]], and it's documented at their respective pages.
 +
 
 +
The current default calling convention is '''register'''.
 +
 
 +
===Return values===
 +
 
 +
* 32 bit or smaller ordinal values are returned in register '''d0'''
 +
* Pointers are returned in register '''a0''' or '''d0''' (platform and calling convention specific)
 +
* 64 bit ordinal values are returned in register pair '''d0/d1'''
 +
 
 +
==Unaligned access support==
 +
 
 +
Some members of the m68k CPU family doesn't support larger than byte accesses of odd addresses. Most notably this includes the original 68000, the 68008, the 68010 and some members of the CPU32 subfamily. The compiler supports generating special unaligned access code for these scenarios, when targeting these CPU. However, this unaligned access code is slow, as it involves several individual byte accesses plus shifts, so it should be avoided whenever possible. Unaligned access code will be generated at:
 +
* all locations marked as '''unaligned''' (with the unaligned keyword)
 +
* all code accessing fields of '''packed record'''s
 +
* all records which were declared after a '''{$PACKRECORDS 1}''' directive (same as using a '''packed record''')
 +
Because of the last two points, it's discouraged to use packed records on these CPU in performance critical code.
 +
 
 +
==Round/Trunc fast path and special handling==
 +
 
 +
The m68k family, and its associated FPU has no floating point to 64bit integer rounding instruction, which is mandatory to fully implement the System unit's Round and Trunc functions as inline nodes. Nevertheless, the compiler contains a fast path to avoid the slow Round/Trunc helper functions, when rounding towards 8, 16, and 32bit signed integers, 8, and 16bit unsigned integers. This fast path is only enabled with optimization level '''-O4''', or with '''-OoFASTMATH''' optimization. It's only enabled as uncertain optimization, because if the result of the rounding doesn't fit into the target variable it will be undefined, while it's defined with Round/Trunc, and should just contain the lower bits of the otherwise 64bit result. Please also note the Round/Trunc fast path needs a hardware FPU and it's not available with the SoftFPU.
 +
 
 +
==Stack==
 +
 
 +
Due to performance reasons caused by addressing modes constraints, it's highly advised to not use more than 32KiB stack in any function on an m68k CPU. The code generator backend supports larger than 32KiB stacks for single functions, but this will be always slower, and wasn't tested extensively. Also please note that most m68k systems are low-end with limited amount of memory, and usually with a fixed, limited size stack, where using large amounts of stack can be problematic.
 +
 
 +
==Compile time detection of CPU type==
 +
 
 +
To detect the current CPU and FPU targets during compile time, the following defines exist:
 +
 
 +
* '''CPU68''', '''CPU68K''', or '''CPUM68K''' identifies the m68k target, including ColdFire
 +
* '''CPU68000''', '''CPU68020''', '''CPU68040''', etc. identifies a specific m68k CPU, excluding ColdFire
 +
* '''CPUCOLDFIRE''' identifies a ColdFire
 +
* '''CPUISAA''', '''CPUISAB''', '''CPUISAC''', etc. identifies a specific ColdFire ISA version
 +
 
 +
* '''FPUNONE''' identifies no FPU
 +
* '''FPUSOFT''' identifies the RTL's SoftFPU
 +
* '''FPU68881''' identifies a 6888x or user code compatible FPU, including the internal FPU of the 68040 and 68060
 +
* '''FPUCOLDFIRE''' identifies the ColdFire FPU, as included in the ColdFire v4e core
 +
 
 +
==Runtime detection of CPU type==
 +
 
 +
The RTL provides two m68k specific variables, to allow the user code to detect the CPU easily, in case it is necessary. These are:
 +
 
 +
* '''Test68000''', which is similar to '''Test8086''' on x86. It returns values from 0-6, excluding 5, representing CPU types from 68000 to 68060.
 +
* '''Test68881''', which is similar to '''Test8087''' on x86. It returns values from 0-6. excluding 5. 0 represents no FPU. 1 and 2 are 6888x FPU. 4 and 6 are 68040 and 68060 internal FPU.
 +
 
 +
Example:
 +
 
 +
<syntaxhighlight lang="pascal">program hello68k;
 +
 
 +
begin
 +
  writeln('Hello World!');
 +
  write('Running on 680',Test68000,'0 CPU');
 +
  case Test68881 of
 +
    0: writeln(' without FPU');
 +
    1,2: writeln(' with 6888',Test68881,' FPU');
 +
  else
 +
    writeln(' with internal FPU');
 +
  end;
 +
end.
 +
</syntaxhighlight>
 +
 
 +
Notes:
 +
* this feature is currently only available on [[Amiga]]
 +
* this feature is currently not available on ColdFire CPUs
 +
 
 +
[[Category:Processors]]

Revision as of 09:48, 17 May 2021

The contents of this article reflect the status in FPC SVN trunk. Some features listed here might not be available in the latest stable version.

Introduction

Historically, the Motorola 680x0 (abbreviated as "m68k") port was the first Free Pascal port to a CPU architecture other than i386. During the big refactoring for FPC 2.0, the code was left without maintenance; support broke and was partly removed. The Motorola 680x0 code generator was revived more than 10 years later, before FPC 3.0. It still shares some components with the old 1.x source code, such as parts of the inline assembler reader, but most of the code is new.

Supported CPU types

The current Motorola 680x0 backend supports the following CPU and FPU types:

  • Motorola 68000-68010 (including code generation for unaligned accesses)
  • Motorola 68020-68060
  • Motorola 68881 FPU and compatibles (optional)
  • ColdFire ISA A to ISA C
  • ColdFire v4e FPU (optional)

Notes:

  • FPU support is optional; the code generator also supports using the RTL's SoftFPU.
  • Most of the ColdFire support has never been tested on real hardware, only in QEMU.

Supported Targets

Free Pascal currently supports the following operating systems as target platforms on m68k:

  • Amiga
  • Linux
  • Atari TOS (initial support only, OS bindings incomplete)
  • NetBSD (ELF based versions only)
  • PalmOS and compatibles (experimental, cross-compiler target only)
  • Sinclair QL and compatibles (experimental, cross-compiler target only)

Support is planned for the following targets:

  • MacOS
  • Embedded

Defaults

The default target for a cross-compiler is Linux; when running natively, it is the host platform. In either case, any supported target can be selected with the -T argument, because the cross-compiler binary is only CPU specific, not OS specific. The default CPU target settings are 68020+68881 for Linux and NetBSD, 68000+SoftFPU for PalmOS and the Sinclair QL, and 68020+SoftFPU for all other supported OS targets.

Performance

Most members of the old Motorola 680x0 CPU family are not fast enough to run the compiler natively, although high-end 68060 systems and various fast JIT-based m68k emulators with enough memory are capable of doing so. It is therefore recommended to use a cross-compiler when targeting most m68k systems. Performance aside, the code generator is mature enough to compile the compiler itself, that is, "make cycle" works. Amiga, Linux and NetBSD are supported as compilation hosts.

Registers

When compiling for the Motorola 68k, Free Pascal uses the following register conventions. They match, or closely match, those of most C compilers and most operating systems on this platform. Any additional differences are described in the documentation of the specific platform.

  • d0-d1 : scratch data registers
  • d2-d7 : non-volatile data registers
  • a0-a1 : scratch address registers
  • a2-a4 : non-volatile address registers
  • a5 : frame pointer on Amiga, non-volatile address register otherwise
  • a6 : frame pointer on other systems, non-volatile address register on Amiga
  • a7 : stack pointer
  • fp0-fp1 : scratch floating point registers
  • fp2-fp7 : non-volatile floating point registers

Inline assembler

The inline assembler follows the Motorola syntax. The register alias fp refers to the frame pointer, which is a5 on Amiga and a6 on other systems. The register alias sp refers to a7.

Calling Conventions

The Motorola 68k CPU target supports several different calling conventions.

  • stdcall: this calling convention is entirely stack based. It passes all parameters on the stack from left to right.
  • register: this calling convention uses the scratch registers to pass arguments. Registers d0 and d1 are used to pass up to two ordinal values, a0 and a1 are used to pass up to two pointers, addresses, or references. When compiling with FPU enabled, fp0 and fp1 are used to pass up to two floating point values. The values on the registers are passed from left to right. The remaining arguments are passed on the stack from left to right, like with the stdcall convention.
  • syscall: the syscall convention is operating system specific. It's supported on Amiga, Atari, and PalmOS, and it's documented at their respective pages.

The current default calling convention is register.
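
As a minimal sketch, a calling convention can be selected per routine by appending the directive to its declaration. The routine names below are hypothetical and chosen for illustration only; the register assignments mentioned in the comments simply restate the rules listed above and are not verified against any particular target.

```pascal
program callconv;

{ Hypothetical routines, for illustration only. }

{ stdcall: both parameters are passed on the stack. }
function AddOnStack(a, b: LongInt): LongInt; stdcall;
begin
  AddOnStack := a + b;
end;

{ register: per the rules above, on m68k the two ordinals go to
  d0 and d1, and the pointer goes to a0. }
function AddInRegs(a, b: LongInt; p: PLongInt): LongInt; register;
begin
  AddInRegs := a + b + p^;
end;

var
  extra: LongInt;
begin
  extra := 10;
  writeln(AddOnStack(1, 2));
  writeln(AddInRegs(1, 2, @extra));
end.
```

Since register is the default, the explicit directive on the second routine is only there to make the intent visible.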

Return values

  • 32 bit or smaller ordinal values are returned in register d0
  • Pointers are returned in register a0 or d0 (platform and calling convention specific)
  • 64 bit ordinal values are returned in register pair d0/d1

Unaligned access support

Some members of the m68k CPU family don't support accesses larger than a byte at odd addresses. Most notably, this includes the original 68000, the 68008, the 68010 and some members of the CPU32 subfamily. When targeting these CPUs, the compiler can generate special unaligned access code for such cases. However, this code is slow, because it involves several individual byte accesses plus shifts, so it should be avoided whenever possible. Unaligned access code will be generated for:

  • all locations marked as unaligned (with the unaligned keyword)
  • all code accessing fields of packed records
  • all records which were declared after a {$PACKRECORDS 1} directive (same as using a packed record)

Because of the last two points, using packed records in performance critical code is discouraged on these CPUs.
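
The difference between the two record layouts can be sketched as follows. The type and field names are hypothetical, and the comments about generated code only restate the behaviour described above for 68000-class targets; the program itself compiles for any target.

```pascal
program unalignedrec;

type
  { Natural layout: the compiler may insert padding after Tag
    so that Value starts at an aligned offset. }
  TAligned = record
    Tag: Byte;
    Value: LongInt;
  end;

  { Packed layout: no padding, so Value starts at the odd offset 1. }
  TPacked = packed record
    Tag: Byte;
    Value: LongInt;
  end;

var
  a: TAligned;
  p: TPacked;
begin
  a.Value := 1;  { ordinary 32 bit access }
  p.Value := 1;  { field of a packed record: unaligned access code
                   is generated when targeting e.g. the 68000 }
  writeln('SizeOf(TAligned) = ', SizeOf(TAligned));
  writeln('SizeOf(TPacked)  = ', SizeOf(TPacked));
end.
```

Where the exact memory layout is not dictated by an external format, the unpacked variant trades a little memory for faster field accesses.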

Round/Trunc fast path and special handling

The m68k family and its associated FPUs have no instruction for rounding a floating point value to a 64bit integer, which would be required to fully implement the System unit's Round and Trunc functions as inline nodes. Nevertheless, the compiler contains a fast path that avoids the slow Round/Trunc helper functions when rounding to 8, 16, and 32bit signed integers, or to 8 and 16bit unsigned integers. This fast path is only enabled at optimization level -O4, or with the -OoFASTMATH optimization. It is only enabled as an uncertain optimization because, if the result of the rounding doesn't fit into the target variable, the value will be undefined, whereas with the regular Round/Trunc it is defined and contains the lower bits of the otherwise 64bit result. Please also note that the Round/Trunc fast path needs a hardware FPU and is not available with the SoftFPU.
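
Because an out-of-range result is undefined on the fast path, code that may be built with -O4 or -OoFASTMATH can clamp the input first. The following is only a sketch under that assumption; RoundToLongInt is a hypothetical helper, not part of the RTL, and it does not handle NaN inputs.

```pascal
program roundsafe;

{ Hypothetical helper: clamps x to the LongInt range before rounding,
  so the rounded result always fits into the 32 bit target. }
function RoundToLongInt(x: Double): LongInt;
begin
  if x > High(LongInt) then
    x := High(LongInt)
  else if x < Low(LongInt) then
    x := Low(LongInt);
  RoundToLongInt := Round(x);
end;

begin
  writeln(RoundToLongInt(1234.4));  { in range, rounded normally }
  writeln(RoundToLongInt(1.0e12));  { out of range, clamped first }
end.
```

The clamping costs two comparisons per call, which is usually much cheaper than the 64bit helper functions it allows the compiler to avoid.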

Stack

Because of addressing mode constraints, using more than 32KiB of stack in a single function carries a performance penalty on m68k CPUs, so it is highly advised to stay below that limit. The code generator backend supports stack frames larger than 32KiB for single functions, but the resulting code will always be slower, and it hasn't been tested extensively. Also please note that most m68k systems are low-end machines with a limited amount of memory, and usually with a fixed, limited size stack, where using large amounts of stack can be problematic.
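
One common way to keep stack frames small is to move large local buffers to the heap. The sketch below assumes a hypothetical routine that needs a 64KiB work buffer; the names and the buffer size are made up for illustration.

```pascal
program bigbuffer;

const
  BufSize = 64 * 1024;

type
  TBuffer = array[0..BufSize - 1] of Byte;
  PBuffer = ^TBuffer;

{ Hypothetical routine: the 64KiB buffer is allocated on the heap
  instead of being declared as a local variable, so the stack frame
  of the function only holds a pointer and two integers. }
function SumBuffer: LongInt;
var
  buf: PBuffer;
  i, sum: LongInt;
begin
  New(buf);
  FillChar(buf^, SizeOf(TBuffer), 1);
  sum := 0;
  for i := 0 to BufSize - 1 do
    sum := sum + buf^[i];
  Dispose(buf);
  SumBuffer := sum;
end;

begin
  writeln(SumBuffer);
end.
```

On systems with very little free memory the allocation itself can fail, so real code would need to handle that case.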

Compile time detection of CPU type

To detect the current CPU and FPU targets during compile time, the following defines exist:

  • CPU68, CPU68K, or CPUM68K identifies the m68k target, including ColdFire
  • CPU68000, CPU68020, CPU68040, etc. identify a specific m68k CPU, excluding ColdFire
  • CPUCOLDFIRE identifies a ColdFire
  • CPUISAA, CPUISAB, CPUISAC, etc. identify a specific ColdFire ISA version
  • FPUNONE identifies no FPU
  • FPUSOFT identifies the RTL's SoftFPU
  • FPU68881 identifies a 6888x or user code compatible FPU, including the internal FPU of the 68040 and 68060
  • FPUCOLDFIRE identifies the ColdFire FPU, as included in the ColdFire v4e core
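
These defines are typically checked with conditional compilation. The following minimal sketch uses only the defines listed above; the printed messages are arbitrary, and the program falls back to a generic message when compiled for a non-m68k target.

```pascal
program cpuinfo;

begin
{$IFDEF CPUM68K}
  { CPU type selected at compile time }
  {$IFDEF CPUCOLDFIRE}
  writeln('Compiled for ColdFire');
  {$ELSE}
    {$IFDEF CPU68000}
  writeln('Compiled for the 68000');
    {$ELSE}
  writeln('Compiled for another 680x0 CPU type');
    {$ENDIF}
  {$ENDIF}

  { FPU type selected at compile time }
  {$IFDEF FPUNONE}
  writeln('Floating point: none');
  {$ENDIF}
  {$IFDEF FPUSOFT}
  writeln('Floating point: SoftFPU');
  {$ENDIF}
  {$IFDEF FPU68881}
  writeln('Floating point: 6888x compatible FPU');
  {$ENDIF}
  {$IFDEF FPUCOLDFIRE}
  writeln('Floating point: ColdFire FPU');
  {$ENDIF}
{$ELSE}
  writeln('Not compiled for an m68k target');
{$ENDIF}
end.
```

Note that these checks reflect the compiler's target settings, not the CPU the program eventually runs on.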

Runtime detection of CPU type

The RTL provides two m68k specific variables that allow user code to detect the CPU and FPU at runtime, should that be necessary. These are:

  • Test68000, which is similar to Test8086 on x86. It holds a value from 0 to 6, excluding 5, representing CPU types from 68000 to 68060.
  • Test68881, which is similar to Test8087 on x86. It holds a value from 0 to 6, excluding 5: 0 represents no FPU, 1 and 2 represent a 6888x FPU, and 4 and 6 represent the internal FPU of the 68040 and 68060.

Example:

program hello68k;

begin
  writeln('Hello World!');
  write('Running on 680',Test68000,'0 CPU');
  case Test68881 of
    0: writeln(' without FPU');
    1,2: writeln(' with 6888',Test68881,' FPU');
  else
    writeln(' with internal FPU');
  end;
end.

Notes:

  • this feature is currently only available on Amiga
  • this feature is currently not available on ColdFire CPUs