Difference between revisions of "Character and string types"

From Lazarus wiki
(→‎WideChar: utf16 cannot encode all unicode code points in 1 2byte unit)
Line 15:
  == WideChar ==
- A variable of type '''WideChar''' is exactly 2 bytes in size, and contains one [[LCL Unicode Support|Unicode]] character in UTF-16 encoding.
+ A variable of type '''WideChar''' is exactly 2 bytes in size, and contains one (part of) [[LCL Unicode Support|Unicode]] character in UTF-16 encoding.
+ Note: it is impossible to encode all Unicode code points in 2 bytes. Therefore, 2 WideChars may be needed to encode a single code point.
  {| class="wikitable" style="text-align:center; width:50px"
Line 24 (old), Line 25 (new):
  ==== Reference ====
  * [http://www.freepascal.org/docs-html/ref/refsu8.html FPC WideChar documentation]
+ * [https://en.wikipedia.org/wiki/UTF-16]
  == PChar ==

Revision as of 16:31, 28 December 2013

Free Pascal supports several types of characters and strings.

AnsiChar

A variable of type AnsiChar is exactly 1 byte in size and contains one 8-bit character, for example an ASCII character.

a
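
The size claim can be checked directly. The following is a minimal sketch, not taken from the original page; the program name and variable name are made up for illustration.

```pascal
program AnsiCharDemo;
{$mode objfpc}{$H+}

var
  c: AnsiChar;   // hypothetical variable, for illustration only
begin
  c := 'a';
  WriteLn(SizeOf(c));  // prints 1: an AnsiChar occupies one byte
  WriteLn(Ord(c));     // prints 97: the byte value stored for 'a'
end.
```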

Reference

WideChar

A variable of type WideChar is exactly 2 bytes in size and contains one UTF-16 code unit, which is either a complete Unicode character or one half of a surrogate pair. Note: not every Unicode code point fits in 2 bytes, so a code point above U+FFFF needs 2 WideChars (a surrogate pair) to be encoded.

a
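
To illustrate why two WideChars may be needed, the sketch below stores the code point U+1D11E (MUSICAL SYMBOL G CLEF), which lies outside the 16-bit range, as its two UTF-16 surrogate halves. The program and variable names are assumptions made for this example only.

```pascal
program WideCharDemo;
{$mode objfpc}{$H+}

var
  // Hypothetical buffer holding one code point as a surrogate pair.
  pair: array[0..1] of WideChar;
begin
  WriteLn(SizeOf(WideChar));   // prints 2: one UTF-16 code unit

  // U+1D11E does not fit in 16 bits, so UTF-16 encodes it as
  // the high surrogate $D834 followed by the low surrogate $DD1E.
  pair[0] := WideChar($D834);
  pair[1] := WideChar($DD1E);

  WriteLn(SizeOf(pair));       // prints 4: two WideChars for one code point
end.
```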

Reference

PChar

A variable of type PChar is a pointer to a Char type, but allows additional operations such as indexing and pointer arithmetic. PChars can be used to access C-style null-terminated strings, e.g. when interacting with certain OS libraries or third-party software.

a b c #0
^
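
A minimal sketch of walking a null-terminated string through a PChar follows. It assumes {$H+}, so that string literals assigned to an AnsiString can be typecast to PChar; the names used are illustrative and not part of the original page.

```pascal
program PCharDemo;
{$mode objfpc}{$H+}

var
  s: AnsiString;
  p: PChar;
  n: Integer;
begin
  s := 'abc';
  p := PChar(s);        // p points at 'a'; the buffer ends with #0

  WriteLn(p^);          // prints a
  Inc(p);               // pointer arithmetic: advance by one Char
  WriteLn(p^);          // prints b

  // Count characters up to the terminating #0, starting over at 'a'.
  p := PChar(s);
  n := 0;
  while p[n] <> #0 do
    Inc(n);
  WriteLn(n);           // prints 3
end.
```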

Reference

PWideChar

A variable of type PWideChar is a pointer to a WideChar variable.

a b c #0 #0
^
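
The same traversal works for wide characters. In the sketch below the terminator is a single WideChar #0, which occupies two bytes in memory. The buffer and variable names are assumptions for illustration.

```pascal
program PWideCharDemo;
{$mode objfpc}{$H+}

var
  buf: array[0..3] of WideChar;  // hypothetical buffer: 'a', 'b', 'c', #0
  p: PWideChar;
  n: Integer;
begin
  buf[0] := 'a';
  buf[1] := 'b';
  buf[2] := 'c';
  buf[3] := #0;                  // one WideChar terminator = 2 zero bytes

  p := @buf[0];
  n := 0;
  while p[n] <> #0 do            // each step moves by one WideChar (2 bytes)
    Inc(n);

  WriteLn(n);                    // prints 3
  WriteLn(SizeOf(buf));          // prints 8: four WideChars of 2 bytes each
end.
```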

Reference

String

The type string may refer to ShortString or AnsiString, depending on the {$H} switch. If the switch is off ({$H-}), any string declaration defines a ShortString. If it is on ({$H+}), string without a length specifier defines an AnsiString, while string with a length specifier still defines a ShortString of the specified length.
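
The effect of the switch can be observed through the size of the resulting types. This is a minimal sketch; the type names TShort, TLong and TFixed are hypothetical and exist only for this example.

```pascal
program StringSwitchDemo;
{$mode objfpc}

{$H-}
type
  TShort = string;       // with {$H-}: a ShortString

{$H+}
type
  TLong  = string;       // with {$H+}: an AnsiString
  TFixed = string[10];   // length specifier: a ShortString of length 10

begin
  WriteLn(SizeOf(TShort));                   // prints 256: length byte + 255 chars
  WriteLn(SizeOf(TLong) = SizeOf(Pointer));  // prints TRUE: a reference to heap data
  WriteLn(SizeOf(TFixed));                   // prints 11: length byte + 10 chars
end.
```

Because {$H} is a local switch, it can be changed between declarations in the same source file, as done here.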