Difference between revisions of "UTF8 Tools"

From Lazarus wiki

Revision as of 12:17, 9 July 2009

About

Sharing some of my code


UTF-8 Tools

Purpose

Some tools for common problems with UTF-8 / Unicode.

  • charencstreams.pas: Load and save text in almost any common encoding:
    • ANSI, UTF-8, UTF-16, UTF-32
    • big or little endian
    • with or without a BOM

Simple demo:

fCES := TCharEncStream.Create;
try
  fCES.LoadFromFile(OpenDialog1.FileName);
  Memo1.Text := fCES.UTF8Text;  // file content as UTF-8
finally
  fCES.Free;  // release the stream even if loading fails
end;
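
To illustrate what "BOM or no BOM" involves, here is a minimal Free Pascal sketch of byte order mark detection. It is not taken from charencstreams.pas and makes no claim about how TCharEncStream works internally; the names TBomKind and DetectBom are hypothetical and exist only for this example. The byte patterns themselves are the standard Unicode BOM signatures.

```pascal
program BomSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical enumeration for this sketch; not part of utf8tools.
  TBomKind = (bkNone, bkUTF8, bkUTF16LE, bkUTF16BE, bkUTF32LE, bkUTF32BE);

// Inspects the first bytes of B and reports which BOM, if any, is present.
function DetectBom(const B: array of Byte): TBomKind;
var
  N: Integer;
begin
  N := Length(B);
  // UTF-32 LE is tested before UTF-16 LE because FF FE 00 00
  // begins with the UTF-16 LE signature FF FE.
  if (N >= 4) and (B[0] = $FF) and (B[1] = $FE) and (B[2] = 0) and (B[3] = 0) then
    Result := bkUTF32LE
  else if (N >= 4) and (B[0] = 0) and (B[1] = 0) and (B[2] = $FE) and (B[3] = $FF) then
    Result := bkUTF32BE
  else if (N >= 3) and (B[0] = $EF) and (B[1] = $BB) and (B[2] = $BF) then
    Result := bkUTF8
  else if (N >= 2) and (B[0] = $FF) and (B[1] = $FE) then
    Result := bkUTF16LE
  else if (N >= 2) and (B[0] = $FE) and (B[1] = $FF) then
    Result := bkUTF16BE
  else
    Result := bkNone;  // no BOM: the encoding has to be guessed or supplied
end;

begin
  WriteLn(DetectBom([$EF, $BB, $BF, $41]) = bkUTF8);
  WriteLn(DetectBom([$FE, $FF, $00, $41]) = bkUTF16BE);
  WriteLn(DetectBom([$41, $42]) = bkNone);
end.
```

When no BOM is found, the encoding cannot be read from the first bytes alone, which is the harder half of the problem and is not covered by this sketch.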
  • character.pas: Get information about code points using the TCharacter class.

Demo

if TCharacter.IsLetter(s[i]) then s[i] := TCharacter.toLower(s[i]);
  • utf8scanner.pas: Access UTF-8 strings by code point index, use case statements on UTF-8 strings and more...

Index demo

s := TUTF8Scanner.Create(Memo1.Text);
try
  // lower-case every letter, addressing the text by code point index
  for i := 1 to s.Length do
    if TCharacter.IsLetter(s[i]) then
      s[i] := TCharacter.ToLower(s[i]);
  Memo1.Text := s.UTF8String;
finally
  s.Free;
end;
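
Indexing by code point matters because a UTF-8 code point occupies one to four bytes, so a plain byte index can land in the middle of a character. The following Free Pascal sketch shows the general idea using only the rule that continuation bytes have the bit pattern 10xxxxxx. It is an assumption-laden illustration, not the implementation of TUTF8Scanner; Utf8CodePointCount and Utf8CodePointAt are hypothetical helper names, and the input is assumed to be well-formed UTF-8.

```pascal
program Utf8IndexSketch;
{$mode objfpc}{$H+}

// Counts code points: every byte that is not a continuation
// byte (10xxxxxx) starts a new code point.
function Utf8CodePointCount(const S: RawByteString): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

// Returns the bytes of the Index-th code point (1-based),
// or an empty string if Index is out of range.
function Utf8CodePointAt(const S: RawByteString; Index: Integer): RawByteString;
var
  i, Seen, Start: Integer;
begin
  Result := '';
  if Index < 1 then
    Exit;
  Seen := 0;
  Start := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
    begin
      Inc(Seen);
      if Seen = Index then
        Start := i                       // first byte of the wanted code point
      else if Seen = Index + 1 then
      begin
        Result := Copy(S, Start, i - Start);  // stop at the next lead byte
        Exit;
      end;
    end;
  if Start > 0 then
    Result := Copy(S, Start, Length(S) - Start + 1);  // last code point
end;

var
  Sample: RawByteString;
begin
  // 'a', U+00F6 (bytes C3 B6), 'b': four bytes, three code points
  Sample := 'a'#$C3#$B6'b';
  WriteLn(Length(Sample));
  WriteLn(Utf8CodePointCount(Sample));
  WriteLn(Length(Utf8CodePointAt(Sample, 2)));
end.
```

Each lookup here scans from the start of the string, so a loop over all indices is quadratic; a scanner class can avoid that by caching offsets, though whether TUTF8Scanner does so is not stated on this page.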

Case demo

 s := TUTF8Scanner.Create(Memo1.Text);
 try
   s.FindChars := 'öäü';
   repeat
     // FindIndex gives the position of the current character in FindChars
     case s.FindIndex(s.Next) of
       {ö} 0: s.Replace('oe');
       {ä} 1: s.Replace('ae');
       {ü} 2: s.Replace('ue');
     end;
   until s.Done;
   Memo1.Text := s.UTF8String;
 finally
   s.Free;
 end;

Download

Download utf8tools.zip (http://www.theo.ch/lazarus/utf8tools.zip)