CSV

From Lazarus wiki
Revision as of 12:29, 16 November 2012 by BigChimp

Overview

CSV means Comma-Separated Values, and is a popular file format that is unfortunately not completely standardized. It is a text-based file format with data fields separated by commas (or, in variants, other characters like tabs and semicolons). There may or may not be a header line that lists the field names. Field data containing delimiters may be prohibited, or may have to be enclosed in quotes (most commonly the double quote character). Line endings (#13 and/or #10) may or may not be allowed in field data.

RFC4180 (see [www.rfc-editor.org/rfc/rfc4180.txt]) tries to codify and standardize existing practice. It makes sense to conform to this standard when writing CSV data, and to accept all RFC4180-conformant data when reading. Another, different, specification can be found at [1].
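
As a rough illustration of what conforming to RFC4180 means when writing a single field: the field is enclosed in double quotes if it contains a comma, a double quote or a line break, and any embedded double quote is doubled. This is only a minimal sketch; the function name CsvQuote is hypothetical and not part of any FPC library.

```pascal
program QuoteCsvField;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: returns S in a form that is safe to write as one
// CSV field under the RFC4180 quoting rules.
function CsvQuote(const S: string): string;
begin
  if (Pos(',', S) > 0) or (Pos('"', S) > 0) or
     (Pos(#13, S) > 0) or (Pos(#10, S) > 0) then
    // Enclose in double quotes and double every embedded double quote
    Result := '"' + StringReplace(S, '"', '""', [rfReplaceAll]) + '"'
  else
    Result := S;  // nothing special in the field: write it unchanged
end;

begin
  // Prints: "Weston, Jim","said ""hi""",plain
  WriteLn(CsvQuote('Weston, Jim'), ',', CsvQuote('said "hi"'), ',',
          CsvQuote('plain'));
end.
```

Note that RFC4180 also specifies CRLF as the record separator, which this sketch leaves to the calling code.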

A sample CSV snippet:

FirstName,Surname,DOB,Remarks
Jim,Weston,19560818,"Also known as ""The Butcher"""
Alice,Cooper,19760312,""

This snippet uses a header line and encloses some fields in double quotes. A double quote inside a quoted field is written as two double quotes.
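
The quoting rules seen in the snippet can be sketched as a small hand-written splitter. This is an illustration only, under the assumptions that the delimiter is a comma, the quote character is a double quote, and a record never spans more than one line. The procedure name SplitCsv is hypothetical.

```pascal
program SplitCsvLine;
{$mode objfpc}{$H+}
uses
  Classes;

// Hypothetical helper: splits one CSV record into fields.
// Assumes comma delimiters, optional double-quote enclosure, and ""
// inside a quoted field standing for one literal double quote.
procedure SplitCsv(const Line: string; Fields: TStrings);
var
  i: Integer;
  InQuotes: Boolean;
  Field: string;
begin
  Fields.Clear;
  Field := '';
  InQuotes := False;
  i := 1;
  while i <= Length(Line) do
  begin
    if InQuotes then
    begin
      if Line[i] = '"' then
      begin
        if (i < Length(Line)) and (Line[i + 1] = '"') then
        begin
          Field := Field + '"';  // doubled quote: one literal quote
          Inc(i);                // skip the second quote of the pair
        end
        else
          InQuotes := False;     // closing quote of the field
      end
      else
        Field := Field + Line[i];
    end
    else if Line[i] = '"' then
      InQuotes := True           // opening quote of the field
    else if Line[i] = ',' then
    begin
      Fields.Add(Field);         // delimiter outside quotes ends the field
      Field := '';
    end
    else
      Field := Field + Line[i];
    Inc(i);
  end;
  Fields.Add(Field);             // last field of the record
end;

var
  Fields: TStringList;
  i: Integer;
begin
  Fields := TStringList.Create;
  try
    SplitCsv('Jim,Weston,19560818,"Also known as ""The Butcher"""', Fields);
    // Prints four lines: [Jim] [Weston] [19560818]
    // [Also known as "The Butcher"]
    for i := 0 to Fields.Count - 1 do
      WriteLn('[', Fields[i], ']');
  finally
    Fields.Free;
  end;
end.
```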

Spreadsheet packages such as Microsoft Excel and OpenOffice/LibreOffice Calc are able to export to and import from this format. However, as Microsoft Excel may interpret some fields, such as date fields, differently depending on the user's OS locale, it may pay to find alternative ways of transferring data (e.g. using the FPSpreadsheet code).

CSV and SDF

Delphi (and FreePascal) have a very similar format, SDF. See SDF for more details.

Implementations

DelimitedText

TStringList offers the DelimitedText property, which parses a line of text into separate fields. Note, however, that DelimitedText is supposed to be in SDF, a Delphi-specific format that is very much like CSV but does not completely conform to RFC4180.

Tips: when reading CSV data, set the StrictDelimiter property to true; otherwise spaces are also treated as delimiters.
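
A minimal sketch of reading one CSV line through DelimitedText follows. The sample line is taken from the snippet in the overview. Delimiter, QuoteChar and StrictDelimiter are set before DelimitedText is assigned, because the assignment triggers the parsing.

```pascal
program ReadDelimitedText;
{$mode objfpc}{$H+}
uses
  Classes;
var
  Fields: TStringList;
  i: Integer;
begin
  Fields := TStringList.Create;
  try
    Fields.Delimiter := ',';
    Fields.QuoteChar := '"';
    Fields.StrictDelimiter := True;  // only ',' separates fields
    // Assigning DelimitedText parses the line into separate items
    Fields.DelimitedText :=
      'Jim,Weston,19560818,"Also known as ""The Butcher"""';
    for i := 0 to Fields.Count - 1 do
      WriteLn(i, ': ', Fields[i]);
  finally
    Fields.Free;
  end;
end.
```

This handles one line at a time; field data containing line endings would need additional handling before the text reaches DelimitedText.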

When writing out CSV data, set StrictDelimiter to false and output the DelimitedText property. One oddity is that, for example, tab characters are removed when writing out data with StrictDelimiter:=false.
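
A minimal sketch of the writing direction, assuming a comma delimiter and double-quote quoting: the field values are added to a TStringList and DelimitedText is read back as one output line. Which fields end up quoted is decided by TStringList itself and may differ between FPC versions, so no output is shown here.

```pascal
program WriteDelimitedText;
{$mode objfpc}{$H+}
uses
  Classes;
var
  Fields: TStringList;
begin
  Fields := TStringList.Create;
  try
    Fields.Delimiter := ',';
    Fields.QuoteChar := '"';
    Fields.StrictDelimiter := False;  // as suggested for writing
    Fields.Add('Jim');
    Fields.Add('Weston');
    Fields.Add('Also known as "The Butcher"');
    // Reading DelimitedText joins the items into one delimited line
    WriteLn(Fields.DelimitedText);
  finally
    Fields.Free;
  end;
end.
```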

SDF format

See SDF#SDF format

SDFDataset

FreePascal offers the SDFDataset, which stores data in SDF format.

As indicated, SDF differs from CSV. Depending on the flavour of CSV that a reading application expects, SDF output may be close enough to work.
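
The following is a hedged sketch of reading a comma-delimited file through TSdfDataSet from the sdfdata unit. The file name people.csv and its contents are made up for the example; the program first writes the file so that it is self-contained. Whether the dataset accepts a given CSV flavour depends on the SDF rules mentioned above.

```pascal
program ReadSdfDemo;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils, db, sdfdata;
var
  Lines: TStringList;
  DS: TSdfDataSet;
begin
  // Create a small sample file (hypothetical name and data)
  Lines := TStringList.Create;
  try
    Lines.Add('FirstName,Surname,DOB');
    Lines.Add('Jim,Weston,19560818');
    Lines.Add('Alice,Cooper,19760312');
    Lines.SaveToFile('people.csv');
  finally
    Lines.Free;
  end;

  DS := TSdfDataSet.Create(nil);
  try
    DS.FileName := 'people.csv';
    DS.Delimiter := ',';
    DS.FirstLineAsSchema := True;  // take field names from the header line
    DS.Open;
    while not DS.EOF do
    begin
      WriteLn(DS.FieldByName('FirstName').AsString, ' ',
              DS.FieldByName('Surname').AsString);
      DS.Next;
    end;
    DS.Close;
  finally
    DS.Free;
  end;
end.
```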

Warning: SDFDataset will likely not work on at least ARM-based Windows CE/Windows Mobile; see http://bugs.freepascal.org/view.php?id=17871

Note: Nov 2012: SDFDataset will be (but has not yet been) redefined to use RFC4180 CSV format. See e.g. http://bugs.freepascal.org/view.php?id=22980

Data Export

The FreePascal/Lazarus database export components (e.g. TCSVExporter on the Data Export tab) can export datasets to CSV.

CsvDocument

See CsvDocument.

Jan's CSV components

See JCSV_(Jans_CSV_Components).

ZMSQL

ZMSQL stores data in semicolon-delimited files (using SDF?).