Description
The CSV (comma-separated values) format is defined by RFC 4180,
"Common Format and MIME Type for Comma-Separated Values (CSV) Files"
(http://www.rfc-editor.org/rfc/rfc4180.txt).
This lazy parser can report all CSV formatting errors, whilst also
returning all the valid data, so the user can choose whether to
continue, to show warnings, or to halt on error.
Valid fields retain information about their original location in the
input, so a secondary parser from textual fields to typed values
can give intelligent error messages.
In a valid CSV file, all rows must have the same number of columns.
This parser will flag a row with the wrong number of columns as an error.
(But the error value contains the actual data, so the user can recover
it if desired.) Completely blank lines are also treated as errors,
and again the user is free either to filter these out or to convert them
to a row of actual null fields.
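The column-count rule can be illustrated independently of this library. The following is a minimal sketch, assuming that the first row fixes the expected width and that a blank line arrives as a row holding one empty field. The names `RowProblem` and `checkRows` are hypothetical and are not part of this module.

```haskell
-- Hypothetical sketch: not this module's API.
-- Classify each row against the width of the first row, keeping the
-- offending data inside the error so the caller can recover it.
data RowProblem
  = WrongWidth Int Int Int [String]  -- row number, expected, actual, data
  | Blank Int                        -- row number of a completely blank line
  deriving (Eq, Show)

checkRows :: [[String]] -> [Either RowProblem [String]]
checkRows []           = []
checkRows rows@(r : _) = zipWith check [1 ..] rows
  where
    expected = length r
    check n row
      | row == [""]            = Left (Blank n)
      | length row == expected = Right row
      | otherwise              = Left (WrongWidth n expected (length row) row)
```

Because the result is produced row by row, a caller can stop at the first `Left`, collect warnings, or keep only the `Right` rows.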
|
|
CSV types
|
|
|
A CSV table is a sequence of rows. All rows have the same number
of fields.
|
|
|
A CSV row is just a sequence of fields.
|
|
|
A CSV field's content is stored with its logical row and column number,
as well as its textual extent. This information is necessary if you
want to generate good error messages in a secondary parsing stage,
should you choose to convert the textual fields to typed data values.
Constructors:
  CSVField
    csvRowNum       :: !Int
    csvColNum       :: !Int
    csvTextStart    :: !(Int, Int)
    csvTextEnd      :: !(Int, Int)
    csvFieldContent :: !String
    csvFieldQuoted  :: !Bool
  CSVFieldError
    csvRowNum       :: !Int
    csvColNum       :: !Int
    csvTextStart    :: !(Int, Int)
    csvTextEnd      :: !(Int, Int)
    csvFieldError   :: !String
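To show why the location information matters, here is a minimal sketch of a second-stage conversion from text to a typed value. It uses a simplified, hypothetical `Field` record rather than this module's own type, and `fieldToInt` is an illustrative name only.

```haskell
import Text.Read (readMaybe)

-- Hypothetical, simplified stand-in for a located field; the record
-- names are illustrative and not this module's own.
data Field = Field
  { fRow     :: Int
  , fCol     :: Int
  , fContent :: String
  } deriving (Eq, Show)

-- Second-stage parse: convert a textual field to an Int, reporting the
-- logical position of the field when the text is not a number.
fieldToInt :: Field -> Either String Int
fieldToInt f =
  case readMaybe (fContent f) of
    Just n  -> Right n
    Nothing -> Left ("row " ++ show (fRow f) ++ ", column " ++ show (fCol f)
                     ++ ": expected an integer, found " ++ show (fContent f))
```

The same pattern would apply to the textual start and end positions, which let a message point at the exact span in the input.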
|
|
|
CSV parsing
|
|
|
A structured error type for CSV formatting mistakes.
Constructors:
  IncorrectRow
    csvRow          :: !Int
    csvColsExpected :: !Int
    csvColsActual   :: !Int
    csvFields       :: [CSVField]
  BlankLine
    csvRow          :: !Int
    csvColsExpected :: !Int
    csvColsActual   :: !Int
    csvField        :: CSVField
  FieldError
  NoData
|
|
|
The result of parsing a CSV input is a mixed collection of errors
and valid rows. This way of representing things is crucial to the
ability to parse lazily whilst still catching format errors.
|
|
|
Extract just the errors from a CSV parse.
|
|
|
Extract just the valid portions of a CSV parse.
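The idea of a mixed result with two extractors can be sketched with a list of `Either` values. This is an assumption about the general shape only: the type `Result` and the functions `errorsOf` and `validOf` are hypothetical names, not this module's definitions.

```haskell
import Data.Either (lefts, rights)

-- Hypothetical sketch of a mixed result: each element is either the
-- errors found at some point in the input or one valid row.
type Result err row = [Either [err] row]

-- All errors, in input order.
errorsOf :: Result err row -> [err]
errorsOf = concat . lefts

-- All valid rows, in input order.
validOf :: Result err row -> [row]
validOf = rights
```

Both extractors are lazy, so `take 3 (validOf r)` inspects only as much of `r` as is needed to find three valid rows, even when `r` is very long or unbounded.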
|
|
|
A first-stage parser for CSV (comma-separated values) data.
The individual fields remain as text, but errors in CSV formatting
are reported. Errors (containing unrecognisable rows/fields) are
interspersed with the valid rows/fields.
|
|
|
Sometimes the input is not comma-separated but uses some other
delimiter: delimiter-separated values (DSV). The choice of delimiter
is arbitrary, but semicolon is common in locales where the comma is
used as a decimal point, and tab is also common. The Boolean argument
specifies whether newlines should be accepted within quoted fields.
RFC 4180 allows newlines inside quotes, but other DSV formats might
not. You can often get better error messages if newlines are
disallowed.
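As an illustration of delimiter handling, the following minimal sketch splits a single line into fields on a chosen delimiter, honouring double-quoted fields. It is not this module's parser: `splitFields` is a hypothetical name, it reports no errors, and it silently ignores any text between a closing quote and the next delimiter.

```haskell
-- Hypothetical sketch, not this module's parser: split one line into
-- fields on a chosen delimiter. A field that starts with a double quote
-- runs to its closing quote; a doubled quote inside stands for one quote.
splitFields :: Char -> String -> [String]
splitFields delim = go
  where
    go s = case s of
      '"' : rest -> let (field, after) = quoted rest
                    in  field : next after
      _          -> let (field, after) = break (== delim) s
                    in  field : next after

    -- Continue after a field: a delimiter means another field follows.
    next (c : rest) | c == delim = go rest
    next _                       = []

    -- Read the body of a quoted field, up to the closing quote.
    quoted ('"' : '"' : rest) = let (f, after) = quoted rest in ('"' : f, after)
    quoted ('"' : rest)       = ("", rest)
    quoted (c : rest)         = let (f, after) = quoted rest in (c : f, after)
    quoted []                 = ("", [])
```

Applying it as `map (splitFields ';') (lines input)` splits on line breaks first, so a newline can never appear inside a quoted field. That corresponds to the case where newlines within quotes are disallowed.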
|
|
Pretty-printing
|
|
|
Some pretty-printing for structured CSV errors.
|
|
|
Pretty-printing for CSV fields. Shows positional information in addition
to the textual content.
|
|
|
Turn a full CSV table back into text, as much like the original
input as possible, e.g. preserving quoted/unquoted format of fields.
|
|
|
Turn a full CSV table back into text, using the given delimiter
character. Quoted/unquoted formatting of the original is preserved.
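Preserving the quoted or unquoted form of each field can be sketched as follows. This is a minimal illustration on a hypothetical `Cell` type (text plus a "was quoted" flag), with `renderCell` and `renderTable` as assumed names. The library's actual output conventions, such as line endings, may differ.

```haskell
import Data.List (intercalate)

-- Hypothetical sketch: a field is its text plus a flag recording whether
-- it was quoted in the original input.
type Cell = (String, Bool)

-- Render one field, re-quoting it (and doubling inner quotes) only if it
-- was quoted originally.
renderCell :: Cell -> String
renderCell (txt, False) = txt
renderCell (txt, True)  = '"' : concatMap escape txt ++ "\""
  where
    escape '"' = "\"\""
    escape c   = [c]

-- Render a table with the given delimiter, one row per line.
renderTable :: Char -> [[Cell]] -> String
renderTable delim = unlines . map (intercalate [delim] . map renderCell)
```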
|
|
Conversion between standard and simple representations
|
|
|
Convert a CSV table to a simpler representation, by dropping all
the original location information.
|
|
|
Convert a simple list of lists into a CSVTable by the addition of
logical locations. (Textual locations are less useful here, since
there is no original input text.)
Rows of varying lengths generate errors. Fields that need
quotation marks are automatically marked as such.
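The conversion from a plain list of lists can be sketched like this. The `Located` record and the functions `needsQuotes` and `locate` are hypothetical, and the quoting rule is an assumption based on RFC 4180 for the comma delimiter, not a statement of what this module checks.

```haskell
-- Hypothetical sketch: attach logical row and column numbers to plain
-- strings, and mark fields that would need quoting when written out.
data Located = Located
  { locRow    :: Int
  , locCol    :: Int
  , locText   :: String
  , locQuoted :: Bool
  } deriving (Eq, Show)

-- A field needs quotes if it contains a comma, a quote, or a line break
-- (an assumption based on RFC 4180, for the comma delimiter).
needsQuotes :: String -> Bool
needsQuotes = any (`elem` ",\"\r\n")

-- Number rows and columns from 1; also return the numbers of rows whose
-- length differs from that of the first row.
locate :: [[String]] -> ([Int], [[Located]])
locate rows = (badRows, zipWith locateRow [1 ..] rows)
  where
    width = case rows of
      []      -> 0
      (r : _) -> length r
    badRows = [n | (n, r) <- zip [1 ..] rows, length r /= width]
    locateRow n = zipWith (\c t -> Located n c t (needsQuotes t)) [1 ..]
```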
|
|
Selection, validation, and algebra of CSV tables
|
|
|
Select and/or re-arrange columns from a CSV table, based on names in the
header row of the table. The original header row is re-arranged too.
The result is either a list of column names that were not present, or
the (possibly re-arranged) sub-table.
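Selection by header name can be sketched on plain strings. The function `selectColumns` below is a hypothetical illustration, not this module's definition, and it assumes that every row is at least as wide as the header row.

```haskell
import Data.List (elemIndex)

-- Hypothetical sketch on plain strings: pick columns by header name, in
-- the order requested. Left lists the names missing from the header row.
-- Assumes every row is at least as wide as the header row.
selectColumns :: [String] -> [[String]] -> Either [String] [[String]]
selectColumns names table
  | null missing = Right (map pick table)
  | otherwise    = Left missing
  where
    header  = case table of
      []      -> []
      (h : _) -> h
    lookups = [(name, elemIndex name header) | name <- names]
    missing = [name | (name, Nothing) <- lookups]
    indices = [i | (_, Just i) <- lookups]
    pick row = [row !! i | i <- indices]
```

Because the header row is passed through `pick` like any other row, it is re-arranged together with the data.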
|
|
|
Validate that the named columns of a table have exactly the names and
ordering given in the argument.
|
|
|
A generator for a new CSV column, of arbitrary length.
The result can be joined to an existing table if desired.
|
|
|
A join operator: it places the columns of two tables side by side.
Precondition: the tables have the same number of rows.
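A column generator and a join can be sketched together on plain strings. Both names are hypothetical, and reading "arbitrary length" as an endless column that is cut to fit on joining is an assumption of this sketch, not something the description confirms.

```haskell
-- Hypothetical sketch on plain strings, not this module's API.
-- An endless column: a header followed by empty fields.
emptyColumn :: String -> [[String]]
emptyColumn header = [header] : repeat [""]

-- Join two tables side by side, row by row. With zipWith the result
-- stops at the shorter table, so an endless column is cut to fit.
joinTables :: [[String]] -> [[String]] -> [[String]]
joinTables = zipWith (++)
```

Note that `zipWith` silently drops surplus rows when the two tables differ in length, which is why the equal-row-count precondition matters for finite tables.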
|
|
Produced by Haddock version 2.4.2