CSV Built-ins ¶

CSV Built-ins provide functionality for parsing CSV files and accessing content. They provide three parsing methods: filePath, charset, and csvFormat.

Syntax

// Parses CSV file using given charset and csv format
<parsedResult CsvParser> := Csv.parse(<filePath String>, <charset String>, <csvFormat CSVFormat>)

// Parses CSV file using default charset and given csv format
<parsedResult CsvParser> := Csv.parse(<filePath String>, <csvFormat CSVFormat>)

// Parses CSV file using default charset and format
<parsedResult CsvParser> := Csv.parse(<filePath String>)

Parameters

parsedResult - The parsed CSV file
filePath - The CSV file's filePath
charset - The given charset
csvFormat - Specifies the CSV format

The CsvParser type also has the follow functions:

close() - Closes resources
<headerMap Map<String, Number>> := getHeaderMap() - Returns a copy of the header map that iterates in column order
<records List<CSVRecord>> := getRecords() - Parses the CSV input according to the given format and returns the content as a list of csvRecords

Example

stepdef "parse csv"

    format := Csv.getExcelFormat().withDelimiter(";")
    parser := Csv.parse("/Users/mycomputer/Desktop/Functions.csv", format)
    records := parser.getRecords()
    for record in records
        println("0: " + record.get(0) + "1: " + record.get(1))
        println("0: " + record.get("col0") + "1: " + record.get("col1"))
    end
end

CSV Format¶

The CSV format can be retrieved and modified using default formats or created with the newFormat function.

Get CSV Format¶

The get CSV Format functions assign a CSV specific format. Settings can be chained to these functions or defined separately to modify the format (for example, changing the default delimiters or quote characters).

Syntax

<defaultFormat CsvFormat> := Csv.getDefaultFormat()
<excelFormat CsvFormat>   := Csv.getExcelFormat()
<mySQLFormat CsvFormat>   := Csv.getMySQLFormat()
<RFC4180Format CsvFormat> := Csv.getRFC4180Format()
<TDFFormat CsvFormat>     := Csv.getTDFFormat()

Parameters

defaultFormat - The default CSV format
excelFormat - The Excel CSV format
mySQLFormat - The MySQL CSV format
RFC4180Format -The RFC4180 CSV format
TDFFFormat* - The TDFF CSV format

Example

1	excelFormat = Csv.getExcelFormat()

Creating a New Format¶

Csv.newFormat can be used instead of a getFormat function to create a new CSV format by specifying required settings.

Syntax

1	Csv.newFormat(<delimiter String>)

Parameter

delimiter - The delimiter applied to the new format

Example

1	newformat = Csv.newFormat(";")

Customizing the CSV Format¶

CSV formats can be customized by defining additional settings. If no setting is applied, the default is used.

Syntax

<format Format>
    .withCommentMarker(<commentMarker String>)
    .withCommentMarkerDisabled()
    .withDelimiter(<delimiterCharacter String>)
    .withEscape(<escapeCharacter String>)
    .withEscapeDisabled()
    .withHeader(<listOfHeaders List>)
    .withHeaderDisabled()
    .withAllowMissingColumnNames(<allowMissingColumnNames Boolean>)
    .withIgnoreEmptyLines(<ignoreEmptyLines Boolean>)
    .withIgnoreSurroundingSpaces(<ignoreSurroundingSpaces Boolean>)
    .withNullString(<nullString String>)
    .withQuote(<quoteCharacter String>)
    .withQuoteDisabled()
    .withSkipHeaderRecord(<skipHeaderRecord Boolean>)

Parameters

format - The CSV Format
commentMarker - The comment marker
- withCommentMarkerDisabled() - Disables the format's comment marker
delimiterCharacter - The format's delimiter
escapeCharacter - The format's escape character
- withEscapeDisabled() - Disables the format's escape character
listOfHeaders - The format's header
- withHeader([]) - Instructs the parser to read the first row of a CSV file as a header if it exists With this format, the column names can be used to access a record's values
- withHeaderDisabled() - Disables the format's header
allowMissingColumnNames -
- true if missing column names in the Header line, are allowed
- false otherwise
ignoreEmptyLines -
- true if empty lines must be ignored
- false otherwise
ignoreSurroundingSpaces -
- true if surrounding spaces are removed
- false otherwise
nullString - The null string
quoteCharacter - The format's quote character withQuoteDisabled() - Disables the format's quote character
skipHeaderRecord -
- true if header record is skipped
- false otherwise

Modifier Getters¶

Modifier Getters return values from CSV format.

Syntax

<commentMarker String>              := <format Format>.getCommentMarker()
<delimiter String>                  := <format Format>.getDelimiter()
<escapeCharacter String>            := <format Format>.getEscapeCharacter()
<headerList List<String>>           := <format Format>.getHeader()
<allowMissingColumnNames Boolean>   := <format Format>.getAllowMissingColumnNames()
<ignoreEmptyLines Boolean>          := <format Format>.getIgnoreEmptyLines()
<ignoreSurroundingSpaces Boolean>   := <format Format>.getIgnoreSurroundingSpaces()
<nullString String>                 := <format Format>.getNullString()
<quoteCharacter String>             := <format Format>.getQuoteCharacter()
<skipHeaderRecord Boolean>          := <format Format>.getSkipHeaderRecord()

Parameters

format - An instance of the format
commentMarker - The CSV format's comment marker
delimiter - The CSV format's delimiter
escapeCharacter - The CSV format's escape character
headerList - The list of headers
allowMissingColumnNames
- true if missing column names are allowed
- false otherwise
ignoreEmptyLines
- true if empty lines are ignored
- false otherwise
ignoreSurroundingSpaces
- true if surrounding spaces are ignored
- false otherwise
getNullString - The null string
quoteCharacter - The quote character
skipHeaderRecord -
- true if the header record is skipped
- false otherwise

Default Settings for CSV Formats¶

Settings and Defaults for getDefaultFormat()

withDelimiter(",")
withQuote(""")
withIgnoreEmptyLines(true)

Settings and Defaults for getExcelFormat()

withDelimiter(",")
withQuote('""')
withIgnoreEmptyLines(false)
withAllowMissingColumnNames(true)

Settings and Defaults for getMySQLFormat()

withDelimiter(" ") (tab character)
withQuoteDisabled()
withIgnoreEmptyLines(false)
withEscape("\")

Note: getMySQLFormat() gets a default MySQL format used by the SELECT INTO OUTFILE and LOAD DATA INFILE operations. This is a tab-delimited format with an LF character as the line separator. Values are not quoted.

Settings and Defaults for getRFC4180Format()

withDelimiter(",")
withQuote('""')
withIgnoreEmptyLines(false)

getRFC4180Format() gets a comma separated format as defined by RFC 4180. Settings are: Settings and Defaults for getTDFFormat (tab delimited)

withDelimiter(" ") (tab character)
withQuote('""')
withIgnoreSurroundingSpaces(true)

CSV Record Methods¶

The CSV record methods correspond to a CSV row parsed from a CSV file.

Syntax

<indexValue String>             := csvRecord.get(<index Number>)
<nameValue String>              := csvRecord.get(<columnName String>)
<comment String>                := csvRecord.getComment()
<recordNumber Number>           := csvRecord.getRecordNumber()
<isConsistent Boolean>          := csvRecord.isConsistent()
<isMapped Boolean>              := csvRecord.isMapped(<columnName String>)
<isSet Boolean>                 := csvRecord.isSet(<columnName String>)
<numValues Number>              := csvRecord.size()
<valueMap Map<String, String>>  := csvRecord.toMap()

Parameters

index - The column index (0-based)
columnName - The column name
indexValue - The returned index value
nameValue - The returned name value
comment - The returned comment
recordNumber - The number returned for the parsed CSV file's record
- If the CSV input has multi-line values, the returned number does not correspond to the current line number of the parser that created this record
isConsistent
- true if the column size and record match
- false otherwise
- Some programs can export files that will resolve to false but still produce parseable files
isMapped
- true if a given column is mapped
- false otherwise
isSet
- true if a given column is mapped and has a value
- false for a column that is not mapped
numValues - The number of values in the record
valueMap - The new map of the copied record

Example

records := parser.getRecords()
for record in records
    println("0: " + record.get(0) + "1: " + record.get(1))
    println("0: " + record.get("col0") + "1: " + record.get("col1"))
end