CSV Built-ins ¶
CSV Built-ins provide functionality for parsing CSV files and accessing content. They provide three parsing methods: filePath, charset, and csvFormat.
Syntax
1 2 3 4 5 6 7 8 | // Parses CSV file using given charset and csv format <parsedResult CsvParser> := Csv.parse(<filePath String>, <charset String>, <csvFormat CSVFormat>) // Parses CSV file using default charset and given csv format <parsedResult CsvParser> := Csv.parse(<filePath String>, <csvFormat CSVFormat>) // Parses CSV file using default charset and format <parsedResult CsvParser> := Csv.parse(<filePath String>) |
Parameters
-
parsedResult - The parsed CSV file
-
filePath - The CSV file's filePath
-
charset - The given charset
-
csvFormat - Specifies the CSV format
The CsvParser
type also has the follow functions:
-
close() - Closes resources
-
<headerMap Map<String, Number>> := getHeaderMap() - Returns a copy of the header map that iterates in column order
-
<records List<CSVRecord>> := getRecords() - Parses the CSV input according to the given format and returns the content as a list of
csvRecords
Example
1 2 3 4 5 6 7 8 9 10 | stepdef "parse csv" format := Csv.getExcelFormat().withDelimiter(";") parser := Csv.parse("/Users/mycomputer/Desktop/Functions.csv", format) records := parser.getRecords() for record in records println("0: " + record.get(0) + "1: " + record.get(1)) println("0: " + record.get("col0") + "1: " + record.get("col1")) end end |
CSV Format¶
The CSV format can be retrieved and modified using default formats or created with the newFormat
function.
Get CSV Format¶
The get CSV Format functions assign a CSV specific format. Settings can be chained to these functions or defined separately to modify the format (for example, changing the default delimiters or quote characters).
Syntax
1 2 3 4 5 | <defaultFormat CsvFormat> := Csv.getDefaultFormat() <excelFormat CsvFormat> := Csv.getExcelFormat() <mySQLFormat CsvFormat> := Csv.getMySQLFormat() <RFC4180Format CsvFormat> := Csv.getRFC4180Format() <TDFFormat CsvFormat> := Csv.getTDFFormat() |
Parameters
-
defaultFormat - The default CSV format
-
excelFormat - The Excel CSV format
-
mySQLFormat - The MySQL CSV format
-
RFC4180Format -The RFC4180 CSV format
-
TDFFFormat* - The TDFF CSV format
Example
1 | excelFormat = Csv.getExcelFormat() |
Creating a New Format¶
Csv.newFormat
can be used instead of a getFormat
function to create a new CSV format by specifying required settings.
Syntax
1 | Csv.newFormat(<delimiter String>) |
Parameter
- delimiter - The delimiter applied to the new format
Example
1 | newformat = Csv.newFormat(";") |
Customizing the CSV Format¶
CSV formats can be customized by defining additional settings. If no setting is applied, the default is used.
Syntax
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | <format Format> .withCommentMarker(<commentMarker String>) .withCommentMarkerDisabled() .withDelimiter(<delimiterCharacter String>) .withEscape(<escapeCharacter String>) .withEscapeDisabled() .withHeader(<listOfHeaders List>) .withHeaderDisabled() .withAllowMissingColumnNames(<allowMissingColumnNames Boolean>) .withIgnoreEmptyLines(<ignoreEmptyLines Boolean>) .withIgnoreSurroundingSpaces(<ignoreSurroundingSpaces Boolean>) .withNullString(<nullString String>) .withQuote(<quoteCharacter String>) .withQuoteDisabled() .withSkipHeaderRecord(<skipHeaderRecord Boolean>) |
Parameters
-
format - The CSV Format
-
commentMarker - The comment marker
- withCommentMarkerDisabled() - Disables the format's comment marker
-
delimiterCharacter - The format's delimiter
-
escapeCharacter - The format's escape character
- withEscapeDisabled() - Disables the format's escape character
-
listOfHeaders - The format's header
withHeader([])
- Instructs the parser to read the first row of a CSV file as a header if it exists With this format, the column names can be used to access a record's values- withHeaderDisabled() - Disables the format's header
-
allowMissingColumnNames -
true
if missing column names in the Header line, are allowedfalse
otherwise
-
ignoreEmptyLines -
true
if empty lines must be ignoredfalse
otherwise
-
ignoreSurroundingSpaces -
true
if surrounding spaces are removedfalse
otherwise
-
nullString - The null string
-
quoteCharacter - The format's quote character withQuoteDisabled() - Disables the format's quote character
-
skipHeaderRecord -
true
if header record is skippedfalse
otherwise
Modifier Getters¶
Modifier Getters return values from CSV format.
Syntax
1 2 3 4 5 6 7 8 9 10 | <commentMarker String> := <format Format>.getCommentMarker() <delimiter String> := <format Format>.getDelimiter() <escapeCharacter String> := <format Format>.getEscapeCharacter() <headerList List<String>> := <format Format>.getHeader() <allowMissingColumnNames Boolean> := <format Format>.getAllowMissingColumnNames() <ignoreEmptyLines Boolean> := <format Format>.getIgnoreEmptyLines() <ignoreSurroundingSpaces Boolean> := <format Format>.getIgnoreSurroundingSpaces() <nullString String> := <format Format>.getNullString() <quoteCharacter String> := <format Format>.getQuoteCharacter() <skipHeaderRecord Boolean> := <format Format>.getSkipHeaderRecord() |
Parameters
-
format - An instance of the format
-
commentMarker - The CSV format's comment marker
-
delimiter - The CSV format's delimiter
-
escapeCharacter - The CSV format's escape character
-
headerList - The list of headers
-
allowMissingColumnNames
true
if missing column names are allowedfalse
otherwise
-
ignoreEmptyLines
true
if empty lines are ignoredfalse
otherwise
-
ignoreSurroundingSpaces
true
if surrounding spaces are ignoredfalse
otherwise
-
getNullString - The null string
-
quoteCharacter - The quote character
-
skipHeaderRecord -
true
if the header record is skippedfalse
otherwise
Default Settings for CSV Formats¶
Settings and Defaults for getDefaultFormat()
-
withDelimiter(",")
-
withQuote(""")
-
withIgnoreEmptyLines(true)
Settings and Defaults for getExcelFormat()
-
withDelimiter(",")
-
withQuote('""')
-
withIgnoreEmptyLines(false)
-
withAllowMissingColumnNames(true)
Settings and Defaults for getMySQLFormat()
-
withDelimiter(" ")
(tab character) -
withQuoteDisabled()
-
withIgnoreEmptyLines(false)
-
withEscape("\")
Note:
getMySQLFormat()
gets a default MySQL format used by the SELECT INTO OUTFILE
and LOAD DATA INFILE
operations. This is a tab-delimited format with an LF
character as the line separator. Values are not quoted.
Settings and Defaults for getRFC4180Format()
-
withDelimiter(",")
-
withQuote('""')
-
withIgnoreEmptyLines(false)
getRFC4180Format()
gets a comma separated format as defined by RFC 4180. Settings are:
Settings and Defaults for getTDFFormat (tab delimited)
-
withDelimiter(" ")
(tab character) -
withQuote('""')
-
withIgnoreSurroundingSpaces(true)
CSV Record Methods¶
The CSV record methods correspond to a CSV row parsed from a CSV file.
Syntax
1 2 3 4 5 6 7 8 9 | <indexValue String> := csvRecord.get(<index Number>) <nameValue String> := csvRecord.get(<columnName String>) <comment String> := csvRecord.getComment() <recordNumber Number> := csvRecord.getRecordNumber() <isConsistent Boolean> := csvRecord.isConsistent() <isMapped Boolean> := csvRecord.isMapped(<columnName String>) <isSet Boolean> := csvRecord.isSet(<columnName String>) <numValues Number> := csvRecord.size() <valueMap Map<String, String>> := csvRecord.toMap() |
Parameters
-
index - The column index (0-based)
-
columnName - The column name
-
indexValue - The returned index value
-
nameValue - The returned name value
-
comment - The returned comment
-
recordNumber - The number returned for the parsed CSV file's record
- If the CSV input has multi-line values, the returned number does not correspond to the current line number of the parser that created this record
-
isConsistent
true
if the column size and record matchfalse
otherwise- Some programs can export files that will resolve to
false
but still produce parseable files
-
isMapped
true
if a given column is mappedfalse
otherwise
-
isSet
true
if a given column is mapped and has a valuefalse
for a column that is not mapped
-
numValues - The number of values in the record
-
valueMap - The new map of the copied record
Example
1 2 3 4 5 | records := parser.getRecords() for record in records println("0: " + record.get(0) + "1: " + record.get(1)) println("0: " + record.get("col0") + "1: " + record.get("col1")) end |