public final class CSVFormat extends Object implements Serializable
You can use one of the predefined formats:
For example:
CSVParser parser = CSVFormat.EXCEL.parse(reader);
The CSVParser provides static methods to parse other input types, for example:
CSVParser parser = CSVParser.parse(file, StandardCharsets.US_ASCII, CSVFormat.EXCEL);
You can extend a format by calling the "with" methods. For example:
CSVFormat.EXCEL.withNullString("N/A").withIgnoreSurroundingSpaces(true);
To define the column names you want to use to access records, write:
CSVFormat.EXCEL.withHeader("Col1", "Col2", "Col3");
Calling withHeader(String...) lets you use the given names to address values in a CSVRecord, and assumes that your CSV source does not contain a first record that also defines column names. If it does, then you are overriding this metadata with your names and you should skip the first record by calling withSkipHeaderRecord(boolean) with true.
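The override case described above can be sketched as follows. This is a minimal illustration, assuming a source whose first record already holds column names; the class name OverrideHeaderExample, the method readCol2 and the sample data are hypothetical and not part of the library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class OverrideHeaderExample {

    // Reads the "Col2" value of every data record, ignoring the names
    // stored in the first record of the source.
    static List<String> readCol2(String csv) throws IOException {
        CSVFormat format = CSVFormat.EXCEL
                .withHeader("Col1", "Col2", "Col3") // our own column names
                .withSkipHeaderRecord(true);        // do not return the source's header as data
        List<String> values = new ArrayList<>();
        try (CSVParser parser = format.parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                values.add(record.get("Col2"));
            }
        }
        return values;
    }
}
```

Without withSkipHeaderRecord(true), the first record of the source would be returned as an ordinary data record.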
You can use a format directly to parse a reader. For example, to parse an Excel file with a column header, write:
Reader in = ...;
CSVFormat.EXCEL.withHeader("Col1", "Col2", "Col3").parse(in);
For other input types, like resources, files, and URLs, use the static methods on CSVParser.
If your source contains a header record, you can simplify your code and safely reference columns by using withHeader(String...) with no arguments:
CSVFormat.EXCEL.withHeader();
This causes the parser to read the first record and use its values as column names.
Then, call one of the CSVRecord get methods that takes a String column name argument:
String value = record.get("Col1");
This makes your code impervious to changes in column order in the CSV file.
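The following sketch puts the two steps together and shows that the lookup does not depend on column order. The class name HeaderLookupExample and the method firstValueOf are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Iterator;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class HeaderLookupExample {

    // Returns the value of the named column in the first data record,
    // or null if the source has no data records.
    static String firstValueOf(String csv, String column) throws IOException {
        // withHeader() with no arguments: column names come from the first record.
        try (CSVParser parser = CSVFormat.EXCEL.withHeader().parse(new StringReader(csv))) {
            Iterator<CSVRecord> records = parser.iterator();
            return records.hasNext() ? records.next().get(column) : null;
        }
    }
}
```

Calling firstValueOf with "Col1" gives the same answer whether the source lists Col1 before or after Col2.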
This class is immutable.
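In practice, immutability means that each "with" method returns a new instance and leaves the receiver unchanged. A small sketch, with the hypothetical class name ImmutabilityExample and method semicolonVariant:

```java
import org.apache.commons.csv.CSVFormat;

public class ImmutabilityExample {

    // Derives a semicolon-delimited variant; the base format is not modified.
    static CSVFormat semicolonVariant(CSVFormat base) {
        return base.withDelimiter(';');
    }
}
```

Because the base format is never modified, predefined constants such as CSVFormat.EXCEL can be shared freely and used as starting points for derived formats.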
Modifier and Type | Class and Description |
---|---|
static class | CSVFormat.Predefined: Predefines formats. |
Modifier and Type | Field and Description |
---|---|
static CSVFormat | DEFAULT: Standard comma separated format, as for RFC4180 but allowing empty lines. |
static CSVFormat | EXCEL: Excel file format (using a comma as the value delimiter). |
static CSVFormat | INFORMIX_UNLOAD: Default Informix CSV UNLOAD format used by the UNLOAD TO file_name operation. |
static CSVFormat | INFORMIX_UNLOAD_CSV: Default Informix CSV UNLOAD format used by the UNLOAD TO file_name operation (escaping is disabled). |
static CSVFormat | MYSQL: Default MySQL format used by the SELECT INTO OUTFILE and LOAD DATA INFILE operations. |
static CSVFormat | POSTGRESQL_CSV: Default PostgreSQL CSV format used by the COPY operation. |
static CSVFormat | POSTGRESQL_TEXT: Default PostgreSQL text format used by the COPY operation. |
static CSVFormat | RFC4180: Comma separated format as defined by RFC 4180. |
static CSVFormat | TDF: Tab-delimited format. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
String | format(Object... values): Formats the specified values. |
boolean | getAllowMissingColumnNames(): Specifies whether missing column names are allowed when parsing the header line. |
Character | getCommentMarker(): Returns the character marking the start of a line comment. |
char | getDelimiter(): Returns the character delimiting the values (typically ';', ',' or '\t'). |
Character | getEscapeCharacter(): Returns the escape character. |
String[] | getHeader(): Returns a copy of the header array. |
String[] | getHeaderComments(): Returns a copy of the header comment array. |
boolean | getIgnoreEmptyLines(): Specifies whether empty lines between records are ignored when parsing input. |
boolean | getIgnoreHeaderCase(): Specifies whether header names will be accessed ignoring case. |
boolean | getIgnoreSurroundingSpaces(): Specifies whether spaces around values are ignored when parsing input. |
String | getNullString(): Gets the String to convert to and from null. |
Character | getQuoteCharacter(): Returns the character used to encapsulate values containing special characters. |
QuoteMode | getQuoteMode(): Returns the quote policy for output fields. |
String | getRecordSeparator(): Returns the record separator delimiting output records. |
boolean | getSkipHeaderRecord(): Returns whether to skip the header record. |
boolean | getTrailingDelimiter(): Returns whether to add a trailing delimiter. |
boolean | getTrim(): Returns whether to trim leading and trailing blanks. |
int | hashCode() |
boolean | isCommentMarkerSet(): Specifies whether comments are supported by this format. |
boolean | isEscapeCharacterSet(): Returns whether escapes are being processed. |
boolean | isNullStringSet(): Returns whether a nullString has been defined. |
boolean | isQuoteCharacterSet(): Returns whether a quoteChar has been defined. |
static CSVFormat | newFormat(char delimiter): Creates a new CSV format with the specified delimiter. |
CSVParser | parse(Reader in): Parses the specified content. |
CSVPrinter | print(Appendable out): Prints to the specified output. |
CSVPrinter | print(File out, Charset charset): Prints to the specified output. |
void | print(Object value, Appendable out, boolean newRecord): Prints the value as the next value on the line to out. |
CSVPrinter | print(Path out, Charset charset): Prints to the specified output. |
CSVPrinter | printer(): Prints to System.out. |
void | println(Appendable out): Outputs the trailing delimiter (if set) followed by the record separator (if set). |
void | printRecord(Appendable out, Object... values): Prints the given values to out as a single record of delimiter separated values followed by the record separator. |
String | toString() |
static CSVFormat | valueOf(String format): Gets one of the predefined formats from CSVFormat.Predefined. |
CSVFormat | withAllowMissingColumnNames(): Returns a new CSVFormat with the missing column names behavior of the format set to true. |
CSVFormat | withAllowMissingColumnNames(boolean allowMissingColumnNames): Returns a new CSVFormat with the missing column names behavior of the format set to the given value. |
CSVFormat | withCommentMarker(char commentMarker): Returns a new CSVFormat with the comment start marker of the format set to the specified character. |
CSVFormat | withCommentMarker(Character commentMarker): Returns a new CSVFormat with the comment start marker of the format set to the specified character. |
CSVFormat | withDelimiter(char delimiter): Returns a new CSVFormat with the delimiter of the format set to the specified character. |
CSVFormat | withEscape(char escape): Returns a new CSVFormat with the escape character of the format set to the specified character. |
CSVFormat | withEscape(Character escape): Returns a new CSVFormat with the escape character of the format set to the specified character. |
CSVFormat | withFirstRecordAsHeader(): Returns a new CSVFormat using the first record as header. |
CSVFormat | withHeader(Class<? extends Enum<?>> headerEnum): Returns a new CSVFormat with the header of the format defined by the enum class. |
CSVFormat | withHeader(ResultSet resultSet): Returns a new CSVFormat with the header of the format set from the result set metadata. |
CSVFormat | withHeader(ResultSetMetaData metaData): Returns a new CSVFormat with the header of the format set from the result set metadata. |
CSVFormat | withHeader(String... header): Returns a new CSVFormat with the header of the format set to the given values. |
CSVFormat | withHeaderComments(Object... headerComments): Returns a new CSVFormat with the header comments of the format set to the given values. |
CSVFormat | withIgnoreEmptyLines(): Returns a new CSVFormat with the empty line skipping behavior of the format set to true. |
CSVFormat | withIgnoreEmptyLines(boolean ignoreEmptyLines): Returns a new CSVFormat with the empty line skipping behavior of the format set to the given value. |
CSVFormat | withIgnoreHeaderCase(): Returns a new CSVFormat with the header ignore case behavior set to true. |
CSVFormat | withIgnoreHeaderCase(boolean ignoreHeaderCase): Returns a new CSVFormat with whether header names should be accessed ignoring case. |
CSVFormat | withIgnoreSurroundingSpaces(): Returns a new CSVFormat with the trimming behavior of the format set to true. |
CSVFormat | withIgnoreSurroundingSpaces(boolean ignoreSurroundingSpaces): Returns a new CSVFormat with the trimming behavior of the format set to the given value. |
CSVFormat | withNullString(String nullString): Returns a new CSVFormat with conversions to and from null for strings on input and output. |
CSVFormat | withQuote(char quoteChar): Returns a new CSVFormat with the quoteChar of the format set to the specified character. |
CSVFormat | withQuote(Character quoteChar): Returns a new CSVFormat with the quoteChar of the format set to the specified character. |
CSVFormat | withQuoteMode(QuoteMode quoteModePolicy): Returns a new CSVFormat with the output quote policy of the format set to the specified value. |
CSVFormat | withRecordSeparator(char recordSeparator): Returns a new CSVFormat with the record separator of the format set to the specified character. |
CSVFormat | withRecordSeparator(String recordSeparator): Returns a new CSVFormat with the record separator of the format set to the specified String. |
CSVFormat | withSkipHeaderRecord(): Returns a new CSVFormat with skipping the header record set to true. |
CSVFormat | withSkipHeaderRecord(boolean skipHeaderRecord): Returns a new CSVFormat with whether to skip the header record. |
CSVFormat | withTrailingDelimiter(): Returns a new CSVFormat to add a trailing delimiter. |
CSVFormat | withTrailingDelimiter(boolean trailingDelimiter): Returns a new CSVFormat with whether to add a trailing delimiter. |
CSVFormat | withTrim(): Returns a new CSVFormat to trim leading and trailing blanks. |
CSVFormat | withTrim(boolean trim): Returns a new CSVFormat with whether to trim leading and trailing blanks. |
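public static final CSVFormat DEFAULT
Standard comma separated format, as for RFC4180 but allowing empty lines.
Settings are:
withDelimiter(',')
withQuote('"')
withRecordSeparator("\r\n")
withIgnoreEmptyLines(true)
See Also:
CSVFormat.Predefined.Default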
public static final CSVFormat EXCEL
Excel file format (using a comma as the value delimiter).
For example, for parsing or generating a CSV file on a French system, the following format will be used:
CSVFormat fmt = CSVFormat.EXCEL.withDelimiter(';');
Settings are:
withDelimiter(',')
withQuote('"')
withRecordSeparator("\r\n")
withIgnoreEmptyLines(false)
withAllowMissingColumnNames(true)
Note: this is currently like RFC4180 plus withAllowMissingColumnNames(true).
See Also:
CSVFormat.Predefined.Excel
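As a rough illustration of the semicolon-delimited variant mentioned above, the sketch below writes and reads one record. The class name SemicolonExcelExample and its members are hypothetical; only the CSVFormat, CSVParser and CSVRecord calls come from the library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class SemicolonExcelExample {

    // EXCEL settings with the delimiter changed to ';'.
    static final CSVFormat FRENCH_EXCEL = CSVFormat.EXCEL.withDelimiter(';');

    // Formats one record; values containing the delimiter are quoted.
    static String write(Object... values) {
        return FRENCH_EXCEL.format(values);
    }

    // Parses the input and returns all values of all records in order.
    static List<String> read(String csv) throws IOException {
        List<String> values = new ArrayList<>();
        try (CSVParser parser = FRENCH_EXCEL.parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                for (String value : record) {
                    values.add(value);
                }
            }
        }
        return values;
    }
}
```

A value that itself contains a semicolon is written in quotes, so it survives a round trip through write and read.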
public static final CSVFormat INFORMIX_UNLOAD
Default Informix CSV UNLOAD format used by the UNLOAD TO file_name operation.
This is a comma-delimited format with a LF character as the line separator. Values are not quoted and special characters are escaped with '\'. The default NULL string is "\\N".
Settings are:
public static final CSVFormat INFORMIX_UNLOAD_CSV
Default Informix CSV UNLOAD format used by the UNLOAD TO file_name operation (escaping is disabled).
This is a comma-delimited format with a LF character as the line separator. Values are not quoted and special characters are escaped with '\'. The default NULL string is "\\N".
Settings are:
public static final CSVFormat MYSQL
Default MySQL format used by the SELECT INTO OUTFILE and LOAD DATA INFILE operations.
This is a tab-delimited format with a LF character as the line separator. Values are not quoted and special characters are escaped with '\'. The default NULL string is "\\N".
Settings are:
public static final CSVFormat POSTGRESQL_CSV
Default PostgreSQL CSV format used by the COPY operation.
This is a comma-delimited format with a LF character as the line separator. Values are double quoted and special characters are escaped with '"'. The default NULL string is "".
Settings are:
See Also:
CSVFormat.Predefined.MySQL, http://dev.mysql.com/doc/refman/5.1/en/load-data.html
public static final CSVFormat POSTGRESQL_TEXT
Default PostgreSQL text format used by the COPY operation.
This is a tab-delimited format with a LF character as the line separator. Values are double quoted and special characters are escaped with '"'. The default NULL string is "\\N".
Settings are:
See Also:
CSVFormat.Predefined.MySQL, http://dev.mysql.com/doc/refman/5.1/en/load-data.html
public static final CSVFormat RFC4180
Comma separated format as defined by RFC 4180.
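Settings are:
withDelimiter(',')
withQuote('"')
withRecordSeparator("\r\n")
withIgnoreEmptyLines(false)
See Also:
CSVFormat.Predefined.RFC4180
public static final CSVFormat TDF
Tab-delimited format.
Settings are:
See Also:
CSVFormat.Predefined.TDF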
public static CSVFormat newFormat(char delimiter)
Creates a new CSV format with the specified delimiter.
Use this method if you want to create a CSVFormat from scratch. All fields but the delimiter will be initialized with null/false.
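For instance, a pipe-delimited format could be built from scratch along the following lines. This is only a sketch: the class name NewFormatExample and the choice of quote character and record separator are assumptions made for the illustration.

```java
import org.apache.commons.csv.CSVFormat;

public class NewFormatExample {

    // newFormat('|') sets only the delimiter; quoting and the record
    // separator must be added explicitly if they are wanted.
    static final CSVFormat PIPE = CSVFormat.newFormat('|')
            .withQuote('"')
            .withRecordSeparator("\n");

    // Formats one record with the pipe-delimited format.
    static String write(Object... values) {
        return PIPE.format(values);
    }
}
```

Compared with deriving from a predefined format, this approach makes every non-default setting visible at the point of definition.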
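public static CSVFormat valueOf(String format)
Gets one of the predefined formats from CSVFormat.Predefined.
Parameters:
format - name
public String format(Object... values)
Formats the specified values.
Parameters:
values - the values to format
public boolean getAllowMissingColumnNames()
Specifies whether missing column names are allowed when parsing the header line.
Returns:
true if missing column names are allowed when parsing the header line, false to throw an IllegalArgumentException.
public Character getCommentMarker()
Returns the character marking the start of a line comment.
Returns:
the comment start marker, may be null
public char getDelimiter()
Returns the character delimiting the values (typically ';', ',' or '\t').
public Character getEscapeCharacter()
Returns the escape character.
Returns:
the escape character, may be null
public String[] getHeader()
Returns a copy of the header array.
Returns:
a copy of the header array; null if disabled, the empty array if to be read from the file
public String[] getHeaderComments()
Returns a copy of the header comment array.
Returns:
a copy of the header comment array; null if disabled.
public boolean getIgnoreEmptyLines()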
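Specifies whether empty lines between records are ignored when parsing input.
Returns:
true if empty lines between records are ignored, false if they are turned into empty records.
public boolean getIgnoreHeaderCase()
Specifies whether header names will be accessed ignoring case.
Returns:
true if header names cases are ignored, false if they are case sensitive.
public boolean getIgnoreSurroundingSpaces()
Specifies whether spaces around values are ignored when parsing input.
Returns:
true if spaces around values are ignored, false if they are treated as part of the value.
public String getNullString()
Gets the String to convert to and from null.
- Reading: converts strings equal to the given nullString to null when reading records.
- Writing: writes null as the given nullString when writing records.
Returns:
the String to convert to and from null. No substitution occurs if null
public Character getQuoteCharacter()
Returns the character used to encapsulate values containing special characters.
Returns:
the quoteChar character, may be null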
public QuoteMode getQuoteMode()
public String getRecordSeparator()
public boolean getSkipHeaderRecord()
public boolean getTrailingDelimiter()
public boolean getTrim()
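public boolean isCommentMarkerSet()
Specifies whether comments are supported by this format.
Returns:
true if comments are supported, false otherwise
public boolean isEscapeCharacterSet()
Returns whether escapes are being processed.
Returns:
true if escapes are processed
public boolean isNullStringSet()
Returns whether a nullString has been defined.
Returns:
true if a nullString is defined
public boolean isQuoteCharacterSet()
Returns whether a quoteChar has been defined.
Returns:
true if a quoteChar is defined
public CSVParser parse(Reader in) throws IOException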
Parses the specified content.
See also the various static parse methods on CSVParser.
Parameters:
in - the input stream
Returns:
a parser over a stream of CSVRecords.
Throws:
IOException - If an I/O error occurs
public CSVPrinter print(Appendable out) throws IOException
Prints to the specified output.
See also CSVPrinter.
Parameters:
out - the output.
Throws:
IOException - thrown if the optional header cannot be printed.
public CSVPrinter printer() throws IOException
Prints to System.out.
See also CSVPrinter.
Returns:
a printer to System.out.
Throws:
IOException - thrown if the optional header cannot be printed.
public CSVPrinter print(File out, Charset charset) throws IOException
Prints to the specified output.
See also CSVPrinter.
Parameters:
out - the output.
charset - A charset.
Throws:
IOException - thrown if the optional header cannot be printed.
public CSVPrinter print(Path out, Charset charset) throws IOException
Prints to the specified output.
See also CSVPrinter.
Parameters:
out - the output.
charset - A charset.
Throws:
IOException - thrown if the optional header cannot be printed.
public void print(Object value, Appendable out, boolean newRecord) throws IOException
Prints the value as the next value on the line to out. The value will be escaped or encapsulated as needed. Useful when one wants to avoid creating CSVPrinters.
Parameters:
value - value to output.
out - where to print the value.
newRecord - if this is a new record.
Throws:
IOException - If an I/O error occurs.
public void println(Appendable out) throws IOException
Outputs the trailing delimiter (if set) followed by the record separator (if set).
Parameters:
out - where to write
Throws:
IOException - If an I/O error occurs
public void printRecord(Appendable out, Object... values) throws IOException
Prints the given values to out as a single record of delimiter separated values followed by the record separator.
The values will be quoted if needed. Quotes and new-line characters will be escaped. This method adds the record separator to the output after printing the record, so there is no need to call println(Appendable).
Parameters:
out - where to write.
values - values to output.
Throws:
IOException - If an I/O error occurs.
public CSVFormat withAllowMissingColumnNames()
Returns a new CSVFormat with the missing column names behavior of the format set to true.
See Also:
withAllowMissingColumnNames(boolean)
public CSVFormat withAllowMissingColumnNames(boolean allowMissingColumnNames)
Returns a new CSVFormat with the missing column names behavior of the format set to the given value.
Parameters:
allowMissingColumnNames - the missing column names behavior, true to allow missing column names in the header line, false to cause an IllegalArgumentException to be thrown.
public CSVFormat withCommentMarker(char commentMarker)
Returns a new CSVFormat with the comment start marker of the format set to the specified character. Note that the comment start character is only recognized at the start of a line.
Parameters:
commentMarker - the comment start marker
Throws:
IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withCommentMarker(Character commentMarker)
Returns a new CSVFormat with the comment start marker of the format set to the specified character. Note that the comment start character is only recognized at the start of a line.
Parameters:
commentMarker - the comment start marker, use null to disable
Throws:
IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withDelimiter(char delimiter)
Returns a new CSVFormat with the delimiter of the format set to the specified character.
Parameters:
delimiter - the delimiter character
Throws:
IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withEscape(char escape)
Returns a new CSVFormat with the escape character of the format set to the specified character.
Parameters:
escape - the escape character
Throws:
IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withEscape(Character escape)
Returns a new CSVFormat with the escape character of the format set to the specified character.
Parameters:
escape - the escape character, use null to disable
Throws:
IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withFirstRecordAsHeader()
Returns a new CSVFormat using the first record as header.
Calling this method is equivalent to calling:
CSVFormat format = aFormat.withHeader().withSkipHeaderRecord();
See Also:
withSkipHeaderRecord(boolean), withHeader(String...)
public CSVFormat withHeader(Class<? extends Enum<?>> headerEnum)
Returns a new CSVFormat with the header of the format defined by the enum class.
Example:
public enum Header { Name, Email, Phone }
CSVFormat format = aFormat.withHeader(Header.class);
The header is also used by the CSVPrinter.
Parameters:
headerEnum - the enum defining the header, null if disabled, empty if parsed automatically, user specified otherwise.
See Also:
withHeader(String...), withSkipHeaderRecord(boolean)
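A slightly fuller sketch of the enum-based header, assuming input without a header record. The class name EnumHeaderExample, the method emailOf and the sample columns are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class EnumHeaderExample {

    // Column names, in the order in which the columns appear in the input.
    enum Header { Name, Email, Phone }

    // Returns the Email column of the first record of the given input.
    static String emailOf(String csv) throws IOException {
        CSVFormat format = CSVFormat.DEFAULT.withHeader(Header.class);
        try (CSVParser parser = format.parse(new StringReader(csv))) {
            CSVRecord first = parser.getRecords().get(0);
            return first.get(Header.Email); // look up by enum constant
        }
    }
}
```

Looking values up by enum constant avoids scattering column-name strings through the calling code.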
public CSVFormat withHeader(ResultSet resultSet) throws SQLException
Returns a new CSVFormat with the header of the format set from the result set metadata. The header can either be parsed automatically from the input file with:
CSVFormat format = aFormat.withHeader();
or specified manually with:
CSVFormat format = aFormat.withHeader(resultSet);
The header is also used by the CSVPrinter.
Parameters:
resultSet - the resultSet for the header, null if disabled, empty if parsed automatically, user specified otherwise.
Throws:
SQLException - if a database access error occurs or this method is called on a closed result set.
public CSVFormat withHeader(ResultSetMetaData metaData) throws SQLException
Returns a new CSVFormat with the header of the format set from the result set metadata. The header can either be parsed automatically from the input file with:
CSVFormat format = aFormat.withHeader();
or specified manually with:
CSVFormat format = aFormat.withHeader(metaData);
The header is also used by the CSVPrinter.
Parameters:
metaData - the metaData for the header, null if disabled, empty if parsed automatically, user specified otherwise.
Throws:
SQLException - if a database access error occurs or this method is called on a closed result set.
public CSVFormat withHeader(String... header)
Returns a new CSVFormat with the header of the format set to the given values. The header can either be parsed automatically from the input file with:
CSVFormat format = aFormat.withHeader();
or specified manually with:
CSVFormat format = aFormat.withHeader("name", "email", "phone");
The header is also used by the CSVPrinter.
Parameters:
header - the header, null if disabled, empty if parsed automatically, user specified otherwise.
See Also:
withSkipHeaderRecord(boolean)
public CSVFormat withHeaderComments(Object... headerComments)
Returns a new CSVFormat with the header comments of the format set to the given values. The comments will be printed first, before the headers. This setting is ignored by the parser.
CSVFormat format = aFormat.withHeaderComments("Generated by Apache Commons CSV 1.1.", new Date());
Parameters:
headerComments - the headerComments which will be printed by the Printer before the actual CSV data.
See Also:
withSkipHeaderRecord(boolean)
public CSVFormat withIgnoreEmptyLines()
Returns a new CSVFormat with the empty line skipping behavior of the format set to true.
See Also:
withIgnoreEmptyLines(boolean)
Since:
1.1
public CSVFormat withIgnoreEmptyLines(boolean ignoreEmptyLines)
Returns a new CSVFormat with the empty line skipping behavior of the format set to the given value.
Parameters:
ignoreEmptyLines - the empty line skipping behavior, true to ignore the empty lines between the records, false to translate empty lines to empty records.
public CSVFormat withIgnoreHeaderCase()
Returns a new CSVFormat with the header ignore case behavior set to true.
See Also:
withIgnoreHeaderCase(boolean)
public CSVFormat withIgnoreHeaderCase(boolean ignoreHeaderCase)
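Returns a new CSVFormat with whether header names should be accessed ignoring case.
Parameters:
ignoreHeaderCase - the case mapping behavior, true to access name/values, false to leave the mapping as is.
Returns:
A new CSVFormat that will ignore header name case if specified as true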
public CSVFormat withIgnoreSurroundingSpaces()
Returns a new CSVFormat with the trimming behavior of the format set to true.
See Also:
withIgnoreSurroundingSpaces(boolean)
public CSVFormat withIgnoreSurroundingSpaces(boolean ignoreSurroundingSpaces)
Returns a new CSVFormat with the trimming behavior of the format set to the given value.
Parameters:
ignoreSurroundingSpaces - the trimming behavior, true to remove the surrounding spaces, false to leave the spaces as is.
public CSVFormat withNullString(String nullString)
Returns a new CSVFormat with conversions to and from null for strings on input and output.
- Reading: converts strings equal to the given nullString to null when reading records.
- Writing: writes null as the given nullString when writing records.
Parameters:
nullString - the String to convert to and from null. No substitution occurs if null
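The two directions of the conversion can be sketched as follows, using "N/A" as the null marker. The class name NullStringExample and its members are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class NullStringExample {

    // DEFAULT format that maps the text "N/A" to null and back.
    static final CSVFormat FORMAT = CSVFormat.DEFAULT.withNullString("N/A");

    // Reading: values equal to "N/A" come back as null.
    static List<String> read(String csv) throws IOException {
        List<String> values = new ArrayList<>();
        try (CSVParser parser = FORMAT.parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                for (String value : record) {
                    values.add(value);
                }
            }
        }
        return values;
    }

    // Writing: null values are written as "N/A".
    static String write(Object... values) {
        return FORMAT.format(values);
    }
}
```

If no nullString is set, no substitution takes place in either direction.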
public CSVFormat withQuote(char quoteChar)
Returns a new CSVFormat with the quoteChar of the format set to the specified character.
Parameters:
quoteChar - the quoteChar character
Throws:
IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withQuote(Character quoteChar)
Returns a new CSVFormat with the quoteChar of the format set to the specified character.
Parameters:
quoteChar - the quoteChar character, use null to disable
Throws:
IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withQuoteMode(QuoteMode quoteModePolicy)
Returns a new CSVFormat with the output quote policy of the format set to the specified value.
Parameters:
quoteModePolicy - the quote policy to use for output.
public CSVFormat withRecordSeparator(char recordSeparator)
Returns a new CSVFormat with the record separator of the format set to the specified character.
Note: This setting is only used during printing and does not affect parsing. Parsing currently only works for inputs with '\n', '\r' and "\r\n".
Parameters:
recordSeparator - the record separator to use for output.
public CSVFormat withRecordSeparator(String recordSeparator)
Returns a new CSVFormat with the record separator of the format set to the specified String.
Note: This setting is only used during printing and does not affect parsing. Parsing currently only works for inputs with '\n', '\r' and "\r\n".
Parameters:
recordSeparator - the record separator to use for output.
Throws:
IllegalArgumentException - if recordSeparator is none of CR, LF or CRLF
public CSVFormat withSkipHeaderRecord()
Returns a new CSVFormat with skipping the header record set to true.
See Also:
withSkipHeaderRecord(boolean), withHeader(String...)
public CSVFormat withSkipHeaderRecord(boolean skipHeaderRecord)
Returns a new CSVFormat with whether to skip the header record.
Parameters:
skipHeaderRecord - whether to skip the header record.
See Also:
withHeader(String...)
public CSVFormat withTrailingDelimiter()
Returns a new CSVFormat to add a trailing delimiter.
public CSVFormat withTrailingDelimiter(boolean trailingDelimiter)
Returns a new CSVFormat with whether to add a trailing delimiter.
Parameters:
trailingDelimiter - whether to add a trailing delimiter.
public CSVFormat withTrim()
Returns a new CSVFormat to trim leading and trailing blanks.
public CSVFormat withTrim(boolean trim)
Returns a new CSVFormat with whether to trim leading and trailing blanks.
Parameters:
trim - whether to trim leading and trailing blanks.
Copyright © 2017. All rights reserved.