## Class CsvFile

• public final class CsvFile
extends Object
A CSV file.

Represents a CSV file together with the ability to parse it from a CharSource. The separator may be specified, allowing TSV files (tab-separated) and other similar formats to be parsed.

This class loads the entire CSV file into memory. To process the CSV file row-by-row, use CsvIterator.

The CSV file format is a general-purpose comma-separated value format. The format is parsed line-by-line, with lines separated by CR, LF or CRLF. Each line can contain one or more fields. Each field is separated by a comma character (,) or tab. Any field may be quoted using a double quote at the start and end. A quoted field may additionally be prefixed by an equals sign. The content of a quoted field may include commas and additional double quotes. Two adjacent double quotes in a quoted field will be replaced by a single double quote. Quoted fields are not trimmed. Non-quoted fields are trimmed.
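The field rules above can be sketched in plain Java. This is only an illustration of the described behaviour, not the library's parser: the class name `CsvLineSketch` and method `parseLine` are hypothetical, and the handling of text after a closing quote (skipped up to the next separator) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the documented API.
final class CsvLineSketch {

  // Splits a single line into fields: quoted fields keep their whitespace,
  // doubled quotes collapse to one quote, non-quoted fields are trimmed.
  static List<String> parseLine(String line, char separator) {
    List<String> fields = new ArrayList<>();
    int pos = 0;
    while (true) {
      // Skip whitespace before the field, but never the separator itself (it may be a tab).
      int start = pos;
      while (start < line.length()
          && line.charAt(start) != separator
          && Character.isWhitespace(line.charAt(start))) {
        start++;
      }
      // A quoted field may be prefixed by an equals sign.
      int quote = start;
      if (quote + 1 < line.length() && line.charAt(quote) == '=' && line.charAt(quote + 1) == '"') {
        quote++;
      }
      if (quote < line.length() && line.charAt(quote) == '"') {
        StringBuilder buf = new StringBuilder();
        int i = quote + 1;
        while (i < line.length()) {
          char ch = line.charAt(i);
          if (ch == '"') {
            if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
              buf.append('"');  // two adjacent quotes become one
              i += 2;
              continue;
            }
            i++;  // closing quote
            break;
          }
          buf.append(ch);
          i++;
        }
        fields.add(buf.toString());
        // Assumption: anything between the closing quote and the next separator is ignored.
        while (i < line.length() && line.charAt(i) != separator) {
          i++;
        }
        pos = i;
      } else {
        int end = line.indexOf(separator, pos);
        if (end < 0) {
          end = line.length();
        }
        fields.add(line.substring(pos, end).trim());
        pos = end;
      }
      if (pos >= line.length()) {
        break;
      }
      pos++;  // step over the separator
    }
    return fields;
  }
}
```

Passing `'\t'` as the separator applies the same rules to a tab-separated line.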

The first line may be treated as a header row. The header row is accessed separately from the data rows.

Blank lines are ignored. Lines may be commented with a hash '#' or semicolon ';'.
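As a rough sketch of this filtering, the following keeps only content lines. The class `CsvContentLines` is hypothetical, and it assumes that the comment marker must be the first character of the line, which the text above does not state explicitly.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the documented API.
final class CsvContentLines {

  // True if the line is neither blank nor a comment.
  // Assumption: a comment marker only counts at the very start of the line.
  static boolean isContent(String line) {
    return !line.trim().isEmpty() && !line.startsWith("#") && !line.startsWith(";");
  }

  // String.lines() splits on CR, LF and CRLF, matching the line separators described above.
  static List<String> contentLines(String text) {
    return text.lines().filter(CsvContentLines::isContent).collect(Collectors.toList());
  }
}
```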

• ### Method Summary

All Methods
Modifier and Type Method Description
boolean containsHeader​(String header)
Checks if the header is present in the file.
boolean containsHeader​(Pattern headerPattern)
Checks if the header pattern is present in the file.
boolean containsHeaders​(Collection<String> headers)
Checks if the headers are present in the file.
boolean equals​(Object obj)
Checks if this CSV file equals another.
static char findSeparator​(CharSource source)
Finds the separator used by the specified CSV file.
int hashCode()
Returns a suitable hash code for the CSV file.
ImmutableList<String> headers()
Gets the header row.
static CsvFile of​(CharSource source, boolean headerRow)
Parses the specified source as a CSV file, using a comma as the separator.
static CsvFile of​(CharSource source, boolean headerRow, char separator)
Parses the specified source as a CSV file where the separator is specified and might not be a comma.
static CsvFile of​(Reader reader, boolean headerRow)
Parses the specified reader as a CSV file, using a comma as the separator.
static CsvFile of​(Reader reader, boolean headerRow, char separator)
Parses the specified reader as a CSV file where the separator is specified and might not be a comma.
static CsvFile of​(List<String> headers, List<? extends List<String>> rows)
Obtains an instance from a list of headers and rows.
CsvRow row​(int index)
Gets a single row.
int rowCount()
Gets the number of data rows.
ImmutableList<CsvRow> rows()
Gets all data rows in the file.
String toString()
Returns a string describing the CSV file.
CsvFile withHeaders​(List<String> headers)
Returns an instance with the specified headers.
• ### Methods inherited from class java.lang.Object

clone, finalize, getClass, notify, notifyAll, wait, wait, wait
• ### Method Detail

• #### of

public static CsvFile of​(CharSource source,
boolean headerRow)
Parses the specified source as a CSV file, using a comma as the separator.

CSV files sometimes contain a Unicode Byte Order Mark. Callers are responsible for handling this, such as by using UnicodeBom.
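For text that has already been decoded to characters, removing a Byte Order Mark can be as simple as the sketch below. It does not use UnicodeBom, whose API is not shown here; the class `BomSketch` is hypothetical and only handles the decoded U+FEFF character, not charset detection from raw bytes.

```java
// Hypothetical helper, not part of the documented API.
final class BomSketch {

  // Removes a leading U+FEFF character if present, otherwise returns the text unchanged.
  static String stripBom(String text) {
    return text.startsWith("\uFEFF") ? text.substring(1) : text;
  }
}
```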

Parameters:
source - the CSV file resource
headerRow - whether the source has a header row; an empty source must still contain the header
Returns:
the CSV file
Throws:
UncheckedIOException - if an IO exception occurs
IllegalArgumentException - if the file cannot be parsed
• #### of

public static CsvFile of(CharSource source,
boolean headerRow,
char separator)
Parses the specified source as a CSV file where the separator is specified and might not be a comma.

This overload allows the separator to be controlled. For example, a tab-separated file is very similar to a CSV file; the only difference is the separator.

CSV files sometimes contain a Unicode Byte Order Mark. Callers are responsible for handling this, such as by using UnicodeBom.

Parameters:
source - the file resource
headerRow - whether the source has a header row; an empty source must still contain the header
separator - the separator used to separate each field, typically a comma, but a tab is sometimes used
Returns:
the CSV file
Throws:
UncheckedIOException - if an IO exception occurs
IllegalArgumentException - if the file cannot be parsed
• #### of

public static CsvFile of​(Reader reader,
boolean headerRow)
Parses the specified reader as a CSV file, using a comma as the separator.

This factory method takes a Reader. Callers are encouraged to use CharSource instead of Reader as it allows the resource to be safely managed.

To use a different separator, such as a tab for tab-separated files, use the overload that also accepts a separator.

CSV files sometimes contain a Unicode Byte Order Mark. Callers are responsible for handling this, such as by using UnicodeBom.

Parameters:
reader - the file resource
headerRow - whether the source has a header row; an empty source must still contain the header
Returns:
the CSV file
Throws:
UncheckedIOException - if an IO exception occurs
IllegalArgumentException - if the file cannot be parsed
• #### of

public static CsvFile of(Reader reader,
boolean headerRow,
char separator)
Parses the specified reader as a CSV file where the separator is specified and might not be a comma.

This factory method takes a Reader. Callers are encouraged to use CharSource instead of Reader as it allows the resource to be safely managed.

This factory method allows the separator to be controlled. For example, a tab-separated file is very similar to a CSV file; the only difference is the separator.

CSV files sometimes contain a Unicode Byte Order Mark. Callers are responsible for handling this, such as by using UnicodeBom.

Parameters:
reader - the file resource
headerRow - whether the source has a header row; an empty source must still contain the header
separator - the separator used to separate each field, typically a comma, but a tab is sometimes used
Returns:
the CSV file
Throws:
UncheckedIOException - if an IO exception occurs
IllegalArgumentException - if the file cannot be parsed
• #### findSeparator

public static char findSeparator​(CharSource source)
Finds the separator used by the specified CSV file.

The search includes comma, semicolon, colon, tab and pipe (in that order of priority).

The algorithm operates in a number of steps. Firstly, it looks for occurrences where a separator is followed by valid quoted text. If this matches, the separator is assumed to be correct. Secondly, it looks for lines that only consist of a separator. If this matches, the separator is assumed to be correct. Thirdly, it looks to see which separator is the most common on the line. If that separator is also the most common on the next line, and the number of columns matches, the separator is assumed to be correct. Otherwise another line is processed. Thus to match a separator, there must be two lines with the same number of columns. At most, 100 content lines are read from the file. The default is comma if the file is empty.
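The third step can be approximated with a short sketch. This is a simplified illustration under stated assumptions, not the library's algorithm: it skips the first two steps (quoted text and separator-only lines), expects blank and comment lines to have been removed already, and the class `SeparatorSketch` and its methods are hypothetical.

```java
import java.util.List;

// Hypothetical helper, not part of the documented API.
final class SeparatorSketch {

  // Candidates in priority order: comma, semicolon, colon, tab, pipe.
  private static final String CANDIDATES = ",;:\t|";

  // Counts occurrences of a character in a line.
  static long count(String line, char c) {
    return line.chars().filter(ch -> ch == c).count();
  }

  // Returns the most frequent candidate on the line, or 0 if none occurs.
  // Ties go to the earlier candidate because the comparison is strict.
  static char mostCommon(String line) {
    char best = 0;
    long bestCount = 0;
    for (char c : CANDIDATES.toCharArray()) {
      long n = count(line, c);
      if (n > bestCount) {
        best = c;
        bestCount = n;
      }
    }
    return best;
  }

  // Accepts a separator once two consecutive lines agree on it and on the column count.
  static char guess(List<String> contentLines) {
    int limit = Math.min(contentLines.size(), 100);  // read at most 100 content lines
    for (int i = 0; i + 1 < limit; i++) {
      String first = contentLines.get(i);
      String second = contentLines.get(i + 1);
      char sep = mostCommon(first);
      if (sep != 0 && sep == mostCommon(second) && count(first, sep) == count(second, sep)) {
        return sep;
      }
    }
    return ',';  // default when nothing matches, including an empty file
  }
}
```

Counting raw occurrences ignores separators inside quoted fields, which is one reason the quoted-text step comes first in the description above.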

Parameters:
source - the source to read as CSV
Returns:
the separator
Throws:
UncheckedIOException - if an IO exception occurs
IllegalArgumentException - if the file cannot be parsed
• #### of

public static CsvFile of​(List<String> headers,
List<? extends List<String>> rows)
Obtains an instance from a list of headers and rows.

The headers may be an empty list. All the rows must contain a list of the same size, matching the header if present.
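The size constraint can be pictured as the check below. It is a sketch only: the class `RowShapeCheck` is hypothetical, and using the first row's size as the expected size when there are no headers is an assumption.

```java
import java.util.List;

// Hypothetical helper, not part of the documented API.
final class RowShapeCheck {

  // Throws if any row differs in size from the headers,
  // or from the first row when the header list is empty.
  static void validate(List<String> headers, List<? extends List<String>> rows) {
    if (rows.isEmpty()) {
      return;
    }
    int expected = headers.isEmpty() ? rows.get(0).size() : headers.size();
    for (int i = 0; i < rows.size(); i++) {
      int actual = rows.get(i).size();
      if (actual != expected) {
        throw new IllegalArgumentException(
            "Row " + i + " has " + actual + " fields, expected " + expected);
      }
    }
  }
}
```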

Parameters:
headers - the headers, empty if no headers
rows - the data rows
Returns:
the CSV file
Throws:
IllegalArgumentException - if the rows do not match the headers

• #### headers

public ImmutableList<String> headers()
Gets the header row.

If there is no header row, an empty list is returned.

Returns:
the header row
• #### rows

public ImmutableList<CsvRow> rows()
Gets all data rows in the file.
Returns:
the data rows
• #### rowCount

public int rowCount()
Gets the number of data rows.
Returns:
the number of data rows
• #### row

public CsvRow row​(int index)
Gets a single row.
Parameters:
index - the row index, zero-based
Returns:
the row

• #### containsHeader

public boolean containsHeader(String header)
Checks if the header is present in the file.

Matching is case insensitive.

Parameters:
header - the column header to match
Returns:
true if the header is present

• #### containsHeaders

public boolean containsHeaders(Collection<String> headers)
Checks if the headers are present in the file.

Matching is case insensitive.

Parameters:
headers - the column headers to match
Returns:
true if all the headers are present
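One way to picture the case-insensitive checks in containsHeader and containsHeaders is to normalize the headers once and compare lower-cased values. This is a sketch with a hypothetical class `HeaderLookupSketch`; how the library stores and compares headers internally is not stated here.

```java
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the documented API.
final class HeaderLookupSketch {

  private final Set<String> lowerCased;

  // Stores each header in lower case so that lookups ignore case.
  HeaderLookupSketch(List<String> headers) {
    this.lowerCased = headers.stream()
        .map(h -> h.toLowerCase(Locale.ENGLISH))
        .collect(Collectors.toSet());
  }

  // True if the single header is present, ignoring case.
  boolean containsHeader(String header) {
    return lowerCased.contains(header.toLowerCase(Locale.ENGLISH));
  }

  // True only if every requested header is present, ignoring case.
  boolean containsHeaders(Collection<String> headers) {
    return headers.stream().allMatch(this::containsHeader);
  }
}
```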

• #### containsHeader

public boolean containsHeader(Pattern headerPattern)
Checks if the header pattern is present in the file.

Matching is case insensitive.

Parameters:
headerPattern - the header pattern to match
Returns:
true if the header is present
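A pattern-based check could look like the sketch below. Two points are assumptions rather than documented behaviour: the pattern must match the whole header, and case insensitivity is obtained by recompiling the pattern with the CASE_INSENSITIVE flag. The class `HeaderPatternSketch` is hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the documented API.
final class HeaderPatternSketch {

  // True if any header fully matches the pattern, ignoring case.
  static boolean containsHeader(List<String> headers, Pattern headerPattern) {
    // Assumption: add the case-insensitive flag to whatever flags the caller supplied.
    Pattern ignoreCase =
        Pattern.compile(headerPattern.pattern(), headerPattern.flags() | Pattern.CASE_INSENSITIVE);
    return headers.stream().anyMatch(h -> ignoreCase.matcher(h).matches());
  }
}
```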

• #### withHeaders

public CsvFile withHeaders(List<String> headers)
Returns an instance with the specified headers.
Parameters:
headers - the new headers
Returns:
the instance with the specified headers
• #### equals

public boolean equals​(Object obj)
Checks if this CSV file equals another.

The comparison checks the content.

Overrides:
equals in class Object
Parameters:
obj - the other file, null returns false
Returns:
true if equal
• #### hashCode

public int hashCode()
Returns a suitable hash code for the CSV file.
Overrides:
hashCode in class Object
Returns:
the hash code
• #### toString

public String toString()
Returns a string describing the CSV file.
Overrides:
toString in class Object
Returns:
the descriptive string