Class CsvFile


  • public final class CsvFile
    extends java.lang.Object
    A CSV file.

    Represents a CSV file together with the ability to parse it from a CharSource. The separator may be specified, allowing TSV files (tab-separated) and other similar formats to be parsed.

    This class loads the entire CSV file into memory. To process the CSV file row-by-row, use CsvIterator.

    The CSV file format is a general-purpose comma-separated value format. The format is parsed line-by-line, with lines separated by CR, LF or CRLF. Each line can contain one or more fields. Each field is separated by a comma character (,) or tab. Any field may be quoted using a double quote at the start and end. A quoted field may additionally be prefixed by an equals sign. The content of a quoted field may include commas and additional double quotes. Two adjacent double quotes in a quoted field will be replaced by a single double quote. Quoted fields are not trimmed. Non-quoted fields are trimmed.
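
    The field rules above can be sketched in plain Java. This is an illustration only, not the library's parser: the class name CsvLineSketch and the method parseLine are hypothetical, and the sketch assumes that whitespace before an opening quote is skipped and that any text between a closing quote and the next separator is discarded.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the field rules; hypothetical, not the library's implementation. */
final class CsvLineSketch {

    /** Splits a single line into fields using the given separator. */
    static List<String> parseLine(String line, char separator) {
        List<String> fields = new ArrayList<>();
        int pos = 0;
        while (pos <= line.length()) {
            int start = skipWhitespace(line, pos, separator);
            // A quoted field may be prefixed by an equals sign
            int quote = (start < line.length() && line.charAt(start) == '=') ? start + 1 : start;
            if (quote < line.length() && line.charAt(quote) == '"') {
                // Quoted field: content is kept untrimmed
                StringBuilder buf = new StringBuilder();
                int i = quote + 1;
                while (true) {
                    if (i >= line.length()) {
                        throw new IllegalArgumentException("Unterminated quoted field");
                    }
                    char ch = line.charAt(i++);
                    if (ch != '"') {
                        buf.append(ch);
                    } else if (i < line.length() && line.charAt(i) == '"') {
                        buf.append('"');  // two adjacent quotes become one
                        i++;
                    } else {
                        break;  // closing quote
                    }
                }
                fields.add(buf.toString());
                // Assumption: anything between the closing quote and the separator is dropped
                int sep = line.indexOf(separator, i);
                pos = (sep < 0 ? line.length() : sep) + 1;
            } else {
                // Non-quoted field: trimmed
                int sep = line.indexOf(separator, pos);
                int end = sep < 0 ? line.length() : sep;
                fields.add(line.substring(pos, end).trim());
                pos = end + 1;
            }
        }
        return fields;
    }

    /** Skips whitespace, but never the separator itself (which may be a tab). */
    private static int skipWhitespace(String line, int from, char separator) {
        int i = from;
        while (i < line.length() && line.charAt(i) != separator
                && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("a, b ,\" c,\"\"d\"\" \",=\"007\"", ','));
    }
}
```

    Passing '\t' as the separator applies the same rules to a tab-separated line.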

    The first line may be treated as a header row. The header row is accessed separately from the data rows.

    Blank lines are ignored. Lines may be commented with hash '#' or semicolon ';'.
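
    As a rough illustration of the line handling, the following plain-Java sketch splits text on CR, LF or CRLF and drops blank and commented lines. The class CsvLineFilterSketch is hypothetical and not part of the library. It assumes that a whitespace-only line counts as blank and that a comment marker is only recognized as the first character of a line.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of line splitting and filtering; hypothetical, not the library's code. */
final class CsvLineFilterSketch {

    /** Returns the lines that would carry content: neither blank nor commented. */
    static List<String> contentLines(String text) {
        List<String> result = new ArrayList<>();
        // CRLF is listed first so that it is treated as a single line break
        for (String line : text.split("\r\n|\r|\n")) {
            if (line.trim().isEmpty()) {
                continue;  // assumption: whitespace-only lines count as blank
            }
            if (line.startsWith("#") || line.startsWith(";")) {
                continue;  // assumption: marker must be the first character
            }
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(contentLines("h1,h2\r\n\r\n# note\r1,2\n;x\n3,4"));
    }
}
```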

    • Method Summary

      Modifier and Type Method Description
      boolean containsHeader​(java.lang.String header)
      Checks if the header is known.
      boolean containsHeader​(java.util.regex.Pattern headerPattern)
      Checks if the header pattern is known.
      boolean equals​(java.lang.Object obj)
      Checks if this CSV file equals another.
      int hashCode()
      Returns a suitable hash code for the CSV file.
      com.google.common.collect.ImmutableList<java.lang.String> headers()
      Gets the header row.
      static CsvFile of​(com.google.common.io.CharSource source, boolean headerRow)
      Parses the specified source as a CSV file, using a comma as the separator.
      static CsvFile of​(com.google.common.io.CharSource source, boolean headerRow, char separator)
      Parses the specified source as a CSV file where the separator is specified and might not be a comma.
      static CsvFile of​(java.io.Reader reader, boolean headerRow)
      Parses the specified reader as a CSV file, using a comma as the separator.
      static CsvFile of​(java.io.Reader reader, boolean headerRow, char separator)
      Parses the specified reader as a CSV file where the separator is specified and might not be a comma.
      static CsvFile of​(java.util.List<java.lang.String> headers, java.util.List<? extends java.util.List<java.lang.String>> rows)
      Obtains an instance from a list of headers and rows.
      CsvRow row​(int index)
      Gets a single row.
      int rowCount()
      Gets the number of data rows.
      com.google.common.collect.ImmutableList<CsvRow> rows()
      Gets all data rows in the file.
      java.lang.String toString()
      Returns a string describing the CSV file.
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
    • Method Detail

      • of

        public static CsvFile of​(com.google.common.io.CharSource source,
                                 boolean headerRow)
        Parses the specified source as a CSV file, using a comma as the separator.

        CSV files sometimes contain a Unicode Byte Order Mark. Callers are responsible for handling this, such as by using UnicodeBom.

        Parameters:
        source - the CSV file resource
        headerRow - whether the source has a header row; an empty source must still contain the header
        Returns:
        the CSV file
        Throws:
        java.io.UncheckedIOException - if an IO exception occurs
        java.lang.IllegalArgumentException - if the file cannot be parsed
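
        To illustrate what handling a Byte Order Mark involves, here is a minimal plain-Java sketch that removes a leading U+FEFF from text that has already been decoded. The class BomSketch is hypothetical; it is not the UnicodeBom class and makes no claim about how that class works. It does not detect the encoding from the raw bytes.

```java
/** Illustrative sketch only; hypothetical, and unrelated to the library's UnicodeBom. */
final class BomSketch {

    /** Removes a single leading U+FEFF character, if present. */
    static String stripBom(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    public static void main(String[] args) {
        System.out.println(stripBom("\uFEFFa,b").length());
    }
}
```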
      • of

        public static CsvFile of​(com.google.common.io.CharSource source,
                                 boolean headerRow,
                                 char separator)
        Parses the specified source as a CSV file where the separator is specified and might not be a comma.

        This overload allows the separator to be controlled. For example, a tab-separated file is very similar to a CSV file; the only difference is the separator.

        CSV files sometimes contain a Unicode Byte Order Mark. Callers are responsible for handling this, such as by using UnicodeBom.

        Parameters:
        source - the file resource
        headerRow - whether the source has a header row; an empty source must still contain the header
        separator - the separator used to separate each field, typically a comma, but a tab is sometimes used
        Returns:
        the CSV file
        Throws:
        java.io.UncheckedIOException - if an IO exception occurs
        java.lang.IllegalArgumentException - if the file cannot be parsed
      • of

        public static CsvFile of​(java.io.Reader reader,
                                 boolean headerRow)
        Parses the specified reader as a CSV file, using a comma as the separator.

        This factory method takes a Reader. Callers are encouraged to use CharSource instead of Reader as it allows the resource to be safely managed.

        This overload always uses a comma as the separator. To control the separator, use the overload that takes a separator argument.

        CSV files sometimes contain a Unicode Byte Order Mark. Callers are responsible for handling this, such as by using UnicodeBom.

        Parameters:
        reader - the file resource
        headerRow - whether the source has a header row; an empty source must still contain the header
        Returns:
        the CSV file
        Throws:
        java.io.UncheckedIOException - if an IO exception occurs
        java.lang.IllegalArgumentException - if the file cannot be parsed
      • of

        public static CsvFile of​(java.io.Reader reader,
                                 boolean headerRow,
                                 char separator)
        Parses the specified reader as a CSV file where the separator is specified and might not be a comma.

        This factory method takes a Reader. Callers are encouraged to use CharSource instead of Reader as it allows the resource to be safely managed.

        This factory method allows the separator to be controlled. For example, a tab-separated file is very similar to a CSV file; the only difference is the separator.

        CSV files sometimes contain a Unicode Byte Order Mark. Callers are responsible for handling this, such as by using UnicodeBom.

        Parameters:
        reader - the file resource
        headerRow - whether the source has a header row; an empty source must still contain the header
        separator - the separator used to separate each field, typically a comma, but a tab is sometimes used
        Returns:
        the CSV file
        Throws:
        java.io.UncheckedIOException - if an IO exception occurs
        java.lang.IllegalArgumentException - if the file cannot be parsed
      • of

        public static CsvFile of​(java.util.List<java.lang.String> headers,
                                 java.util.List<? extends java.util.List<java.lang.String>> rows)
        Obtains an instance from a list of headers and rows.

        The headers may be an empty list. All the rows must contain a list of the same size, matching the header if present.

        Parameters:
        headers - the headers, empty if no headers
        rows - the data rows
        Returns:
        the CSV file
        Throws:
        java.lang.IllegalArgumentException - if the rows do not match the headers
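
        The size constraint described above can be sketched as a standalone check. The class CsvRowsCheckSketch and method checkRows are hypothetical names, and the library's own validation and error messages may differ. When the header list is empty, the sketch assumes the first row sets the expected size.

```java
import java.util.List;

/** Illustrative sketch of the row-size check; hypothetical, not the library's code. */
final class CsvRowsCheckSketch {

    /** Throws if any row differs in size from the headers (or from the first row if no headers). */
    static void checkRows(List<String> headers, List<? extends List<String>> rows) {
        int expected;
        if (!headers.isEmpty()) {
            expected = headers.size();
        } else {
            // assumption: with no headers, the first row defines the width
            expected = rows.isEmpty() ? 0 : rows.get(0).size();
        }
        for (int i = 0; i < rows.size(); i++) {
            int actual = rows.get(i).size();
            if (actual != expected) {
                throw new IllegalArgumentException(
                        "Row " + i + " has " + actual + " fields, expected " + expected);
            }
        }
    }

    public static void main(String[] args) {
        checkRows(List.of("a", "b"), List.of(List.of("1", "2"), List.of("3", "4")));
        System.out.println("rows are consistent");
    }
}
```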
      • headers

        public com.google.common.collect.ImmutableList<java.lang.String> headers()
        Gets the header row.

        If there is no header row, an empty list is returned.

        Returns:
        the header row
      • rows

        public com.google.common.collect.ImmutableList<CsvRow> rows()
        Gets all data rows in the file.
        Returns:
        the data rows
      • rowCount

        public int rowCount()
        Gets the number of data rows.
        Returns:
        the number of data rows
      • row

        public CsvRow row​(int index)
        Gets a single row.
        Parameters:
        index - the row index, zero-based
        Returns:
        the row
      • containsHeader

        public boolean containsHeader​(java.lang.String header)
        Checks if the header is known.

        Matching is case insensitive.

        Parameters:
        header - the column header to match
        Returns:
        true if the header is known
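
        A case-insensitive header check can be pictured with the plain-Java sketch below. HeaderMatchSketch is a hypothetical class; the documentation does not say how the library folds case, so the use of String.equalsIgnoreCase here is an assumption.

```java
import java.util.List;

/** Illustrative sketch of case-insensitive header matching; hypothetical, not the library's code. */
final class HeaderMatchSketch {

    /** Returns true if any header equals the requested name, ignoring case. */
    static boolean containsHeader(List<String> headers, String header) {
        for (String candidate : headers) {
            if (candidate.equalsIgnoreCase(header)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsHeader(List.of("Name", "Value"), "name"));
    }
}
```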
      • containsHeader

        public boolean containsHeader​(java.util.regex.Pattern headerPattern)
        Checks if the header pattern is known.

        Matching is case insensitive.

        Parameters:
        headerPattern - the header pattern to match
        Returns:
        true if the header is known
      • equals

        public boolean equals​(java.lang.Object obj)
        Checks if this CSV file equals another.

        The comparison checks the content.

        Overrides:
        equals in class java.lang.Object
        Parameters:
        obj - the other file, null returns false
        Returns:
        true if equal
      • hashCode

        public int hashCode()
        Returns a suitable hash code for the CSV file.
        Overrides:
        hashCode in class java.lang.Object
        Returns:
        the hash code
      • toString

        public java.lang.String toString()
        Returns a string describing the CSV file.
        Overrides:
        toString in class java.lang.Object
        Returns:
        the descriptive string