Class XmlFile


  • public final class XmlFile
    extends Object
    An XML file.

    Represents an XML file together with the ability to parse it from a ByteSource.

    This uses the standard StAX API to parse the file. Once parsed, the XML is represented as a DOM-like structure (see XmlElement). This approach is suitable for XML files whose parsed form is known to fit comfortably in memory.
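
    As a rough illustration of this approach, the sketch below uses the JDK's StAX XMLStreamReader to build a small in-memory tree. The StaxTreeSketch class and its Node type are hypothetical names invented for this example. They are not the XmlFile implementation or the XmlElement API, and wrapping parse failures in IllegalArgumentException is only an assumption chosen to resemble the exceptions documented for this class.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: a StAX pull parser feeding a simple DOM-like tree.
class StaxTreeSketch {

  // Hypothetical node type for illustration only (not XmlElement).
  static final class Node {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<Node> children = new ArrayList<>();
    String content = "";

    Node(String name) {
      this.name = name;
    }
  }

  // Parses the bytes and returns the root node of the tree.
  static Node parse(byte[] xml) {
    try {
      XMLInputFactory factory = XMLInputFactory.newFactory();
      // Defensive settings: do not process DTDs or external entities.
      factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
      factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
      XMLStreamReader reader = factory.createXMLStreamReader(new ByteArrayInputStream(xml));
      try {
        while (reader.next() != XMLStreamConstants.START_ELEMENT) {
          // skip the prolog until the root start tag is reached
        }
        return parseElement(reader);
      } finally {
        reader.close();
      }
    } catch (XMLStreamException ex) {
      throw new IllegalArgumentException("Unable to parse XML", ex);
    }
  }

  // Reads one element; the reader must be positioned on its start tag.
  private static Node parseElement(XMLStreamReader reader) throws XMLStreamException {
    Node node = new Node(reader.getLocalName());
    for (int i = 0; i < reader.getAttributeCount(); i++) {
      node.attributes.put(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
    }
    StringBuilder text = new StringBuilder();
    while (reader.hasNext()) {
      int event = reader.next();
      if (event == XMLStreamConstants.START_ELEMENT) {
        node.children.add(parseElement(reader));
      } else if (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.CDATA) {
        text.append(reader.getText());
      } else if (event == XMLStreamConstants.END_ELEMENT) {
        break;
      }
      // comments and processing instructions are ignored
    }
    // Text is kept only for leaf elements, so mixed content is discarded.
    if (node.children.isEmpty()) {
      node.content = text.toString();
    }
    return node;
  }
}
```

    In this sketch the whole tree is held in memory at once, which is why the approach only suits input of a manageable size.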

    Note that the XmlElement representation does not express all XML features. No support is provided for processing instructions, comments or mixed content. In addition, it is not possible to determine the difference between empty content and no children.

    There is no support for namespaces. All namespace prefixes are dropped, so elements are identified by their unprefixed name only. This can be a problem when elements from different namespaces share the same name, but most of the time lenient parsing is helpful.
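
    To make the trade-off concrete, the following sketch shows one way prefixes can be dropped with StAX, by reading only the local part of each element name. LocalNameSketch is a hypothetical helper written for this example. It illustrates the general technique under that assumption and does not describe how XmlFile is implemented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: collect element names with namespace prefixes dropped.
class LocalNameSketch {

  // Returns the local name of every element, in document order.
  static List<String> localNames(String xml) {
    try {
      XMLStreamReader reader =
          XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
      List<String> names = new ArrayList<>();
      while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT) {
          // getLocalName() omits the prefix, so a:id and b:id both become "id"
          names.add(reader.getLocalName());
        }
      }
      reader.close();
      return names;
    } catch (XMLStreamException ex) {
      throw new IllegalArgumentException("Unable to parse XML", ex);
    }
  }
}
```

    Given a document containing both <a:id> and <b:id> bound to different namespaces, this sketch reports the name "id" twice, which is the kind of collision that dropping prefixes can cause.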

    • Method Detail

      • of

        public static XmlFile of(ByteSource source)
        Parses the specified source as an XML file to an in-memory DOM-like structure.

        This parses the specified byte source expecting an XML file format. The resulting instance can be queried for the root element.

        Parameters:
        source - the XML source data
        Returns:
        the parsed file
        Throws:
        UncheckedIOException - if an IO exception occurs
        IllegalArgumentException - if the file cannot be parsed
      • of

        public static XmlFile of(ByteSource source,
                                 String refAttrName)
        Parses the specified source as an XML file to an in-memory DOM-like structure.

        This parses the specified byte source expecting an XML file format. The resulting instance can be queried for the root element.

        This supports capturing attribute references, such as an id/href pair. Wherever the parser finds an attribute with the specified name, the element is added to the internal map, accessible by calling getReferences().

        For example, if one part of the XML has <foo id="fooId">, the references map will contain an entry mapping "fooId" to the parsed element <foo>.
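
        The idea of capturing references can be sketched with plain StAX as follows. ReferenceCaptureSketch, its Node type and the captureReferences method are hypothetical names used only for illustration. The handling of duplicate reference values here (the later element wins) is an assumption of the sketch, not documented behaviour of this class.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: build a map from reference attribute value to element.
class ReferenceCaptureSketch {

  // Hypothetical node type for illustration only (not XmlElement).
  static final class Node {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();

    Node(String name) {
      this.name = name;
    }
  }

  // Scans the XML and records every element that has the reference attribute.
  static Map<String, Node> captureReferences(String xml, String refAttrName) {
    Map<String, Node> references = new LinkedHashMap<>();
    try {
      XMLStreamReader reader =
          XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
      while (reader.hasNext()) {
        if (reader.next() == XMLStreamConstants.START_ELEMENT) {
          Node node = new Node(reader.getLocalName());
          for (int i = 0; i < reader.getAttributeCount(); i++) {
            node.attributes.put(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
          }
          String ref = node.attributes.get(refAttrName);
          if (ref != null) {
            // assumption of this sketch: a later duplicate replaces the earlier entry
            references.put(ref, node);
          }
        }
      }
      reader.close();
    } catch (XMLStreamException ex) {
      throw new IllegalArgumentException("Unable to parse XML", ex);
    }
    return references;
  }
}
```

        With refAttrName set to "id", an element such as <buyer href="p1"/> can then be resolved by looking up "p1" in the returned map.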

        Parameters:
        source - the XML source data
        refAttrName - the attribute name that should be parsed as a reference
        Returns:
        the parsed file
        Throws:
        UncheckedIOException - if an IO exception occurs
        IllegalArgumentException - if the file cannot be parsed
      • parseElements

        public static XmlElement parseElements(ByteSource source,
                                               ToIntFunction<String> filterFn)
        Parses the element names and structure from the specified XML, filtering to reduce memory usage.

        This parses the specified byte source expecting an XML file format. The filter function takes the element name and decides how many child levels should be returned in the response. A function that always returns Integer.MAX_VALUE does not filter the children. For example, a function could check if the name is "trade" and return 1 so that only the immediate children are returned.
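
        A minimal sketch of depth-limited parsing is shown below. DepthFilterSketch and its Node type are hypothetical, and the rule used here (an element's limit is the smaller of the limit inherited from its parent and the value the filter returns for its own name) is an assumption about how such a filter could be interpreted, not a statement of how parseElements is implemented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: parse element names only, limiting depth per element.
class DepthFilterSketch {

  // Hypothetical node type holding just the name and child structure.
  static final class Node {
    final String name;
    final List<Node> children = new ArrayList<>();

    Node(String name) {
      this.name = name;
    }
  }

  static Node parseNames(String xml, ToIntFunction<String> filterFn) {
    try {
      XMLStreamReader reader =
          XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
      while (reader.next() != XMLStreamConstants.START_ELEMENT) {
        // skip the prolog until the root start tag is reached
      }
      Node root = parseElement(reader, filterFn, Integer.MAX_VALUE);
      reader.close();
      return root;
    } catch (XMLStreamException ex) {
      throw new IllegalArgumentException("Unable to parse XML", ex);
    }
  }

  // 'inherited' is the number of child levels still allowed by ancestors.
  private static Node parseElement(
      XMLStreamReader reader, ToIntFunction<String> filterFn, int inherited)
      throws XMLStreamException {
    Node node = new Node(reader.getLocalName());
    int levels = Math.min(inherited, filterFn.applyAsInt(node.name));
    while (reader.hasNext()) {
      int event = reader.next();
      if (event == XMLStreamConstants.START_ELEMENT) {
        if (levels > 0) {
          node.children.add(parseElement(reader, filterFn, levels - 1));
        } else {
          skipElement(reader); // filtered out: consume without storing
        }
      } else if (event == XMLStreamConstants.END_ELEMENT) {
        break;
      }
    }
    return node;
  }

  // Consumes events up to the end tag matching the current start tag.
  private static void skipElement(XMLStreamReader reader) throws XMLStreamException {
    int depth = 1;
    while (depth > 0) {
      int event = reader.next();
      if (event == XMLStreamConstants.START_ELEMENT) {
        depth++;
      } else if (event == XMLStreamConstants.END_ELEMENT) {
        depth--;
      }
    }
  }
}
```

        In this sketch, a filter such as name -> "trade".equals(name) ? 1 : Integer.MAX_VALUE keeps each trade element and its immediate children while discarding anything nested more deeply, so the skipped subtrees never occupy memory.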

        Parameters:
        source - the XML source data
        filterFn - the filter function to use
        Returns:
        the parsed element
        Throws:
        UncheckedIOException - if an IO exception occurs
        IllegalArgumentException - if the file cannot be parsed
      • getRoot

        public XmlElement getRoot()
        Gets the root element of this file.
        Returns:
        the root element
      • getReferences

        public ImmutableMap<String,XmlElement> getReferences()
        Gets the reference map of id to element.

        This is used to decode references, such as an id/href pair.

        For example, if one part of the XML has <foo id="fooId">, the map will contain an entry mapping "fooId" to the parsed element <foo>.

        Returns:
        the map of id to element
      • equals

        public boolean equals(Object obj)
        Checks if this file equals another.

        The comparison checks the content and reference map.

        Overrides:
        equals in class Object
        Parameters:
        obj - the other file, null returns false
        Returns:
        true if equal
      • hashCode

        public int hashCode()
        Returns a suitable hash code for the file.
        Overrides:
        hashCode in class Object
        Returns:
        the hash code
      • toString

        public String toString()
        Returns a string describing the file.
        Overrides:
        toString in class Object
        Returns:
        the descriptive string