Class Document


  • public class Document
    extends ContainerNode
    Represents the root of an XML document, containing the document element and preserving document-level formatting like XML declarations and DTDs.

    The Document class serves as the top-level container for an XML document, maintaining the document element along with document-level metadata such as XML declarations, DOCTYPE declarations, and encoding information. It preserves the exact formatting of these elements during round-trip parsing and serialization.

    Document Properties:

    • XML Declaration - Maintains original XML declaration formatting
    • DOCTYPE Support - Preserves DOCTYPE declarations exactly as written
    • Encoding - Tracks document encoding information
    • Version - Maintains XML version information
    • Standalone Flag - Preserves standalone document declarations

    Usage Examples:

    
     // Create documents using factory methods
     Document doc = Document.of(); // Empty document
     Document parsed = Document.of(xmlString); // Parse XML from String
     Document fromStream = Document.of(inputStream); // Parse XML from InputStream
     Document fromFile = Document.of(Paths.get("config.xml")); // Parse XML from file
     Document withDecl = Document.withXmlDeclaration("1.0", "UTF-8");
     Document complete = Document.withRootElement("project");
    
     // Set the root element
     Element root = Element.of("root");
     doc.root(root);
    
     // Access document properties
     String encoding = doc.encoding(); // "UTF-8"
     String version = doc.version();   // "1.0"
    
     // Complex documents using fluent API
     Document complex = Document.of()
         .version("1.1")
         .encoding("UTF-8")
         .standalone(true)
         .root(Element.of("project"))
         .withXmlDeclaration();
     

    Document Structure:

    A Document can contain:

    • At most one document element (root element); a well-formed document has exactly one
    • Zero or more comments and processing instructions
    • Whitespace between top-level nodes
    • An optional XML declaration
    • An optional DOCTYPE declaration
    See Also:
    Element, ContainerNode, Parser
    • Constructor Detail

      • Document

        public Document()
        Creates a new empty XML document with default settings.

        Initializes the document with UTF-8 encoding, XML version 1.0, and standalone set to false. The XML declaration and DOCTYPE are initially empty.

    • Method Detail

      • parent

        public Document parent​(ContainerNode parent)
        Sets the parent container node of this node.

        This method is typically called automatically when adding nodes to containers. Call it manually only with care, because a mismatched parent reference can leave the tree inconsistent.

        Overrides:
        parent in class Node
        Parameters:
        parent - the parent container node to set, or null to clear the parent
        Returns:
        this document for method chaining
        See Also:
        Node.parent()
      • xmlDeclaration

        public java.lang.String xmlDeclaration()
        Gets the XML declaration string for this document.

        The XML declaration typically contains version, encoding, and standalone information, formatted as: <?xml version="1.0" encoding="UTF-8"?>
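
        To illustrate the format described above, here is a minimal sketch that pulls the version, encoding, and standalone values out of a declaration string using only the JDK. The class XmlDeclarationFields and its parse method are hypothetical helpers written for this example; they are not part of the Document API and do not show how the library itself reads declarations.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Document API.
class XmlDeclarationFields {
    // Matches name="value" or name='value' for the three pseudo-attributes.
    private static final Pattern PSEUDO_ATTR =
            Pattern.compile("(version|encoding|standalone)\\s*=\\s*([\"'])(.*?)\\2");

    // Returns the pseudo-attributes found in the declaration, in order of appearance.
    static Map<String, String> parse(String declaration) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = PSEUDO_ATTR.matcher(declaration);
        while (m.find()) {
            fields.put(m.group(1), m.group(3));
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints {version=1.0, encoding=UTF-8}
        System.out.println(parse("<?xml version=\"1.0\" encoding=\"UTF-8\"?>"));
    }
}
```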

        Returns:
        the XML declaration string, or empty string if none is set
        See Also:
        xmlDeclaration(String)
      • xmlDeclaration

        public Document xmlDeclaration​(java.lang.String xmlDeclaration)
        Sets the XML declaration for this document.

        The XML declaration should be a complete declaration including the opening <?xml and closing ?> tags. Setting this value marks the document as modified.

        Example: <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

        Parameters:
        xmlDeclaration - the XML declaration string, or null to clear it
        Returns:
        this document for method chaining
        See Also:
        xmlDeclaration()
      • doctype

        public java.lang.String doctype()
        Gets the DOCTYPE declaration for this document.

        The DOCTYPE declaration defines the document type and may include references to external DTD files or inline DTD definitions.

        Returns:
        the DOCTYPE declaration string, or empty string if none is set
        See Also:
        doctype(String)
      • doctype

        public Document doctype​(java.lang.String doctype)
        Sets the DOCTYPE declaration for this document.

        The DOCTYPE declaration should be a complete declaration including the opening <!DOCTYPE and closing > tags. Setting this value marks the document as modified.

        Example: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

        Parameters:
        doctype - the DOCTYPE declaration string, or null to clear it
        Returns:
        this document for method chaining
        See Also:
        doctype()
      • doctypePrecedingWhitespace

        public java.lang.String doctypePrecedingWhitespace()
        Gets the whitespace before the DOCTYPE declaration.

        This whitespace appears between the XML declaration and the DOCTYPE declaration. It is preserved during round-trip parsing and serialization to maintain document fidelity.

        Returns:
        the whitespace before the DOCTYPE declaration, or empty string if none
      • root

        public Element root()
        Gets the root element of this document.

        The document element is the top-level element that contains all other elements in the document. Every well-formed XML document must have exactly one document element.

        Returns:
        the root element, or null if none is set
        See Also:
        root(Element)
      • root

        public Document root​(Element root)
        Sets the root element of this document.

        The document element becomes the top-level element containing all other elements. Setting this value marks the document as modified and establishes the parent-child relationship.

        Parameters:
        root - the element to set as the document root, or null to clear it
        Returns:
        this document for method chaining
        See Also:
        root(), ContainerNode.addChild(Node)
      • encoding

        public java.lang.String encoding()
        Gets the character encoding for this document.

        The encoding specifies how the document's characters are encoded. Common values include "UTF-8", "UTF-16", and "ISO-8859-1".

        Returns:
        the document encoding, defaults to "UTF-8"
        See Also:
        encoding(String)
      • encoding

        public Document encoding​(java.lang.String encoding)
        Sets the character encoding used when serializing this document.

        If encoding is null, the default "UTF-8" is used. This method marks the document as modified.

        Parameters:
        encoding - the character encoding to use, or null to reset to the default "UTF-8"
        Returns:
        this document for method chaining
        See Also:
        encoding()
      • version

        public java.lang.String version()
        Gets the XML version for this document.

        The XML version indicates which version of the XML specification this document conforms to. Common values are "1.0" and "1.1".

        Returns:
        the XML version, defaults to "1.0"
        See Also:
        version(String)
      • version

        public Document version​(java.lang.String version)
        Sets the XML version of this document.

        Marks the document as modified.

        Parameters:
        version - the XML version to use, or null to use "1.0"
        Returns:
        this document for method chaining
        See Also:
        version()
      • isStandalone

        public boolean isStandalone()
        Gets the standalone flag for this document.

        The standalone flag indicates whether the document is self-contained or depends on external markup declarations. When true, the document declares that it has no external dependencies.

        Returns:
        true if the document is standalone, false otherwise
        See Also:
        standalone(boolean)
      • standalone

        public Document standalone​(boolean standalone)
        Sets the standalone flag for this document.

        Setting this value marks the document as modified. The standalone flag affects the XML declaration output.

        Parameters:
        standalone - true if the document is standalone, false otherwise
        Returns:
        this document for method chaining
        See Also:
        isStandalone()
      • bom

        public Document bom​(boolean bom)
        Sets whether a Byte Order Mark (BOM) should be written when serializing to an OutputStream.
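
        For readers unfamiliar with BOMs, the sketch below shows what writing one means at the byte level: a fixed marker sequence is emitted before the encoded text. BomWriterSketch, bomFor, and serialize are hypothetical names for this example, and which encodings the library actually emits a BOM for is an assumption not stated on this page.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not the library's serializer.
class BomWriterSketch {
    // Standard BOM byte sequences for the common Unicode encodings.
    static byte[] bomFor(Charset charset) {
        if (charset.equals(StandardCharsets.UTF_8)) {
            return new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
        }
        if (charset.equals(StandardCharsets.UTF_16BE)) {
            return new byte[] {(byte) 0xFE, (byte) 0xFF};
        }
        if (charset.equals(StandardCharsets.UTF_16LE)) {
            return new byte[] {(byte) 0xFF, (byte) 0xFE};
        }
        return new byte[0]; // no BOM defined for this charset in the sketch
    }

    // Encodes the text and, if requested, prefixes it with the BOM.
    static byte[] serialize(String xml, Charset charset, boolean bom) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (bom) {
            byte[] mark = bomFor(charset);
            out.write(mark, 0, mark.length);
        }
        byte[] body = xml.getBytes(charset);
        out.write(body, 0, body.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // prints 7 (3 BOM bytes + 4 content bytes)
        System.out.println(serialize("<a/>", StandardCharsets.UTF_8, true).length);
    }
}
```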
        Parameters:
        bom - true to write a BOM, false otherwise
        Returns:
        this document for method chaining
        Since:
        1.0.0
        See Also:
        hasBom()
      • toXml

        public void toXml​(java.lang.StringBuilder sb)
        Serializes this document to XML, appending to the provided StringBuilder.

        This method preserves the original formatting including XML declaration, DOCTYPE declaration, whitespace, and all child nodes. The output includes:

        • XML declaration (if present)
        • DOCTYPE declaration (if present)
        • Preceding whitespace
        • All child nodes (comments, processing instructions, elements)
        • Document element (if not already included in children)
        • Following whitespace
        Specified by:
        toXml in class Node
        Parameters:
        sb - the StringBuilder to append the XML content to
        See Also:
        Node.toXml()
      • accept

        public DomTripVisitor.Action accept​(DomTripVisitor visitor)
        Accepts a visitor for depth-first tree traversal of the entire document.

        Visits all children of the document (comments, processing instructions, and the root element) in document order.
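
        The traversal order can be pictured with a small stand-alone sketch. Every type below (VisitorSketch, TreeNode, Visitor, and an Action enum with only CONTINUE and STOP) is a hypothetical stand-in; the real DomTripVisitor interface and the values of DomTripVisitor.Action are not described on this page and may differ.

```java
import java.util.ArrayList;
import java.util.List;

// All types here are hypothetical stand-ins, not DomTrip classes.
class VisitorSketch {
    enum Action { CONTINUE, STOP }

    interface Visitor {
        Action visit(TreeNode node);
    }

    static final class TreeNode {
        final String label;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String label, TreeNode... kids) {
            this.label = label;
            for (TreeNode kid : kids) {
                children.add(kid);
            }
        }

        // Visits this node first, then each child in document order (depth-first).
        Action accept(Visitor visitor) {
            if (visitor == null) {
                throw new IllegalArgumentException("visitor must not be null");
            }
            if (visitor.visit(this) == Action.STOP) {
                return Action.STOP;
            }
            for (TreeNode child : children) {
                if (child.accept(visitor) == Action.STOP) {
                    return Action.STOP; // propagate the stop request upward
                }
            }
            return Action.CONTINUE;
        }
    }

    // Collects node labels in the order the visitor sees them.
    static List<String> order(TreeNode root) {
        List<String> seen = new ArrayList<>();
        root.accept(node -> {
            seen.add(node.label);
            return Action.CONTINUE;
        });
        return seen;
    }

    public static void main(String[] args) {
        TreeNode doc = new TreeNode("document",
                new TreeNode("comment"),
                new TreeNode("project", new TreeNode("name")));
        // prints [document, comment, project, name]
        System.out.println(order(doc));
    }
}
```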

        Specified by:
        accept in class Node
        Parameters:
        visitor - the visitor to accept
        Returns:
        the action indicating how traversal should proceed
        Throws:
        java.lang.IllegalArgumentException - if visitor is null
        Since:
        1.3.0
        See Also:
        DomTripVisitor
      • toXml

        public void toXml​(java.io.OutputStream outputStream)
                   throws DomTripException
        Serializes this document to an OutputStream using the document's encoding.

        This method uses the document's encoding property to determine the character encoding for the output stream. If the document has no encoding specified, UTF-8 is used as the default.
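
        The fallback rule above can be sketched with plain JDK calls: resolve the encoding name to a Charset, defaulting to UTF-8 when none is given, then encode the text. EncodingFallbackSketch and its methods are hypothetical names for this example. Treating an empty name like a missing one is an assumption of the sketch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not the library's serializer.
class EncodingFallbackSketch {
    // Resolves an encoding name, falling back to UTF-8 when none is given.
    // Charset.forName throws an unchecked exception for unknown names.
    static Charset resolve(String encodingName) {
        if (encodingName == null || encodingName.isEmpty()) {
            return StandardCharsets.UTF_8;
        }
        return Charset.forName(encodingName);
    }

    // Encodes the XML text with the resolved charset.
    static byte[] serialize(String xml, String encodingName) {
        return xml.getBytes(resolve(encodingName));
    }

    public static void main(String[] args) {
        String xml = "<n>\u00e9</n>"; // 8 characters, one of them non-ASCII
        System.out.println(serialize(xml, null).length);         // prints 9
        System.out.println(serialize(xml, "ISO-8859-1").length); // prints 8
    }
}
```

        The same text produces different byte counts because the non-ASCII character takes two bytes in UTF-8 and one in ISO-8859-1.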

        Parameters:
        outputStream - the OutputStream to write to
        Throws:
        DomTripException - if serialization fails or I/O errors occur
      • toXml

        public void toXml​(java.io.OutputStream outputStream,
                          java.nio.charset.Charset charset)
                   throws DomTripException
        Serializes this document to an OutputStream using the specified charset.

        This method allows explicit control over the character encoding used for serialization, regardless of the document's encoding property.

        Parameters:
        outputStream - the OutputStream to write to
        charset - the character encoding to use
        Throws:
        DomTripException - if serialization fails or I/O errors occur
      • toXml

        public void toXml​(java.io.OutputStream outputStream,
                          java.lang.String encoding)
                   throws DomTripException
        Serializes this document to an OutputStream using the specified encoding.

        This method allows explicit control over the character encoding used for serialization, regardless of the document's encoding property.

        Parameters:
        outputStream - the OutputStream to write to
        encoding - the character encoding name to use
        Throws:
        DomTripException - if serialization fails or I/O errors occur
      • generateXmlDeclaration

        public java.lang.String generateXmlDeclaration()
        Creates a minimal XML declaration based on current document settings.

        Generates an XML declaration using the current version, encoding, and standalone settings. The declaration follows the standard format:

        <?xml version="1.0" encoding="UTF-8" standalone="yes"?>

        The standalone attribute is only included if the standalone flag is true.
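
        The rule above can be sketched as a small string builder. XmlDeclarationSketch.generate is a hypothetical helper written for this example, not the library's implementation; applying the "1.0" and "UTF-8" defaults to null arguments mirrors the defaults documented for version(String) and encoding(String) and is otherwise an assumption.

```java
// Hypothetical helper mirroring the documented format; not the library's code.
class XmlDeclarationSketch {
    static String generate(String version, String encoding, boolean standalone) {
        String v = (version != null) ? version : "1.0";     // assumed default
        String e = (encoding != null) ? encoding : "UTF-8"; // assumed default
        StringBuilder sb = new StringBuilder("<?xml version=\"").append(v).append('"');
        sb.append(" encoding=\"").append(e).append('"');
        if (standalone) {
            // Emitted only when the flag is true, as described above.
            sb.append(" standalone=\"yes\"");
        }
        return sb.append("?>").toString();
    }

    public static void main(String[] args) {
        // prints <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
        System.out.println(generate("1.0", "UTF-8", true));
        // prints <?xml version="1.1" encoding="UTF-16"?>
        System.out.println(generate("1.1", "UTF-16", false));
    }
}
```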

        Returns:
        a properly formatted XML declaration string
        See Also:
        version(), encoding(), isStandalone()
      • toString

        public java.lang.String toString()
        Returns a string representation of this document for debugging purposes.

        The string includes the XML version, encoding, and the name of the document element (if present).

        Overrides:
        toString in class java.lang.Object
        Returns:
        a string representation of this document
      • of

        public static Document of()
        Creates an empty document with default settings.
        Returns:
        a new empty Document
      • of

        public static Document of​(java.lang.String xml)
                           throws DomTripException
        Creates a document by parsing the provided XML string.

        This is a convenience method that combines document creation and XML parsing in a single call. It uses the default parser configuration.

        Parameters:
        xml - the XML string to parse
        Returns:
        a new Document containing the parsed XML
        Throws:
        DomTripException - if the XML is malformed or cannot be parsed
      • parseFragment

        public static java.util.List<Node> parseFragment​(java.lang.String xml)
                                                  throws DomTripException
        Parses an XML fragment into a list of nodes.

        This method parses an XML fragment that may contain multiple root-level elements, comments, processing instructions, and text nodes. Unlike of(String), which expects a well-formed XML document, this method handles fragments that don't have a single root element.
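
        To see why a fragment needs its own entry point, the sketch below uses the JDK's built-in DOM parser (not this library) to show that input with two top-level elements is rejected as a document, while the same text wrapped in a synthetic root is accepted. FragmentVsDocumentSketch is a hypothetical class for this example. How parseFragment handles multiple roots internally is not stated on this page, and wrapping is only one way to demonstrate the difference.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustration using the JDK parser only; unrelated to the library's parser.
class FragmentVsDocumentSketch {
    // Returns true if the text is a well-formed XML document (single root element).
    static boolean parsesAsDocument(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String fragment = "<foo>bar</foo><bar>baz</bar>";
        System.out.println(parsesAsDocument(fragment));                             // prints false
        System.out.println(parsesAsDocument("<wrapper>" + fragment + "</wrapper>")); // prints true
    }
}
```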

        Usage Examples:

        
         // Parse a fragment with multiple elements
         List<Node> nodes = Document.parseFragment("<foo>bar</foo><bar>baz</bar>");
        
         // Parse a fragment with comments and elements
         List<Node> mixed = Document.parseFragment(
             "<!-- comment -->\n<foo>bar</foo>\n<bar>baz</bar>");
         
        Parameters:
        xml - the XML fragment string to parse
        Returns:
        a list of parsed nodes
        Throws:
        DomTripException - if the XML fragment is malformed
      • of

        public static Document of​(java.io.InputStream inputStream)
                           throws DomTripException
        Creates a document by parsing XML from an InputStream with automatic encoding detection.

        This method automatically detects the character encoding by:

        1. Checking for a Byte Order Mark (BOM)
        2. Reading the XML declaration to extract the encoding attribute
        3. Falling back to UTF-8 if no encoding is specified

        The resulting Document will have its encoding property set to the detected or declared encoding.
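
        The three detection steps can be sketched with JDK classes alone. EncodingDetectionSketch.detect is a hypothetical helper, not the library's detector: the 200-byte look-ahead, the set of recognized BOMs, and the regular expression are all assumptions of the sketch. Reading the declaration this way only works for ASCII-compatible encodings, since UTF-16 input without a BOM would not match.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the library's detector.
class EncodingDetectionSketch {
    private static final Pattern ENCODING = Pattern.compile(
            "<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    static Charset detect(byte[] head, Charset fallback) {
        // Step 1: check for a Byte Order Mark.
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(head, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        if (startsWith(head, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;

        // Step 2: read the declaration. ISO-8859-1 maps every byte to one
        // character, so decoding the prefix with it cannot fail.
        int length = Math.min(head.length, 200); // assumed look-ahead window
        String prefix = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(prefix);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }

        // Step 3: nothing detected, use the fallback.
        return fallback;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] declared = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(declared, StandardCharsets.UTF_8)); // prints ISO-8859-1
    }
}
```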

        Parameters:
        inputStream - the InputStream containing XML data
        Returns:
        a new Document containing the parsed XML with preserved formatting
        Throws:
        DomTripException - if the XML is malformed, cannot be parsed, or I/O errors occur
      • of

        public static Document of​(java.io.InputStream inputStream,
                                  java.nio.charset.Charset defaultCharset)
                           throws DomTripException
        Creates a document by parsing XML from an InputStream with encoding detection and fallback.

        This method attempts to detect the character encoding by:

        1. Checking for a Byte Order Mark (BOM)
        2. Reading the XML declaration to extract the encoding attribute
        3. Using the provided default charset if detection fails

        The resulting Document will have its encoding property set to the detected, declared, or default encoding.

        Parameters:
        inputStream - the InputStream containing XML data
        defaultCharset - the charset to use if detection fails
        Returns:
        a new Document containing the parsed XML with preserved formatting
        Throws:
        DomTripException - if the XML is malformed, cannot be parsed, or I/O errors occur
      • of

        public static Document of​(java.io.InputStream inputStream,
                                  java.lang.String defaultEncoding)
                           throws DomTripException
        Creates a document by parsing XML from an InputStream with encoding detection and fallback.

        This method attempts to detect the character encoding by:

        1. Checking for a Byte Order Mark (BOM)
        2. Reading the XML declaration to extract the encoding attribute
        3. Using the provided default encoding if detection fails

        The resulting Document will have its encoding property set to the detected, declared, or default encoding.

        Parameters:
        inputStream - the InputStream containing XML data
        defaultEncoding - the encoding name to use if detection fails
        Returns:
        a new Document containing the parsed XML with preserved formatting
        Throws:
        DomTripException - if the XML is malformed, cannot be parsed, or I/O errors occur
      • of

        public static Document of​(java.nio.file.Path path)
                           throws DomTripException
        Creates a document by parsing XML from a file path with automatic encoding detection.

        This is a convenience method that combines file reading and XML parsing in a single call. It leverages the InputStream-based parsing with automatic encoding detection to properly handle various character encodings.

        The method automatically detects the character encoding by:

        1. Checking for a Byte Order Mark (BOM)
        2. Reading the XML declaration to extract the encoding attribute
        3. Falling back to UTF-8 if no encoding is specified

        Prefer this method when parsing XML files. Because the encoding is detected from the file's bytes, it avoids the mismatches that can occur when a file is first decoded into a String using an assumed charset.

        Usage Examples:

        
         // Parse XML file with automatic encoding detection
         Document doc = Document.of(Paths.get("config.xml"));
        
         // Works with various encodings
         Document utf8Doc = Document.of(Paths.get("utf8-file.xml"));
         Document utf16Doc = Document.of(Paths.get("utf16-file.xml"));
         Document isoDoc = Document.of(Paths.get("iso-8859-1-file.xml"));
        
         // Handle failures reported as DomTripException (configPath is a java.nio.file.Path)
         try {
             Document config = Document.of(configPath);
             Editor editor = new Editor(config);
             // ... edit document
         } catch (DomTripException e) {
             System.err.println("Failed to parse XML: " + e.getMessage());
         }
         
        Parameters:
        path - the path to the XML file to parse
        Returns:
        a new Document containing the parsed XML with preserved formatting
        Throws:
        DomTripException - if the file cannot be read, the XML is malformed, or cannot be parsed
        See Also:
        of(InputStream) for InputStream-based parsing, java.nio.file.Files for the underlying file reading
      • copy

        public Document copy()
        Creates a deep copy of this node.

        The copied node will have:

        • All properties copied from the original
        • All child nodes recursively copied (for container nodes)
        • Whitespace and formatting properties preserved
        • No parent (parent is set to null)

        The copied node and its descendants will have their parent-child relationships properly established within the copied subtree.
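
        The copy semantics listed above can be pictured with a tiny stand-alone tree. DeepCopySketch and its TreeNode class are hypothetical stand-ins, not the library's Node hierarchy; the sketch only shows the general technique of a recursive copy that leaves the new root without a parent and re-links each copied child to its copied parent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types for illustration; not DomTrip classes.
class DeepCopySketch {
    static final class TreeNode {
        final String name;
        TreeNode parent; // null for a detached root
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String name) {
            this.name = name;
        }

        // Appends a child and records this node as its parent.
        TreeNode add(TreeNode child) {
            child.parent = this;
            children.add(child);
            return this;
        }

        // Recursively copies the subtree; the copy's own parent stays null.
        TreeNode copy() {
            TreeNode duplicate = new TreeNode(name);
            for (TreeNode child : children) {
                duplicate.add(child.copy()); // copied child points at the copied parent
            }
            return duplicate;
        }
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode("document").add(new TreeNode("project"));
        TreeNode duplicate = root.children.get(0).copy();
        System.out.println(duplicate.parent == null); // prints true
    }
}
```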

        Specified by:
        copy in class Node
        Returns:
        a new node that is a deep copy of this node
        Since:
        1.1.0
      • clone

        @Deprecated
        public Document clone()
        Deprecated.
        Use copy() instead.
        Creates a deep copy of this document.
        Overrides:
        clone in class Node
        Returns:
        a new document that is a copy of this document
      • withXmlDeclaration

        public static Document withXmlDeclaration​(java.lang.String version,
                                                  java.lang.String encoding)
        Creates a document with XML declaration.

        Creates a document with the specified version and encoding, automatically generating an appropriate XML declaration.

        Parameters:
        version - the XML version (e.g., "1.0", "1.1"), or null for default "1.0"
        encoding - the character encoding (e.g., "UTF-8"), or null for default "UTF-8"
        Returns:
        a new Document with XML declaration
      • withXmlDeclaration

        public static Document withXmlDeclaration​(java.lang.String version,
                                                  java.lang.String encoding,
                                                  boolean standalone)
        Creates a document with XML declaration and standalone attribute.

        Creates a document with the specified version, encoding, and standalone flag, automatically generating an appropriate XML declaration.

        Parameters:
        version - the XML version, or null for default "1.0"
        encoding - the character encoding, or null for default "UTF-8"
        standalone - true if the document is standalone, false otherwise
        Returns:
        a new Document with XML declaration and standalone attribute
      • withRootElement

        public static Document withRootElement​(java.lang.String rootElementName)
                                        throws DomTripException
        Creates a document with a root element and XML declaration.

        Creates a complete document with XML declaration (version 1.0, UTF-8 encoding) and the specified root element.

        Parameters:
        rootElementName - the name of the root element
        Returns:
        a new Document with XML declaration and root element
        Throws:
        DomTripException
      • withDoctype

        public static Document withDoctype​(java.lang.String version,
                                           java.lang.String encoding,
                                           java.lang.String doctype)
        Creates a document with XML declaration and DOCTYPE.

        Creates a document with the specified version, encoding, and DOCTYPE declaration, automatically generating an appropriate XML declaration.

        Parameters:
        version - the XML version, or null for default "1.0"
        encoding - the character encoding, or null for default "UTF-8"
        doctype - the DOCTYPE declaration string
        Returns:
        a new Document with XML declaration and DOCTYPE
      • minimal

        public static Document minimal​(java.lang.String rootElementName)
                                throws DomTripException
        Creates a minimal document with just a root element (no XML declaration).

        Creates a simple document containing only the specified root element, without any XML declaration or DOCTYPE.

        Parameters:
        rootElementName - the name of the root element
        Returns:
        a new minimal Document with only a root element
        Throws:
        DomTripException
      • withXmlDeclaration

        public Document withXmlDeclaration()
        Generates and sets an XML declaration based on current document settings.

        The XML declaration will include the version, encoding, and standalone flag (if true) based on the current document configuration.

        Returns:
        this document for method chaining