Class Document
The Document class serves as the top-level container for an XML document, maintaining the document element along with document-level metadata such as XML declarations, DOCTYPE declarations, and encoding information. It preserves the exact formatting of these elements during round-trip parsing and serialization.
Document Properties:
- XML Declaration - Maintains original XML declaration formatting
- DOCTYPE Support - Preserves DOCTYPE declarations exactly as written
- Encoding - Tracks document encoding information
- Version - Maintains XML version information
- Standalone Flag - Preserves standalone document declarations
Usage Examples:
// Create documents using factory methods
Document doc = Document.of(); // Empty document
Document parsed = Document.of(xmlString); // Parse XML from String
Document fromStream = Document.of(inputStream); // Parse XML from InputStream
Document fromFile = Document.of(Paths.get("config.xml")); // Parse XML from file
Document withDecl = Document.withXmlDeclaration("1.0", "UTF-8");
Document complete = Document.withRootElement("project");
// Set the root element
Element root = Element.of("root");
doc.root(root);
// Access document properties
String encoding = doc.encoding(); // "UTF-8"
String version = doc.version(); // "1.0"
// Complex documents using fluent API
Document complex = Document.of()
    .version("1.1")
    .encoding("UTF-8")
    .standalone(true)
    .root(Element.of("project"))
    .withXmlDeclaration();
Document Structure:
A Document can contain:
- Exactly one document element (root element)
- Zero or more comments and processing instructions
- Whitespace between top-level nodes
- An optional XML declaration
- An optional DOCTYPE declaration
Nested Class Summary
Nested classes/interfaces inherited from class Node:
- Node.NodeType
Field Summary
Fields inherited from class ContainerNode:
- children
Fields inherited from class Node:
- modified, parent, precedingWhitespace
Constructor Summary
- Document(): Creates a new empty XML document with default settings.
Method Summary
- accept(DomTripVisitor visitor): Accepts a visitor for depth-first tree traversal of the entire document.
- bom(boolean bom): Sets whether a Byte Order Mark (BOM) should be written when serializing to an OutputStream.
- clone(): Deprecated.
- copy(): Creates a deep copy of this node.
- doctype(): Gets the DOCTYPE declaration for this document.
- doctype(String doctype): Sets the DOCTYPE declaration for this document.
- doctypePrecedingWhitespace(): Gets the whitespace before the DOCTYPE declaration.
- encoding(): Gets the character encoding for this document.
- encoding(String encoding): Sets the document's character encoding used for serialization.
- generateXmlDeclaration(): Creates a minimal XML declaration based on current document settings.
- boolean hasBom(): Returns whether this document had a Byte Order Mark (BOM) when it was parsed.
- boolean isStandalone(): Gets the standalone flag for this document.
- static Document minimal(String rootElementName): Creates a minimal document with just a root element (no XML declaration).
- static Document of(): Creates an empty document with default settings.
- static Document of(InputStream inputStream): Creates a document by parsing XML from an InputStream with automatic encoding detection.
- static Document of(InputStream inputStream, String defaultEncoding): Creates a document by parsing XML from an InputStream with encoding detection and fallback.
- static Document of(InputStream inputStream, Charset defaultCharset): Creates a document by parsing XML from an InputStream with encoding detection and fallback.
- static Document of(String xml): Creates a document by parsing the provided XML string.
- static Document of(Path path): Creates a document by parsing XML from a file path with automatic encoding detection.
- parent(ContainerNode parent): Sets the parent container node of this node.
- parseFragment(String xml): Parses an XML fragment into a list of nodes.
- root(): Gets the root element of this document.
- root(Element root): Sets the root element of this document.
- standalone(boolean standalone): Sets the standalone flag for this document.
- toString(): Returns a string representation of this document for debugging purposes.
- void toXml(OutputStream outputStream): Serializes this document to an OutputStream using the document's encoding.
- void toXml(OutputStream outputStream, String encoding): Serializes this document to an OutputStream using the specified encoding.
- void toXml(OutputStream outputStream, Charset charset): Serializes this document to an OutputStream using the specified charset.
- void toXml(StringBuilder sb): Serializes this document to XML, appending to the provided StringBuilder.
- type(): Returns the node type for this document.
- version(): Gets the XML version for this document.
- version(String version): Sets the XML version of this document.
- static Document withDoctype(String version, String encoding, String doctype): Creates a document with XML declaration and DOCTYPE.
- static Document withRootElement(String rootElementName): Creates a document with a root element and XML declaration.
- withXmlDeclaration(): Generates and sets an XML declaration based on current document settings.
- static Document withXmlDeclaration(String version, String encoding): Creates a document with XML declaration.
- static Document withXmlDeclaration(String version, String encoding, boolean standalone): Creates a document with XML declaration and standalone attribute.
- xmlDeclaration(): Gets the XML declaration string for this document.
- xmlDeclaration(String xmlDeclaration): Sets the XML declaration for this document.
Methods inherited from class ContainerNode:
addChild, child, childCount, children, clearChildren, clearModified, findTextNode, firstChild, getNode, hasChildElements, hasTextContent, insertChild, insertChildAfter, insertChildBefore, isEmpty, lastChild, removeChild, replaceChild, textContent
Methods inherited from class Node:
depth, document, isDescendantOf, isModified, markModified, nextSibling, nextSiblingElement, parent, parentElement, precedingWhitespace, precedingWhitespace, previousSibling, previousSiblingElement, siblingIndex, toXml
-
Constructor Details
-
Document
public Document()
Creates a new empty XML document with default settings. Initializes the document with UTF-8 encoding, XML version 1.0, and standalone set to false. The XML declaration and DOCTYPE are initially empty.
-
-
Method Details
-
parent
Sets the parent container node of this node. This method is typically called automatically when nodes are added to containers. Call it manually only with care, so that the tree stays consistent.
-
type
Returns the node type for this document.
- Specified by:
- type in class Node
- Returns:
- Node.NodeType.DOCUMENT
-
xmlDeclaration
Gets the XML declaration string for this document. The XML declaration typically contains version, encoding, and standalone information, formatted as:
<?xml version="1.0" encoding="UTF-8"?>
- Returns:
- the XML declaration string, or empty string if none is set
-
xmlDeclaration
Sets the XML declaration for this document. The XML declaration should be a complete declaration, including the opening <?xml and closing ?> delimiters. Setting this value marks the document as modified. Example:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
- Parameters:
- xmlDeclaration - the XML declaration string, or null to clear it
- Returns:
- this document for method chaining
-
doctype
Gets the DOCTYPE declaration for this document. The DOCTYPE declaration defines the document type and may include references to external DTD files or inline DTD definitions.
- Returns:
- the DOCTYPE declaration string, or empty string if none is set
-
doctype
Sets the DOCTYPE declaration for this document. The DOCTYPE declaration should be a complete declaration, including the opening <!DOCTYPE and closing > delimiters. Setting this value marks the document as modified. Example:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- Parameters:
- doctype - the DOCTYPE declaration string, or null to clear it
- Returns:
- this document for method chaining
-
doctypePrecedingWhitespace
Gets the whitespace before the DOCTYPE declaration. This whitespace appears between the XML declaration and the DOCTYPE declaration. It is preserved during round-trip parsing and serialization to maintain document fidelity.
- Returns:
- the whitespace before the DOCTYPE declaration, or empty string if none
-
root
Gets the root element of this document. The document element is the top-level element that contains all other elements in the document. Every well-formed XML document must have exactly one document element.
- Returns:
- the root element, or null if none is set
-
root
Sets the root element of this document. The document element becomes the top-level element containing all other elements. Setting this value marks the document as modified and establishes the parent-child relationship.
- Parameters:
- root - the element to set as the document root, or null to clear it
- Returns:
- this document for method chaining
-
encoding
Gets the character encoding for this document. The encoding specifies how the document's characters are encoded. Common values include "UTF-8", "UTF-16", and "ISO-8859-1".
- Returns:
- the document encoding, defaults to "UTF-8"
-
encoding
Sets the document's character encoding used for serialization. If encoding is null, the default "UTF-8" is used. This method marks the document as modified.
- Parameters:
- encoding - the character encoding to use, or null to reset to the default "UTF-8"
- Returns:
- this document for method chaining
-
version
Gets the XML version for this document. The XML version indicates which version of the XML specification this document conforms to. Common values are "1.0" and "1.1".
- Returns:
- the XML version, defaults to "1.0"
-
version
-
isStandalone
public boolean isStandalone()
Gets the standalone flag for this document. The standalone flag indicates whether the document is self-contained or depends on external markup declarations. When true, the document declares that it has no external dependencies.
- Returns:
- true if the document is standalone, false otherwise
-
standalone
Sets the standalone flag for this document. Setting this value marks the document as modified. The standalone flag affects the XML declaration output.
- Parameters:
- standalone - true if the document is standalone, false otherwise
- Returns:
- this document for method chaining
-
hasBom
public boolean hasBom()
Returns whether this document had a Byte Order Mark (BOM) when it was parsed. When true, the BOM will be written back when serializing to an OutputStream via toXml(OutputStream), toXml(OutputStream, Charset), or toXml(OutputStream, String). The BOM is never included in the string output from Node.toXml().
- Returns:
- true if the document had a BOM, false otherwise
- Since:
- 1.0.0
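The split between byte output (BOM allowed) and string output (no BOM) can be illustrated with a minimal JDK-only sketch. BomWritingSketch and its methods are hypothetical names invented for this illustration, and the code is not the library's serializer; it only assumes that the BOM is a byte-level prefix chosen by charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; illustrates BOM handling only, not the library's code. */
final class BomWritingSketch {

    private BomWritingSketch() {}

    /** Writes the XML text to the stream, preceded by a BOM when requested. */
    static void write(String xml, OutputStream out, Charset charset, boolean bom) throws IOException {
        if (bom) {
            out.write(bomFor(charset)); // byte-level prefix, never part of the text
        }
        out.write(xml.getBytes(charset));
    }

    /** Returns the BOM bytes for a few common charsets, or an empty array otherwise. */
    static byte[] bomFor(Charset charset) {
        if (StandardCharsets.UTF_8.equals(charset)) {
            return new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
        }
        if (StandardCharsets.UTF_16BE.equals(charset)) {
            return new byte[] {(byte) 0xFE, (byte) 0xFF};
        }
        if (StandardCharsets.UTF_16LE.equals(charset)) {
            return new byte[] {(byte) 0xFF, (byte) 0xFE};
        }
        return new byte[0]; // assumption of this sketch: no BOM for other charsets
    }

    /** Convenience for experiments: returns the bytes that write(...) would emit. */
    static byte[] toBytes(String xml, Charset charset, boolean bom) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            write(xml, buffer, charset, bom);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

In this sketch the string passed in never carries the BOM; only the byte stream does, which mirrors the distinction drawn above between the OutputStream overloads and Node.toXml().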
-
bom
Sets whether a Byte Order Mark (BOM) should be written when serializing to an OutputStream.
- Parameters:
- bom - true to write a BOM, false otherwise
- Returns:
- this document for method chaining
- Since:
- 1.0.0
-
toXml
Serializes this document to XML, appending to the provided StringBuilder. This method preserves the original formatting including XML declaration, DOCTYPE declaration, whitespace, and all child nodes. The output includes:
- XML declaration (if present)
- DOCTYPE declaration (if present)
- Preceding whitespace
- All child nodes (comments, processing instructions, elements)
- Document element (if not already included in children)
- Following whitespace
-
accept
Accepts a visitor for depth-first tree traversal of the entire document. Visits all children of the document (comments, processing instructions, and the root element) in document order.
- Specified by:
- accept in class Node
- Parameters:
- visitor - the visitor to accept
- Returns:
- the action indicating how traversal should proceed
- Throws:
- IllegalArgumentException - if visitor is null
- Since:
- 1.3.0
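Depth-first traversal in document order, with a returned action steering the walk, can be sketched with plain JDK types. Everything named here (VisitorSketch, TreeNode, Visitor, and the Action values CONTINUE and STOP) is hypothetical and chosen for illustration; the actual DomTripVisitor interface and its action type are not shown on this page and may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for a node tree and visitor; not the library's types. */
final class VisitorSketch {

    /** Assumed action values; the real action type may offer others. */
    enum Action { CONTINUE, STOP }

    interface Visitor {
        Action visit(TreeNode node);
    }

    static final class TreeNode {
        final String name;
        final List<TreeNode> children = new ArrayList<>();

        TreeNode(String name, TreeNode... kids) {
            this.name = name;
            for (TreeNode kid : kids) {
                children.add(kid);
            }
        }

        /** Visits this node first, then each child subtree in document order. */
        Action accept(Visitor visitor) {
            if (visitor == null) {
                throw new IllegalArgumentException("visitor must not be null");
            }
            if (visitor.visit(this) == Action.STOP) {
                return Action.STOP;
            }
            for (TreeNode child : children) {
                if (child.accept(visitor) == Action.STOP) {
                    return Action.STOP; // propagate the stop request upward
                }
            }
            return Action.CONTINUE;
        }
    }

    /** Collects node names in the order a depth-first walk reaches them. */
    static List<String> namesInDocumentOrder(TreeNode root) {
        List<String> names = new ArrayList<>();
        root.accept(node -> {
            names.add(node.name);
            return Action.CONTINUE;
        });
        return names;
    }
}
```

Returning the action from accept lets a visitor end the walk early without exceptions, which is one plausible reason for the return value described above.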
-
toXml
Serializes this document to an OutputStream using the document's encoding. This method uses the document's encoding property to determine the character encoding for the output stream. If the document has no encoding specified, UTF-8 is used as the default.
- Parameters:
- outputStream - the OutputStream to write to
- Throws:
- DomTripException - if serialization fails or I/O errors occur
-
toXml
Serializes this document to an OutputStream using the specified charset. This method allows explicit control over the character encoding used for serialization, regardless of the document's encoding property.
- Parameters:
- outputStream - the OutputStream to write to
- charset - the character encoding to use
- Throws:
- DomTripException - if serialization fails or I/O errors occur
-
toXml
Serializes this document to an OutputStream using the specified encoding. This method allows explicit control over the character encoding used for serialization, regardless of the document's encoding property.
- Parameters:
- outputStream - the OutputStream to write to
- encoding - the character encoding name to use
- Throws:
- DomTripException - if serialization fails or I/O errors occur
-
generateXmlDeclaration
Creates a minimal XML declaration based on current document settings. Generates an XML declaration using the current version, encoding, and standalone settings. The declaration follows the standard format:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The standalone attribute is only included if the standalone flag is true.
- Returns:
- a properly formatted XML declaration string
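The formatting rule above (version and encoding always present, standalone only when true) can be expressed as a short JDK-only sketch. XmlDeclarationSketch and generate are hypothetical names, and the null handling is an assumption borrowed from the defaults documented for the factory methods; this is not the library's implementation.

```java
/** Hypothetical helper; illustrates the declaration format, not the library's code. */
final class XmlDeclarationSketch {

    private XmlDeclarationSketch() {}

    /**
     * Builds a declaration from the three settings. Null values fall back to
     * "1.0" and "UTF-8", which is an assumption of this sketch.
     */
    static String generate(String version, String encoding, boolean standalone) {
        String v = (version == null) ? "1.0" : version;
        String e = (encoding == null) ? "UTF-8" : encoding;
        StringBuilder sb = new StringBuilder("<?xml version=\"")
                .append(v)
                .append("\" encoding=\"")
                .append(e)
                .append('"');
        // The standalone attribute is emitted only when the flag is true.
        if (standalone) {
            sb.append(" standalone=\"yes\"");
        }
        return sb.append("?>").toString();
    }
}
```

With these assumptions, generate("1.1", "UTF-8", false) yields a declaration without a standalone attribute, while passing true appends standalone="yes".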
-
toString
-
of
Creates an empty document with default settings.
- Returns:
- a new empty Document
-
of
Creates a document by parsing the provided XML string. This is a convenience method that combines document creation and XML parsing in a single call. It uses the default parser configuration.
- Parameters:
- xml - the XML string to parse
- Returns:
- a new Document containing the parsed XML
- Throws:
- DomTripException - if the XML is malformed or cannot be parsed
-
parseFragment
Parses an XML fragment into a list of nodes. This method parses an XML fragment that may contain multiple root-level elements, comments, processing instructions, and text nodes. Unlike of(String), which expects a well-formed XML document, this method handles fragments that don't have a single root element.
Usage Examples:
// Parse a fragment with multiple elements
List<Node> elements = Document.parseFragment("<foo>bar</foo><bar>baz</bar>");
// Parse a fragment with comments and elements
List<Node> mixed = Document.parseFragment(
    "<!-- comment -->\n<foo>bar</foo>\n<bar>baz</bar>");
- Parameters:
- xml - the XML fragment string to parse
- Returns:
- a list of parsed nodes
- Throws:
- DomTripException - if the XML fragment is malformed
-
of
Creates a document by parsing XML from an InputStream with automatic encoding detection. This method automatically detects the character encoding by:
- Checking for a Byte Order Mark (BOM)
- Reading the XML declaration to extract the encoding attribute
- Falling back to UTF-8 if no encoding is specified
The resulting Document will have its encoding property set to the detected or declared encoding.
- Parameters:
- inputStream - the InputStream containing XML data
- Returns:
- a new Document containing the parsed XML with preserved formatting
- Throws:
- DomTripException - if the XML is malformed, cannot be parsed, or I/O errors occur
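The three detection steps listed above can be sketched with JDK classes only. EncodingDetectionSketch and detect are hypothetical names, and the sketch is deliberately simplified: it recognizes only UTF-8 and UTF-16 BOMs and reads the declaration as single-byte text, so it is an illustration of the order of checks rather than the library's detector.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; illustrates the detection order, not the library's code. */
final class EncodingDetectionSketch {

    // Matches the encoding pseudo-attribute of a declaration at the very start.
    private static final Pattern DECLARED_ENCODING = Pattern.compile(
            "^<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    private EncodingDetectionSketch() {}

    /** Returns the charset suggested by the first bytes, or the fallback. */
    static Charset detect(byte[] head, Charset fallback) {
        // Step 1: a Byte Order Mark identifies the encoding directly.
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
            return StandardCharsets.UTF_8;
        }
        if (startsWith(head, 0xFE, 0xFF)) {
            return StandardCharsets.UTF_16BE;
        }
        if (startsWith(head, 0xFF, 0xFE)) {
            return StandardCharsets.UTF_16LE;
        }
        // Step 2: read the XML declaration. ISO-8859-1 maps each byte to one
        // char, so an ASCII-compatible declaration decodes without errors.
        int length = Math.min(head.length, 256);
        String prefix = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = DECLARED_ENCODING.matcher(prefix);
        if (matcher.find()) {
            try {
                return Charset.forName(matcher.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: use the fallback below.
            }
        }
        // Step 3: nothing detected.
        return fallback;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Passing StandardCharsets.UTF_8 as the fallback corresponds to the automatic variant described above; passing another charset corresponds to the overloads that accept a default.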
-
of
Creates a document by parsing XML from an InputStream with encoding detection and fallback. This method attempts to detect the character encoding by:
- Checking for a Byte Order Mark (BOM)
- Reading the XML declaration to extract the encoding attribute
- Using the provided default charset if detection fails
The resulting Document will have its encoding property set to the detected, declared, or default encoding.
- Parameters:
- inputStream - the InputStream containing XML data
- defaultCharset - the charset to use if detection fails
- Returns:
- a new Document containing the parsed XML with preserved formatting
- Throws:
- DomTripException - if the XML is malformed, cannot be parsed, or I/O errors occur
-
of
Creates a document by parsing XML from an InputStream with encoding detection and fallback. This method attempts to detect the character encoding by:
- Checking for a Byte Order Mark (BOM)
- Reading the XML declaration to extract the encoding attribute
- Using the provided default encoding if detection fails
The resulting Document will have its encoding property set to the detected, declared, or default encoding.
- Parameters:
- inputStream - the InputStream containing XML data
- defaultEncoding - the encoding name to use if detection fails
- Returns:
- a new Document containing the parsed XML with preserved formatting
- Throws:
- DomTripException - if the XML is malformed, cannot be parsed, or I/O errors occur
-
of
Creates a document by parsing XML from a file path with automatic encoding detection. This is a convenience method that combines file reading and XML parsing in a single call. It leverages the InputStream-based parsing with automatic encoding detection to properly handle various character encodings.
The method automatically detects the character encoding by:
- Checking for a Byte Order Mark (BOM)
- Reading the XML declaration to extract the encoding attribute
- Falling back to UTF-8 if no encoding is specified
This method provides the most robust way to parse XML files as it properly handles character encoding detection and avoids potential encoding issues.
Usage Examples:
// Parse XML file with automatic encoding detection
Document doc = Document.of(Paths.get("config.xml"));
// Works with various encodings
Document utf8Doc = Document.of(Paths.get("utf8-file.xml"));
Document utf16Doc = Document.of(Paths.get("utf16-file.xml"));
Document isoDoc = Document.of(Paths.get("iso-8859-1-file.xml"));
// Catch DomTripException to handle unreadable or malformed files
try {
    Document config = Document.of(configPath);
    Editor editor = new Editor(config);
    // ... edit document
} catch (DomTripException e) {
    System.err.println("Failed to parse XML: " + e.getMessage());
}
- Parameters:
- path - the path to the XML file to parse
- Returns:
- a new Document containing the parsed XML with preserved formatting
- Throws:
- DomTripException - if the file cannot be read, the XML is malformed, or cannot be parsed
-
copy
Creates a deep copy of this node. The copied node will have:
- All properties copied from the original
- All child nodes recursively copied (for container nodes)
- Whitespace and formatting properties preserved
- No parent (parent is set to null)
The copied node and its descendants will have their parent-child relationships properly established within the copied subtree.
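The copy semantics listed above (recursive child copies, a detached root, and parent links rebuilt inside the copy) can be sketched on a tiny stand-in tree. CopySketchNode is a hypothetical class written for this illustration; it is not the library's Node and carries only a name, a parent, and children.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a tree node; not the library's Node class. */
final class CopySketchNode {
    final String name;
    CopySketchNode parent;
    final List<CopySketchNode> children = new ArrayList<>();

    CopySketchNode(String name) {
        this.name = name;
    }

    /** Appends a child and sets its parent link; returns this for chaining. */
    CopySketchNode add(CopySketchNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    /** Deep copy: same properties, recursively copied children, no parent. */
    CopySketchNode copy() {
        CopySketchNode duplicate = new CopySketchNode(name); // parent stays null
        for (CopySketchNode child : children) {
            // add(...) points each copied child at the copied parent,
            // so the new subtree is internally consistent.
            duplicate.add(child.copy());
        }
        return duplicate;
    }
}
```

Because the copy never touches the original's parent or children, editing the duplicate leaves the source tree unchanged.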
-
clone
Deprecated. Use copy() instead.
Creates a deep copy of this document.
-
withXmlDeclaration
Creates a document with XML declaration. Creates a document with the specified version and encoding, automatically generating an appropriate XML declaration.
- Parameters:
- version - the XML version (e.g., "1.0", "1.1"), or null for default "1.0"
- encoding - the character encoding (e.g., "UTF-8"), or null for default "UTF-8"
- Returns:
- a new Document with XML declaration
-
withXmlDeclaration
Creates a document with XML declaration and standalone attribute. Creates a document with the specified version, encoding, and standalone flag, automatically generating an appropriate XML declaration.
- Parameters:
- version - the XML version, or null for default "1.0"
- encoding - the character encoding, or null for default "UTF-8"
- standalone - true if the document is standalone, false otherwise
- Returns:
- a new Document with XML declaration and standalone attribute
-
withRootElement
Creates a document with a root element and XML declaration. Creates a complete document with XML declaration (version 1.0, UTF-8 encoding) and the specified root element.
- Parameters:
- rootElementName - the name of the root element
- Returns:
- a new Document with XML declaration and root element
- Throws:
- DomTripException
-
withDoctype
Creates a document with XML declaration and DOCTYPE. Creates a document with the specified version, encoding, and DOCTYPE declaration, automatically generating an appropriate XML declaration.
- Parameters:
- version - the XML version, or null for default "1.0"
- encoding - the character encoding, or null for default "UTF-8"
- doctype - the DOCTYPE declaration string
- Returns:
- a new Document with XML declaration and DOCTYPE
-
minimal
Creates a minimal document with just a root element (no XML declaration). Creates a simple document containing only the specified root element, without any XML declaration or DOCTYPE.
- Parameters:
- rootElementName - the name of the root element
- Returns:
- a new minimal Document with only a root element
- Throws:
- DomTripException
-
withXmlDeclaration
Generates and sets an XML declaration based on current document settings. The XML declaration will include the version, encoding, and standalone flag (if true) based on the current document configuration.
- Returns:
- this document for method chaining