As of PHP 8.4, the new DOM API offers standards-compliant support for parsing HTML5 documents, resolves several longstanding compliance issues in DOM functionality, and adds convenience methods such as querySelector and querySelectorAll to simplify working with documents.
With the old DOMDocument class, working with HTML documents involved a combination of DOM and XPath operations. For example, to extract elements with specific attributes, you would first load the HTML into a DOMDocument instance using the loadHTML method, then create a DOMXPath object to run queries against the DOM tree.
<?php
$html = '<main>
<div><label>Name</label><span class="name">John</span></div>
<div><label>Name</label><span class="name">Patrick</span></div>
</main>';
// Parse the markup with the legacy parser; LIBXML_NOERROR suppresses parser warnings.
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_NOERROR);
// A separate DOMXPath object is needed to query the tree.
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//span[@class="name"]');
echo $nodes[0]->nodeValue; // John
With the new Dom\HTMLDocument class introduced in PHP 8.4, working with HTML documents is more streamlined. Its HTML5-compliant parser handles modern markup the way the HTML standard specifies. The static createFromString method loads HTML content directly from a string, while the built-in querySelectorAll method selects elements using CSS selectors. For common lookups like this one, that removes the need for verbose XPath expressions and makes the code more readable and maintainable.
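<?php
use Dom\HTMLDocument;
$html = '<main>
<div><label>Name</label><span class="name">John</span></div>
<div><label>Name</label><span class="name">Patrick</span></div>
</main>';
// Parse the string with the HTML5 parser; LIBXML_NOERROR suppresses parse error reporting.
$dom = HTMLDocument::createFromString($html, LIBXML_NOERROR);
// Select by CSS selector directly on the document; no XPath object is required.
$nodes = $dom->querySelectorAll('.name');
echo $nodes[0]->textContent; // John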
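The example above reads only the first match. As a minimal sketch of two related patterns, the snippet below loops over every element returned by querySelectorAll and uses querySelector to fetch a single element. The `.email` selector and the variable names are made up for illustration and are not part of the original example.

```php
<?php
use Dom\HTMLDocument;

$html = '<main>
<div><label>Name</label><span class="name">John</span></div>
<div><label>Name</label><span class="name">Patrick</span></div>
</main>';

$dom = HTMLDocument::createFromString($html, LIBXML_NOERROR);

// querySelectorAll() returns a list that can be iterated with foreach.
$names = [];
foreach ($dom->querySelectorAll('main span.name') as $span) {
    $names[] = $span->textContent;
}
echo implode(', ', $names), PHP_EOL; // John, Patrick

// querySelector() returns the first matching element, or null if nothing matches.
$first = $dom->querySelector('.name');
$missing = $dom->querySelector('.email'); // hypothetical selector with no match

echo $first?->textContent, PHP_EOL; // John
var_dump($missing === null);        // bool(true)
```

Because querySelector can return null, the nullsafe operator (`?->`) is a convenient guard when the element may be absent.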