PHP 8.4 introduced the Dom\HTMLDocument class for parsing HTML documents. It lets developers navigate, modify, and extract information using familiar DOM methods, which makes it useful for tasks such as web scraping, content analysis, or transforming markup.
The following example parses an HTML string, selects every element that has the class name, and prints the text of the first match.
<?php
use Dom\HTMLDocument;
$html = '<main>
<div><label>Name</label><span class="name">John</span></div>
<div><label>Name</label><span class="name">Patrick</span></div>
</main>';
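// LIBXML_NOERROR silences the warnings the parser would raise for markup errors.
$dom = HTMLDocument::createFromString($html, LIBXML_NOERROR);
// querySelectorAll() takes a CSS selector; ".name" matches class="name".
$nodes = $dom->querySelectorAll('.name');
echo $nodes[0]->textContent; // John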
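The snippet above prints only the first match. To read every match, the result can be iterated with foreach. This is a minimal sketch that assumes the same markup; the $names array is introduced here only for illustration.

```php
<?php
use Dom\HTMLDocument;

$html = '<main>
<div><label>Name</label><span class="name">John</span></div>
<div><label>Name</label><span class="name">Patrick</span></div>
</main>';

$dom = HTMLDocument::createFromString($html, LIBXML_NOERROR);

// Collect the text of every element matched by the CSS selector.
$names = [];
foreach ($dom->querySelectorAll('.name') as $node) {
    $names[] = $node->textContent;
}

echo implode(', ', $names), PHP_EOL; // John, Patrick
```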
Since PHP 8.5, the Dom\HTMLDocument class offers a dedicated getElementsByClassName method, so elements can be retrieved by class name directly, without writing a CSS selector for a more general query method.
<?php
use Dom\HTMLDocument;
$html = '<main>
<div><label>Name</label><span class="name">John</span></div>
<div><label>Name</label><span class="name">Patrick</span></div>
</main>';
$dom = HTMLDocument::createFromString($html, LIBXML_NOERROR);
// Pass the bare class name; no leading dot as in a CSS selector.
$nodes = $dom->getElementsByClassName('name');
echo $nodes[0]->textContent; // John
This provides a cleaner, more readable approach for targeting class-based elements, especially when working with structured HTML data.
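Code that must also run on a PHP version without the dedicated method could check for it at runtime and fall back to the CSS selector. The sketch below is one possible way to do that: textsByClass() is a hypothetical helper, not part of PHP, and it assumes $class is a single simple class name such as "name".

```php
<?php
use Dom\HTMLDocument;

// Hypothetical helper (not part of PHP): returns the text of every element
// carrying $class. Assumes $class is one simple class name such as "name".
function textsByClass(HTMLDocument $dom, string $class): array
{
    // Use the dedicated method where the runtime offers it; otherwise
    // fall back to the equivalent CSS class selector.
    $nodes = method_exists($dom, 'getElementsByClassName')
        ? $dom->getElementsByClassName($class)
        : $dom->querySelectorAll('.' . $class);

    $texts = [];
    foreach ($nodes as $node) {
        $texts[] = $node->textContent;
    }
    return $texts;
}

$html = '<main>
<div><label>Name</label><span class="name">John</span></div>
<div><label>Name</label><span class="name">Patrick</span></div>
</main>';

$dom = HTMLDocument::createFromString($html, LIBXML_NOERROR);

echo implode(', ', textsByClass($dom, 'name')), PHP_EOL; // John, Patrick
```

Both branches return an iterable list of elements, so the loop does not need to know which one was taken.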