Tokenize PHP Source Code using PhpToken in PHP 8.0

The token_get_all function parses PHP source code and returns an array of tokens. Each token is either a single-character string or an array containing the token ID, the token text, and the line number.

<?php

$tokens = token_get_all('<?php echo "Hello world"; ?>');

foreach ($tokens as $token) {
    if (is_array($token)) {
        [$tokenId, $tokenText, $line] = $token;
        echo 'Line '.$line.': '.token_name($tokenId).' ('.$tokenText.')'.PHP_EOL;
    } else {
        echo $token.PHP_EOL;
    }
}

The example will output:

Line 1: T_OPEN_TAG (<?php )
Line 1: T_ECHO (echo)
Line 1: T_WHITESPACE ( )
Line 1: T_CONSTANT_ENCAPSED_STRING ("Hello world")
;
Line 1: T_WHITESPACE ( )
Line 1: T_CLOSE_TAG (?>)
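
Because token_get_all mixes plain strings and arrays, calling code often converts both into one shape first. The following is a minimal sketch of that idea, not a standard API: normalizeToken is a hypothetical helper name, and the line number for single-character tokens is derived by counting newlines in the preceding tokens.

```php
<?php

// Hypothetical helper: turns any token_get_all() entry into [name, text, line].
function normalizeToken($token, int $currentLine): array
{
    if (is_array($token)) {
        [$tokenId, $tokenText, $line] = $token;

        return [token_name($tokenId), $tokenText, $line];
    }

    // Single-character tokens carry no line number, so use the tracked one.
    return [$token, $token, $currentLine];
}

$currentLine = 1;
$normalized = [];

foreach (token_get_all('<?php echo "Hello world"; ?>') as $token) {
    $entry = normalizeToken($token, $currentLine);
    // Advance the tracked line past any newlines inside this token's text.
    $currentLine = $entry[2] + substr_count($entry[1], "\n");
    $normalized[] = $entry;
}

foreach ($normalized as [$name, $text, $line]) {
    echo 'Line '.$line.': '.$name.' ('.$text.')'.PHP_EOL;
}
```

With this in place, the loop body no longer needs an is_array check for every token.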

Since PHP 8.0, we can use the PhpToken class and its static tokenize method as an alternative to the token_get_all function. The tokenize method returns an array of PhpToken objects. This approach uses less memory and makes the code more readable.

<?php

$tokens = PhpToken::tokenize('<?php echo "Hello world"; ?>');

foreach ($tokens as $token) {
    echo 'Line '.$token->line.': '.$token->getTokenName().' ('.$token->text.')'.PHP_EOL;
}

The example will output:

Line 1: T_OPEN_TAG (<?php )
Line 1: T_ECHO (echo)
Line 1: T_WHITESPACE ( )
Line 1: T_CONSTANT_ENCAPSED_STRING ("Hello world")
Line 1: ; (;)
Line 1: T_WHITESPACE ( )
Line 1: T_CLOSE_TAG (?>)
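
PhpToken objects also provide the is and isIgnorable methods, which help when filtering tokens. The sketch below is illustrative only: the sample source string and the variable names are assumptions chosen for this example. It skips the tokens the PHP parser ignores (whitespace, comments, and the open tag) and checks the kind of each remaining token.

```php
<?php

$code = '<?php /* greeting */ echo "Hello world"; ?>';

$significant = [];

foreach (PhpToken::tokenize($code) as $token) {
    // Skip whitespace, comments and the open tag.
    if ($token->isIgnorable()) {
        continue;
    }

    $significant[] = $token->getTokenName();

    // is() accepts a token ID, a token text, or an array of either.
    if ($token->is([T_ECHO, T_PRINT])) {
        echo 'Output statement found on line '.$token->line.PHP_EOL;
    } elseif ($token->is(';')) {
        echo 'Statement ends at position '.$token->pos.PHP_EOL;
    }
}

echo implode(', ', $significant).PHP_EOL;
```

Here the pos property holds the token's starting offset within the source string, which token_get_all does not report.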
