Open
Hey @technosophos,
Before going into my issue, I just wanted to say I love your work on QueryPath!
As for the issue, I was wondering if you have any advice on what I could be doing wrong, and why QueryPath seems to ignore the fact that a string is valid UTF-8.
<?php
// Parse the HTML using QueryPath
$qp_options = array(
  'convert_from_encoding' => 'UTF-8',
  'convert_to_encoding' => 'UTF-8',
  'strip_low_ascii' => FALSE,
);
// Taxonomy
$this->qp = htmlqp($dbRow->BreadCrumbHTML, NULL, $qp_options);
$taxonomy = $this->qp->top()->find('ul li:last')->text();
Where the content of $dbRow->BreadCrumbHTML is:
<ul><li style="display:inline;"><a href="/fr/index.html">Accueil</a></li> > <li><a href="/fr/roads_trans/index.html">Routes et transports</a></li> > <li>Vélo</li></ul>
and the string I get returned for $taxonomy is:
"Vélo"
If I don't use QueryPath and just get the whole text, the UTF-8 is maintained. I also checked in Xdebug (PHP 5.3.9) that mb_convert_encoding is being called, and the string is still valid UTF-8 at that point. Would you have any sage advice on particular routes for debugging this further?
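One route I could try to narrow this down is a minimal sketch that takes QueryPath out of the picture. It assumes the markup ends up in PHP's DOMDocument::loadHTML() (an assumption on my part, I have not confirmed this in the QueryPath source) and compares the same fragment parsed with and without an explicit charset hint. The helper name lastLiText() and the shortened fragment are hypothetical, made up for this test.

```php
<?php
// Hypothetical isolation test: plain DOM only, no QueryPath involved.
// "\xC3\xA9" is the UTF-8 byte sequence for "é", so the test does not
// depend on the encoding of this source file.
$html = "<ul><li>Accueil</li> > <li>V\xC3\xA9lo</li></ul>";

// Hypothetical helper: parse a fragment and return the text of its last <li>.
function lastLiText($markup) {
  libxml_use_internal_errors(TRUE); // keep parser warnings out of the output
  $doc = new DOMDocument();
  $doc->loadHTML($markup);
  libxml_clear_errors();
  $items = $doc->getElementsByTagName('li');
  return $items->item($items->length - 1)->textContent;
}

// Case 1: no charset information, the parser has to guess the encoding.
$plain = lastLiText($html);

// Case 2: same fragment with an explicit charset declaration in front of it.
$meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
$hinted = lastLiText($meta . $html);

// Compare raw bytes rather than rendered text, so the terminal or browser
// encoding cannot hide the difference.
echo "plain:  ", bin2hex($plain), "\n";
echo "hinted: ", bin2hex($hinted), "\n"; // 56c3a96c6f, i.e. intact UTF-8 "Vélo"
```

If the two hex dumps differ, the bytes are probably being reinterpreted during HTML parsing rather than by mb_convert_encoding, which would point the debugging at how the document's encoding is declared to the parser.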