README.md: 35 additions & 8 deletions
@@ -26,13 +26,35 @@ See [tests/README.md](https://github.com/duzun/hQuery.php/blob/master/tests/READ
 - PHP 5.3+
 - No dependencies
 
+## Requirements
+
+- PHP 5.3 or newer (PHP 7.4+ recommended)
+- The `mbstring` extension is recommended for reliable charset handling and conversions
+- Ensure a sufficient `memory_limit` when working with very large documents
+
 ## 🛠 Install
 
-Just add this folder to your project and `include_once 'hquery.php';` and you are ready to `hQuery`.
+Add the library to your project and include it, or install via Composer/npm.
+
+Using Composer (recommended):
+
+```sh
+composer require duzun/hquery
+```
 
-Alternatively `composer require duzun/hquery`
+Or include manually:
 
-or using `npm install hquery.php`, `require_once 'node_modules/hquery.php/hquery.php';`.
+```php
+include_once '/path/to/hquery.php/hquery.php';
+```
+
+Or via npm:
+
+```sh
+npm install hquery.php
+```
+
+Then require the file from `node_modules`, e.g. `require_once 'node_modules/hquery.php/hquery.php';`.
 
 ## ⚙ Usage
 
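The requirements added in the hunk above can be checked at runtime before the library is loaded. The following is a minimal sketch in plain PHP, not part of hQuery: the helper name `hquery_check_requirements` is hypothetical, and it only reports the `memory_limit` setting rather than judging whether it is sufficient, since the README gives no threshold.

```php
<?php
// Hypothetical helper (not part of hQuery): reports whether the current
// environment meets the requirements listed in the README.
function hquery_check_requirements()
{
    return array(
        // PHP 5.3 is the stated minimum version
        'php_ok'          => version_compare(PHP_VERSION, '5.3.0', '>='),
        // PHP 7.4+ is recommended, not required
        'php_recommended' => version_compare(PHP_VERSION, '7.4.0', '>='),
        // mbstring is recommended for charset handling and conversions
        'mbstring'        => extension_loaded('mbstring'),
        // Raw setting as configured, e.g. "128M" or "-1" (unlimited)
        'memory_limit'    => ini_get('memory_limit'),
    );
}

$report = hquery_check_requirements();

if (!$report['php_ok']) {
    echo "PHP 5.3 or newer is required\n";
}
if (!$report['mbstring']) {
    echo "mbstring is not loaded; charset conversion may be less reliable\n";
}
```

For very large documents, `memory_limit` can be raised for a single script with `ini_set('memory_limit', ...)`, provided the hosting configuration allows it.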
@@ -160,19 +182,19 @@ $titles = array();
 if ( $banners ) {
 
     // Iterate over the result
-    foreach($banners as $pos => $a) {
+    foreach($banners as $id => $a) {
         // $a->href property is the resolved $a->attr('href') relative to the
         // document's <base href=...>, if present, or $doc->baseURL.
-        $links[$pos] = $a->href; // get absolute URL from href property
-        $titles[$pos] = trim($a->text()); // strip all HTML tags and leave just text
+        $links[$id] = $a->href; // get absolute URL from href property
+        $titles[$id] = trim($a->text()); // strip all HTML tags and leave just text
 
         // Filter the result
         if ( !$a->hasClass('logo') ) {
             // $a->style property is the parsed $a->attr('style'), same as $a->attr('style', true)
             if ( strtolower($a->style['position']) == 'fixed' ) continue;
 
             $img = $a->find('img')[0]; // ArrayAccess
-            if ( $img ) $images[$pos] = $img->src; // short for $img->attr('src', true)
+            if ( $img ) $images[$id] = $img->src; // short for $img->attr('src', true)
         }
     }
 
@@ -201,7 +223,12 @@ $requestUri = $doc->href;
 $baseURL = $doc->baseURL;
 ```
 
-Note: In case the charset meta attribute has a wrong value or the internal conversion fails for any other reason, `hQuery` would ignore the error and continue processing with the original HTML, but would register an error message on `$doc->html_errors['convert_encoding']`.
+Charset and positions:
+
+- The document is converted internally to UTF-8 for parsing.
+- Element positions (the numeric offsets used internally and returned by APIs that expose byte offsets) are byte offsets into the internal UTF-8 string, not character offsets.
+
+Note: If the charset meta attribute has a wrong value or the internal conversion fails for any other reason, `hQuery` will continue processing with the original HTML and register an error message on `$doc->html_errors['convert_encoding']`.
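The charset notes in this hunk have two practical consequences: byte offsets differ from character offsets as soon as the UTF-8 text contains multi-byte characters, and a failed conversion is reported only through `$doc->html_errors['convert_encoding']`. The sketch below illustrates both in plain PHP and assumes the `mbstring` extension is loaded. It makes no hQuery calls: the helper `encoding_error` is hypothetical, and `$stub` is a stand-in object that only mimics the `html_errors` property named in the README.

```php
<?php
// Byte offsets vs. character offsets in a UTF-8 string.
$html = "<p>\xC3\xA9t\xC3\xA9</p><a href=\"/x\">link</a>"; // "<p>été</p><a ...>"

$byteOffset = strpos($html, '<a');                 // counts bytes
$charOffset = mb_strpos($html, '<a', 0, 'UTF-8');  // counts characters

// Each "é" takes two bytes in UTF-8, so the offsets differ by 2 here.
printf("byte offset: %d, char offset: %d\n", $byteOffset, $charOffset);
// prints: byte offset: 12, char offset: 10

// A byte offset must be used with byte-oriented functions such as substr().
$tail = substr($html, $byteOffset);

// Hypothetical helper: returns the message registered under
// html_errors['convert_encoding'], or null when no error was recorded.
function encoding_error($doc)
{
    return isset($doc->html_errors['convert_encoding'])
        ? $doc->html_errors['convert_encoding']
        : null;
}

// Stand-in for a parsed document (assumption: html_errors is readable
// as an array property, as the README note suggests).
$stub = new stdClass();
$stub->html_errors = array('convert_encoding' => 'example conversion failure');

$message = encoding_error($stub);
if ($message !== null) {
    echo "Charset conversion failed: {$message}\n";
}
```

Checking the error entry after parsing is a cheap way to detect that offsets and text may refer to the original, unconverted HTML.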