docs: align enqueueLinks upgrade guide with unified include/exclude API
Update the v4 upgrade guide and sitemap example to reflect the
globs/regexps/pseudoUrls -> include collapse, removal of PseudoUrl and
per-pattern request options, and corrected transformRequestFunction
precedence.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
### `globs`, `regexps`, and `pseudoUrls` replaced by `include`
+
+To align with the Crawlee for Python API, the separate `globs`, `regexps`, and `pseudoUrls` URL-filtering options of `enqueueLinks()`, the click-elements enqueue helpers, and `SitemapRequestLoader` have been collapsed into a single `include` option (mirroring the already-unified `exclude` option). Each entry of `include`/`exclude` can be a glob string, a `RegExp`, or a `{ glob }` / `{ regexp }` object.
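To illustrate the unified pattern shape, the TypeScript sketch below models how a list of `include`/`exclude` entries could be normalized and applied to URLs. It is not Crawlee's implementation: `UrlPattern`, `globToRegExp`, `toRegExp`, and `filterUrls` are hypothetical names, the glob handling is deliberately simplified to `*` and `**`, and treating an empty `include` list as "match everything" is an assumption.

```typescript
// Hypothetical model of the four entry shapes named above; not Crawlee's own types.
type UrlPattern = string | RegExp | { glob: string } | { regexp: RegExp };

// Simplified glob support (assumption): `**` matches any characters,
// `*` matches any run of characters that does not contain `/`.
function globToRegExp(glob: string): RegExp {
    const source = glob
        .split('**')
        .map((part) =>
            part
                .split('*')
                // Escape regex metacharacters in the literal pieces.
                .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
                .join('[^/]*'),
        )
        .join('.*');
    return new RegExp(`^${source}$`);
}

// Normalize any of the four entry shapes to a RegExp.
function toRegExp(pattern: UrlPattern): RegExp {
    if (typeof pattern === 'string') return globToRegExp(pattern);
    if (pattern instanceof RegExp) return pattern;
    return 'glob' in pattern ? globToRegExp(pattern.glob) : pattern.regexp;
}

// Keep URLs that match at least one `include` entry and no `exclude` entry.
function filterUrls(urls: string[], include: UrlPattern[], exclude: UrlPattern[] = []): string[] {
    const included = include.map(toRegExp);
    const excluded = exclude.map(toRegExp);
    return urls.filter(
        (url) =>
            (included.length === 0 || included.some((re) => re.test(url))) &&
            !excluded.some((re) => re.test(url)),
    );
}

const candidateUrls = [
    'https://example.com/docs/intro',
    'https://example.com/blog/post',
    'https://example.com/docs/private/key',
];

// Mixes a glob string, a `{ regexp }` object, and a `{ glob }` object.
const keptUrls = filterUrls(
    candidateUrls,
    ['https://example.com/docs/**', { regexp: /\/blog\// }],
    [{ glob: '**/private/**' }],
);
```

In this sketch `keptUrls` holds the first two URLs; the third matches an `include` entry but is dropped by the `exclude` glob.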
+
+The `PseudoUrl` class is no longer exported and the `@apify/pseudo_url` dependency has been dropped. Rewrite any pseudo-URL patterns as globs or regular expressions.
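For codebases with many existing pseudo-URLs, a small helper can do the rewrite mechanically. The sketch below assumes the usual pseudo-URL convention, in which text inside square brackets is a regular expression and everything outside is literal, and that matching is anchored and case-insensitive. `pseudoUrlToRegExp` is a hypothetical name and only approximates what the removed class did.

```typescript
// Hypothetical migration helper: turns a pseudo-URL string into a RegExp.
// Assumptions: `[...]` sections hold raw regex, everything else is literal,
// and the whole URL must match (case-insensitively).
function pseudoUrlToRegExp(pseudoUrl: string): RegExp {
    const input = pseudoUrl.trim();
    let source = '^';
    let depth = 0; // nesting depth of square brackets

    for (let i = 0; i < input.length; i++) {
        const ch = input.charAt(i);
        if (ch === '[' && ++depth === 1) {
            // Start of a `[regex]` section: wrap it in a group.
            source += '(';
        } else if (ch === ']' && depth > 0 && --depth === 0) {
            // End of the outermost `[regex]` section.
            source += ')';
        } else if (depth > 0) {
            // Inside a regex section: copy verbatim (including nested brackets).
            source += ch;
        } else {
            // Outside a regex section: escape regex metacharacters.
            source += ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        }
    }

    return new RegExp(`${source}$`, 'i');
}

// The resulting RegExp can be used as an `include` entry.
const pagesPattern = pseudoUrlToRegExp('https://example.com/pages/[(\\w|-)+]');
```

Simple pseudo-URLs such as `https://example.com/pages/[.*]` are often easier to rewrite by hand as a glob (`https://example.com/pages/**`).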
+
+Per-pattern request options (`label`, `userData`, `method`, `payload`, `headers` set directly on a pattern object) are no longer supported. Use the top-level `label` / `userData` options, or `transformRequestFunction`, to set request options for the enqueued requests.
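One way to migrate pattern-specific labels is to move the mapping into a function that derives the label from the request URL. The sketch below shows that idea with a trimmed-down, hypothetical `RequestOptionsSketch` type; `labelRules` and `labelByUrl` are illustrative names, and the example URLs are made up.

```typescript
// Hypothetical, trimmed-down stand-in for Crawlee's `RequestOptions`.
interface RequestOptionsSketch {
    url: string;
    label?: string;
    userData?: Record<string, unknown>;
}

// Before (no longer supported): options attached to the pattern itself, e.g.
//   { glob: 'https://example.com/products/**', label: 'PRODUCT' }
// After: patterns only filter; options are derived from the URL instead.
const labelRules: Array<{ test: RegExp; label: string }> = [
    { test: /\/products\//, label: 'PRODUCT' },
    { test: /\/categories\//, label: 'CATEGORY' },
];

function labelByUrl(request: RequestOptionsSketch): RequestOptionsSketch {
    for (const rule of labelRules) {
        if (rule.test.test(request.url)) {
            // First matching rule wins and overrides any existing label.
            return { ...request, label: rule.label };
        }
    }
    // No rule matched: keep whatever label was already set.
    return request;
}
```

A function shaped like `labelByUrl` is intended to be passed as `transformRequestFunction`, with the URL patterns themselves listed in `include`.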
## `transformRequestFunction` precedence in `enqueueLinks`
-The `transformRequestFunction` callback in `enqueueLinks` now runs **after** URL pattern filtering (`globs`, `regexps`, `pseudoUrls`) instead of before. This means it has the highest priority and can overwrite any request options set by patterns or the global `label` option.
+The `transformRequestFunction` callback in `enqueueLinks` now runs **after** URL pattern filtering (`include`, `exclude`) instead of before. This means it has the highest priority and can overwrite any request options set by the global `label` / `userData` options.
The priority order is now (lowest to highest):
1. Global `label` / `userData` options
-2. Pattern-specific options from `globs`, `regexps`, or `pseudoUrls` objects
-3. `transformRequestFunction`
+2. `transformRequestFunction`
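The new two-level priority order can be modeled in a few lines. The sketch below is only an illustration of the precedence rule, not Crawlee's code: `RequestOptionsSketch`, `GlobalOptions`, and `buildRequest` are hypothetical names, and the transform is assumed to return the (possibly modified) options object.

```typescript
// Hypothetical, trimmed-down stand-in for Crawlee's `RequestOptions`.
interface RequestOptionsSketch {
    url: string;
    label?: string;
    userData?: Record<string, unknown>;
}

interface GlobalOptions {
    label?: string;
    userData?: Record<string, unknown>;
}

type Transform = (request: RequestOptionsSketch) => RequestOptionsSketch;

// Apply options from lowest to highest priority for a URL that has
// already passed include/exclude filtering.
function buildRequest(url: string, globals: GlobalOptions, transform?: Transform): RequestOptionsSketch {
    // 1. Global `label` / `userData` options form the baseline.
    const base: RequestOptionsSketch = {
        url,
        label: globals.label,
        userData: { ...globals.userData },
    };
    // 2. The transform runs last, so whatever it returns wins.
    return transform ? transform(base) : base;
}

const overridden = buildRequest(
    'https://example.com/item/1',
    { label: 'DEFAULT', userData: { source: 'listing' } },
    (request) => ({ ...request, label: 'DETAIL' }),
);
```

Here `overridden.label` is `'DETAIL'` because the transform runs last, while `overridden.userData` still carries the global `source` value that the transform left untouched.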
The `transformRequestFunction` callback receives a `RequestOptions` object and can return either: