Configurable immediate redirection #433

dgoiko · 2020-01-25T20:09:17Z

WebURL and WebCrawler now support individual URLs being followed right away, even if they were already visited. These URLs are not scheduled; they are processed directly.

I needed to implement this because I found a site that used a common URL for redirections and based the content on its internal session, or on something I couldn't figure out.

Even when I managed to schedule visited URLs again, after scheduling they all showed the same content: the one referenced by the last "previous page" visited. Once the crawler was allowed to visit sites immediately, the problem was solved.

Since this can produce unwanted infinite redirection loops, there is a maximum automatic redirection depth that can be configured on WebURLs: maxInmediateRedirects.

By default, this behaviour is disabled. The creator of the WebURL is responsible for enabling it on a per-URL basis.
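To illustrate the idea, here is a minimal, self-contained sketch of a depth-limited immediate redirect. It is not the code from this PR and does not use the real crawler4j classes: `SketchUrl`, `followImmediately`, `process` and the in-memory redirect map are all hypothetical names invented for the example. Only the field name `maxInmediateRedirects` is taken from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ImmediateRedirectSketch {

    /** Hypothetical stand-in for a WebURL; not the real crawler4j class. */
    static final class SketchUrl {
        final String url;
        final boolean followImmediately;   // hypothetical per-URL switch
        final int maxInmediateRedirects;   // name taken from the PR description

        /** Default: immediate redirection is disabled. */
        SketchUrl(String url) {
            this(url, false, 0);
        }

        SketchUrl(String url, boolean followImmediately, int maxInmediateRedirects) {
            this.url = url;
            this.followImmediately = followImmediately;
            this.maxInmediateRedirects = maxInmediateRedirects;
        }
    }

    /**
     * Processes the start URL and, if enabled, follows redirects right away
     * instead of scheduling them. Already-seen URLs are followed again, so the
     * depth limit is the only thing that stops a redirect loop.
     * The redirects map (URL to redirect target) simulates the remote site.
     */
    static List<String> process(SketchUrl start, Map<String, String> redirects) {
        List<String> processed = new ArrayList<>();
        String current = start.url;
        processed.add(current);

        int hops = 0;
        while (start.followImmediately
                && hops < start.maxInmediateRedirects
                && redirects.containsKey(current)) {
            current = redirects.get(current);
            processed.add(current);
            hops++;
        }
        return processed;
    }

    public static void main(String[] args) {
        // Two pages that redirect to each other: an infinite loop without a limit.
        Map<String, String> redirects = new HashMap<>();
        redirects.put("/a", "/b");
        redirects.put("/b", "/a");

        System.out.println(process(new SketchUrl("/a"), redirects));
        System.out.println(process(new SketchUrl("/a", true, 3), redirects));
    }
}
```

With the default constructor only the start URL is processed. With the switch enabled and a limit of 3, the loop between `/a` and `/b` is cut off after three hops, which is the role the description assigns to maxInmediateRedirects.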


dgoiko added 2 commits January 25, 2020 21:05

Configurable immediate redirection

c825511

WebURL and WebCrawler now support individual URLs being followed right away, even if they were already visited. These URLs are not scheduled; they are processed directly.

Style fixes

ef25813
