Link Discovery given a known source #72

@marianelamin

Problem

At some point we will want to scrape data of interest given only the base domain or URL of a source.

Proposal

This issue covers the design of a component that finds the full URL of a page of interest within a website and returns it, so that it can be used to scrape more data.
In short: given the base URL, return a full-path URL that takes us directly to a page containing interesting data. For example:

elpitazo.net
    |
    v
| full path retrieval | -> https://elpitazo.net/los-llanos/el-gas-domestico-en-acarigua-araure-cuesta-entre-10-y-20-dolares/
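
As a starting point for discussion, the "full path retrieval" step could be sketched as below. This is only a minimal sketch, not a settled design: the names `discover_links` and `normalize_base`, the `min_segments` parameter, and the rule "a path with at least two segments is a content page" are all assumptions made for illustration. Fetching the home page HTML is left out on purpose, so the function takes the HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _AnchorCollector(HTMLParser):
    """Collects the raw href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def normalize_base(base):
    """Accept a bare domain such as 'elpitazo.net' and return an https URL."""
    if "://" not in base:
        base = "https://" + base
    return base.rstrip("/") + "/"


def discover_links(base, html, min_segments=2):
    """Return same-site URLs found in `html` that look like content pages.

    Assumption (hypothetical heuristic): a path with at least
    `min_segments` segments, e.g. /section/article-slug/, is a content page.
    """
    base_url = normalize_base(base)
    site = urlparse(base_url).netloc
    parser = _AnchorCollector()
    parser.feed(html)

    found = []
    for href in parser.hrefs:
        url = urljoin(base_url, href)  # resolve relative links against the base
        parts = urlparse(url)
        # Skip mailto:, javascript:, and links that leave the source's domain.
        if parts.scheme not in ("http", "https") or parts.netloc != site:
            continue
        segments = [s for s in parts.path.split("/") if s]
        if len(segments) >= min_segments and url not in found:
            found.append(url)
    return found
```

Called with `"elpitazo.net"` and the HTML of the home page, this returns the absolute URLs of same-domain links whose paths have two or more segments, in the order they appear. The domain check is an exact match, so links on a `www.` subdomain would be skipped unless the base is given with it; whether that is desirable is one of the open design questions.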

Sub-Objectives

TBD

Metadata

Labels: documentation, help wanted