Fix or rewrite scraper engine and scrapers #259

Open
@dag7dev

Description

Many years have passed since the scraper engine's first release, and both the engine and the individual scrapers need fixing or a rewrite.

Known Demozoo scraper bugs:

  • it always selects the first secondary mirror, assuming that it comes from scene.org. This assumption is wrong: the mirror's origin should be checked explicitly
  • it misses screenshots: it always downloads only the first one
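
A minimal sketch of how both Demozoo issues could be approached: pick the mirror by hostname instead of by position, and collect every screenshot instead of the first. The input shapes are assumptions (a list of dicts with a `url` key for download links, and a list of dicts with an `original_url` key for screenshots), and the function names `pick_scene_org_link` and `collect_screenshot_urls` are hypothetical, not part of the current engine.

```python
from urllib.parse import urlparse


def pick_scene_org_link(download_links):
    """Return the first link hosted on scene.org, or None if there is none.

    Assumes each entry is a dict with a "url" key (hypothetical shape).
    """
    for link in download_links or []:
        url = link.get("url", "")
        host = urlparse(url).hostname or ""
        # Match scene.org itself and any subdomain such as files.scene.org.
        if host == "scene.org" or host.endswith(".scene.org"):
            return url
    return None


def collect_screenshot_urls(screenshots):
    """Return the URLs of all screenshots, not just the first one.

    Assumes each entry is a dict with an "original_url" key (hypothetical shape).
    """
    return [s["original_url"] for s in screenshots or [] if s.get("original_url")]
```

With this approach a mirror list whose first entry is not on scene.org no longer yields a wrong link, and a missing match is reported as `None` so the caller can fall back to another source.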

Known scrape engine bugs:

  • sometimes it sets "screenshots" to None: it should set that field to an empty list if no screenshots are detected
  • it doesn't handle extensions other than gb well: sometimes the manifest says "gbc" but the engine has already renamed the file to gb. This should be fixed
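
The two engine bugs above could be handled along these lines. This is only a sketch under assumptions: the helper names `normalize_screenshots` and `target_filename` are hypothetical, and the set of accepted extensions is a guess that would need to match what the project actually supports.

```python
from pathlib import Path

# Assumed set of ROM extensions; adjust to whatever the project supports.
ALLOWED_EXTENSIONS = (".gb", ".gbc")


def normalize_screenshots(value):
    """Always return a list, so the manifest never contains None."""
    return list(value) if value else []


def target_filename(slug, source_filename, allowed=ALLOWED_EXTENSIONS):
    """Build the stored file name, keeping the source file's extension.

    Preserving the extension (instead of forcing ".gb") keeps the file on
    disk consistent with the name written to the manifest.
    """
    suffix = Path(source_filename).suffix.lower()
    if suffix not in allowed:
        raise ValueError(f"unsupported extension: {suffix!r}")
    return slug + suffix
```

Deriving both the on-disk name and the manifest entry from the same function is one way to keep them from drifting apart.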

Other improvements:

  • make the general logic more flexible (e.g. select the best source, support other extensions)
  • write a basic test suite
  • test the other scrapers from scratch, since changes in the master scraper may have introduced bugs in them
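
As a starting point for "select the best source" and a basic test suite, something like the following could work. The priority order, the candidate shape (dicts with `source` and `url` keys) and the name `select_best_source` are all assumptions made for illustration.

```python
import unittest

# Hypothetical priority order: earlier entries are preferred.
SOURCE_PRIORITY = ["scene.org", "demozoo", "other"]


def select_best_source(candidates, priority=SOURCE_PRIORITY):
    """Return the candidate whose "source" ranks highest, or None if empty.

    Sources not listed in `priority` rank last; ties keep input order.
    """
    def rank(candidate):
        try:
            return priority.index(candidate["source"])
        except ValueError:
            return len(priority)

    return min(candidates, key=rank) if candidates else None


class SelectBestSourceTest(unittest.TestCase):
    def test_prefers_higher_priority_source(self):
        candidates = [
            {"source": "other", "url": "https://example.com/a.zip"},
            {"source": "scene.org", "url": "https://files.scene.org/a.zip"},
        ]
        self.assertEqual(select_best_source(candidates)["source"], "scene.org")

    def test_unknown_source_ranks_last(self):
        candidates = [
            {"source": "unknown", "url": "https://example.com/a.zip"},
            {"source": "demozoo", "url": "https://example.com/b.zip"},
        ]
        self.assertEqual(select_best_source(candidates)["source"], "demozoo")

    def test_empty_input_returns_none(self):
        self.assertIsNone(select_best_source([]))
```

Saved as a module, the tests can be run with `python -m unittest`.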
