Support for pagination in get_revisions #905

@Aamir377300

Description

get_revisions in perceval/backends/core/mediawiki.py doesn't paginate yet. If a page has a long edit history, it only pulls the first 500 revisions and stops, because it never requests the next batch once the API's per-request limit is reached.

There’s actually a TODO right in the code that points this out:

TODO: Iterate if more than self.max reviews (500)

The goal is to:

  • Use the continue parameter to loop through the API responses until the full history is fetched.
  • Keep the existing last_date filtering intact so we don't over-fetch.
  • Ensure it stays compatible with the current request flow.
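
To make the goals above concrete, here is a rough sketch of the loop I have in mind. It is not the actual Perceval code: iter_revisions and the fetch callable are hypothetical names standing in for the backend's existing request helper, and I'm assuming the current call already uses rvdir=newer with rvstart for the last_date filter. The "continue" block in the response and the rvcontinue token are the standard MediaWiki query continuation mechanism.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_revisions(fetch: Callable[[Dict[str, Any]], Dict[str, Any]],
                   pageid: int,
                   last_date: Optional[str] = None,
                   limit: int = 500) -> Iterator[Dict[str, Any]]:
    """Yield every revision of a page, following API continuation.

    `fetch` is a hypothetical stand-in for the backend's request helper:
    it takes the query parameters and returns the decoded JSON response.
    """
    params: Dict[str, Any] = {
        "action": "query",
        "prop": "revisions",
        "pageids": pageid,
        "rvdir": "newer",
        "rvlimit": limit,
        "format": "json",
    }
    if last_date:
        # Keep the existing date filter so older revisions are not re-fetched.
        params["rvstart"] = last_date

    cont: Dict[str, Any] = {}
    while True:
        # Merge the continuation values from the previous response, if any.
        data = fetch({**params, **cont})
        pages = data.get("query", {}).get("pages", {})
        for page in pages.values():
            yield from page.get("revisions", [])
        # The API omits "continue" once the last batch has been returned.
        cont = data.get("continue") or {}
        if not cont:
            break
```

Since it is a generator, a caller that currently consumes one batch of revisions could iterate over it the same way, and the rvstart filter is sent with every request, so the last_date behaviour shouldn't change.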

I think this would be a big plus for anyone using Perceval on high-activity wikis, where the current partial datasets can lead to inaccurate analysis.

@sduenas please assign me this issue.
