Skip to content

Drop support for ?different query param #1232

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

Mr0grog
Copy link
Member

@Mr0grog Mr0grog commented Apr 4, 2025

This covers part 1 of #1208 and generally removes support for the different query param, which doesn't behave like you would probably expect and regularly causes problems. It was designed in a world where we expected to have such frequent and continuous captures of pages that any time we had a "different" version, we knew it had just changed. The reality is that you can never get captures frequent enough to do that even in practical terms, and in theoretical terms it's just wrong. We've long since switched to always querying in a way that removes checking the different field, and it's past time to remove it from the API.

This leaves it in place in ONE spot of the API, which does not return an actual list of versions and thus does not have the same problems: GET /pages?capture_time=abc...def implicitly checks the different column because the goal is to tell you what pages had observed changes in that time period. We could back off that idea and just have it check whether there were any captures, but I think the valuable use cases for that are pretty rare. But maybe I’ll wind up revisiting that.

This also drops Version.invalid_changes, which I noticed is:

  1. Not (no longer?) definitionally sensible, and
  2. Not used anywhere.

Mr0grog added 2 commits April 3, 2025 18:34
This covers part 1 of #1208 and generally removes support for the `different` query param, which doesn't behave like you would probably expect and regularly causes problems. It was designed in a world where we expected to have such frequent and continuous captures of pages that any time we had a "different" version, we knew it had *just* changed. The reality is that you can never realistically get captures frequent enough to do that even in practical terms, and in theoretical terms it's just wrong. We've long since switched to always querying in a way that removes checking the `different` field, and it's time to just remove it from the API.

This leaves it in place in ONE spot of the API, which does not return an actual list of versions and thus does not have the same problems: `GET /pages?capture_time=abc...def` implicitly checks the `different` column because the goal is to tell you what pages had observed changes in that time period. We could back off that idea and just have it check whether there were any captures, but I think the valuable use cases for that are pretty rare (especially if we are mainly focused on pages where `active=true`, and we expect that every page should have captures at roughly the same times).
The definition of this relation no longer really makes sense, and also isn't used.
Copy link
Member Author

@Mr0grog Mr0grog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looks good with fresh eyes today, and the messy migration of version headers is done now. Should be good to merge!

@Mr0grog Mr0grog merged commit e111d81 into main Apr 4, 2025
5 checks passed
@Mr0grog Mr0grog deleted the 1208-different-is-just-too-different-from-what-you-expect branch April 4, 2025 23:01
@github-project-automation github-project-automation bot moved this from Inbox to Done in Web Monitoring Apr 4, 2025
github-actions bot pushed a commit that referenced this pull request Apr 4, 2025
Mr0grog added a commit to edgi-govdata-archiving/web-monitoring-ops that referenced this pull request Apr 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

1 participant