-
Notifications
You must be signed in to change notification settings - Fork 164
Improve pagination handling in Confluence API client #3321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
buildkite test this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting these! Would you be willing to add some unit tests as well, to exercise your changes and demonstrate the situations where you hit them?
except Exception as exception: | ||
self._logger.warning( | ||
f"Skipping data for type {url_name} from {url}. Exception: {exception}." | ||
f"Skipping data for type {url_name} from {base_url}. Exception: {exception}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd actually still want to log out url
, since it would let us know which "page" of results had an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only reason I switched this to base_url is because it's possible that url does not exist if try step fails (or at least so I think it could happen)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Ok, hadn't caught this, but I think url
needs to be defined outside of the loop to start. That way we can reference it here, and we also can be sure that it's not re-set on each iteration (see my most recent comment)
Also, the linter is failing. You can autoformat your code with |
@WildDogOne when is the |
@artem-shelkovnikov In the connector when you use an advanced filter in sync rules. For Example:
This will then need to use /search because there is a CQL Query to be run on ingest, which makes more sense compared to using the post filter |
Co-authored-by: Sean Story <[email protected]>
buildkite test this |
while True: | ||
try: | ||
url = f"{base_url}&start={start}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line probably needs to be outside of the while True
, otherwise it'll overwrite the last iteration's attempts to set the URL to the "next" link, if that strategy was used.
Closes #3320
The code is here to enhance the pagination handling in case the /rest/content/search API is used since there is no next link in that api.
I would be happy if someone could check if my logic is somewhat acceptable.
There is also a trade off right now since there seems to be no direct tracking of which data has already been processed in the script itself, so every incremental sync will pull all data again, but of course then only process the new files.
This means, if someone set's a filter which is extremely general, this could result in a huge amount of API calls
Checklists
Pre-Review Checklist
config.yml.example
)v7.13.2
,v7.14.0
,v8.0.0
)