-
Notifications
You must be signed in to change notification settings - Fork 53
Loan statistics #1256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Loan statistics #1256
Conversation
|
|
||
| if creation_date and start_date: | ||
| waiting_time = (start_date - creation_date).days | ||
| stats["waiting_time"] = waiting_time if waiting_time >= 0 else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it better to return 0 than None?
If we want to distinguish self-checkout from normal loans, this won't be enough (since a normal loan can be retrieved on the same day).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wanted to make sure there are no negative values, which would skew the statistics. Like when doing a sum over the waiting time and there is one loan where the start_date was manually set before the creation_date I don't think it should influence the waiting time statistic. But it might be good to have a discussion about how to handle such cases?
Regarding distinguishing self-checkout and normal loans: one can filter with the q parameter for the delivery option.
| description="Each group_by item must be a dict with 'field' key" | ||
| ) | ||
|
|
||
| _validate_os_field_name(grouping["field"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to have an allow list instead of a regexp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would surely be possible, but might limit in how the endpoint can be used and it requires more work from our side to keep track of all possible fields.
The histogram will be implemented for multiple record types and adding+maintaining an allow list might be tedious.
Also the endpoint is behind the backoffice permission, so I do not think it is too critical?
| actions = [action for action in self.actionsiter()] | ||
| res = search.helpers.bulk(self.client, actions, stats_only=True, chunk_size=50) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you maybe clarify what this part does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment. Hope that makes it more clear
5a988e2 to
6bce595
Compare
|
|
||
| # Document availability during loan request | ||
| stat_events_index_name = "events-stats-loan-transitions" | ||
| if current_search_client.indices.exists(index=stat_events_index_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to fail rather than silently not index things if this conditional statatement is not fulfiled. ping @ntarocco WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for context: the check was added because if there is no event at the moment a loan is indexed, there will be no events index and the loan indexer will throw an error in the lines below the if, leading to not/wrongly indexed loans.
I could manually create the index in our systems during deploy and add it to the release notes that the stats index needs to be created for the system to work. But then the setup script should also create the index for new systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, if it is required before running this code (weird, shouldn't it be created automatically if not existing?), then I would fail instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a warning when this happens but still would not fail. While the index is required for adding a specific stat field (namely available_items_during_request) to the loans, the stat is completely optional and I would not let the whole indexing of the loan fail because of it.
Also, the required index events-stats-loan-transitions is handled by invenio-stats. It is created up to 30 minutes after a loan goes through a transition for the first time. (30 minutes because the indexing of the events for invenio-stats is handled by a celery beat task)
Making it fail would also crash all tests and make new systems fail for at least the first 30 minutes.
If we really want the loan indexer to fail here I see only the following solution to execute for all tests and new systems:
- Manually create the index in
invenio-app-ils. I think this is a bad idea because the index is created and managed byinvenio-stats. - Creating a loan, sending it through a transition and forcing
invenio-statsto index the event. This feels super hacky.
Also note:
If this happens to a loan and it skips this stat, and an event for this loan later gets indexed, the loan will also be reindexed and collect the missing information.
| try: | ||
| parsed_args = schema.load(request.args.to_dict()) | ||
| except ValidationError as e: | ||
| raise InvalidParameterError(description=e.messages) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this returning json error in REST?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is. InvalidParameterException inherits from class RESTException.
* the endpoint returns a histogram for loans where requested metrics are grouped by and aggregated * extend the loans with a new stats object * the stats object contains `waiting_time`, `loan_duration` and `available_items_during_request`
6bce595 to
b82b815
Compare
…not exist * previously the loan indexing failed silently if the index `events-stats-loan-transitions` did not exist. * This only happens when no loan transition event was indexed yet.
ab2ced6 to
5c9856b
Compare
waiting_time,loan_durationandavailable_items_during_request(closes: Library Statistics - 2. & 3. - Average duration of loan & Availability of requested documents CERNDocumentServer/cds-ils#1029)tests are failing because of dependency on this PR