
Job resource (items/requests/logs) iterator may return more entries than specified via --tail #374

Open

Description

Version Affected

2871384 (current HEAD of master)

Observations

In order to monitor a long-running job I did something like this:

$ watch -n 60 "TZ=UTC date --iso-8601=sec | tr '\n' ' ' | tee -a test_output && shub items <scrapy_cloud_job_id> -n 1 | jq '.some_field' | tee -a test_output"

The command is intended to fetch one item per minute.
It has been running for about an hour, and the log shows that a single shub items -n 1 invocation once returned ~180 items.

Analysis

In shub.utils.job_resource_iter, the tail-related logic works as follows:

  1. Fetch resource.stats() and retrieve the total number of entries within.
  2. Calculate the corresponding index to start with: last_item = total_nr_items - tail - 1
  3. Fetch the resource starting from the pre-calculated index: resource_iter(startafter=last_item_key)

However, new entries may be added between steps 1 and 3, and any such entries are also returned (see the sketch below).
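
To make the race concrete, here is a minimal sketch of the logic described above; it is not the actual shub source, and the stats() fields, the key format, and the iter() signature are assumptions for illustration only:

```python
# Minimal sketch of the tail logic described above -- NOT the actual shub code.
# The stats() fields, the "<job_key>/<index>" key format, and the iter()
# signature are assumptions for illustration.
def tail_iter(resource, job_key, tail):
    # Step 1: read the current total number of entries.
    total_nr_items = resource.stats()['totals']['input_values']
    # Step 2: compute the key of the entry to start after.
    last_item = total_nr_items - tail - 1
    last_item_key = '{}/{}'.format(job_key, last_item)
    # Step 3: iterate from that key onwards. Any entry written between
    # step 1 and step 3 also sorts after last_item_key, so it is returned
    # as well -- which would explain the ~180 items seen for --tail 1.
    return resource.iter(startafter=last_item_key)
```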

Proposal

A count parameter could be added to the resource_iter call, e.g. resource_iter(startafter=last_item_key, count=tail).

It is assumed to be acceptable to return at most N entries when the user passes --tail N without --follow.
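
A sketch of the proposed change, under the same illustrative assumptions as above, plus the assumption that the underlying iterator accepts a count argument:

```python
# Sketch of the proposal -- same assumptions as the sketch above, plus that
# the underlying iterator accepts a count argument.
def tail_iter(resource, job_key, tail):
    total_nr_items = resource.stats()['totals']['input_values']
    last_item = total_nr_items - tail - 1
    last_item_key = '{}/{}'.format(job_key, last_item)
    # Capping the result at `tail` entries bounds the output even if new
    # entries keep arriving between the stats() call and the iteration.
    return resource.iter(startafter=last_item_key, count=tail)
```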
