SERP (MVP) #62

Merged
merged 20 commits into from
Sep 17, 2024
Conversation

Gallaecio
Contributor

@Gallaecio Gallaecio commented Aug 27, 2024

Pagination is currently hard-coded to 10, which I believe is not always accurate. However, the next iteration of SERP extraction will include a nextPage entry, so I think it is best to start with this simple approach and switch to a nextPage-based implementation later.
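For illustration, the fixed-step approach can be sketched as follows. This is a minimal sketch and not the code from this PR: it assumes that the hard-coded 10 is the number of results per page and that the search engine accepts a result offset in a query parameter, here hypothetically named `start`. The function name `page_urls` is also made up for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Assumption: the hard-coded value is the number of results per page.
RESULTS_PER_PAGE = 10


def page_urls(first_page_url: str, max_pages: int, offset_param: str = "start") -> list[str]:
    """Build the URLs of the first max_pages result pages.

    offset_param is a hypothetical query parameter that carries the
    result offset; the real parameter depends on the search engine.
    """
    parts = urlsplit(first_page_url)
    query = parse_qs(parts.query, keep_blank_values=True)
    urls = []
    for page in range(max_pages):
        page_query = dict(query)
        offset = page * RESULTS_PER_PAGE
        if offset:
            page_query[offset_param] = [str(offset)]
        else:
            # The first page needs no offset at all.
            page_query.pop(offset_param, None)
        new_query = urlencode(page_query, doseq=True)
        urls.append(urlunsplit(parts._replace(query=new_query)))
    return urls


if __name__ == "__main__":
    for url in page_urls("https://example.com/search?q=foo", 3):
        print(url)
```

A nextPage-based implementation would not need the constant: it would follow whatever next-page URL the extracted result provides, which avoids wrong offsets when a page holds more or fewer than 10 results.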

This implementation does not rely on scrapy-poet because scrapy-zyte-api does not yet support serp properly. To be changed once we add proper support for it there (working on it).

To do:

@Gallaecio Gallaecio requested review from kmike and wRAR August 27, 2024 10:10
CHANGES.rst (resolved)
@kmike
Contributor

kmike commented Sep 10, 2024

With the serp item released, what are the obstacles to adding serp support to scrapy-zyte-api? (Anyway, let's handle it separately; it's not blocking anything.)

@Gallaecio
Contributor Author

I’ve just opened scrapy-plugins/scrapy-zyte-api#218 from local changes I had started. There is no blocker. The change is neither trivial nor very complex, but the special behavior of serp compared to other data types complicates things more than usual, and there are other Zyte API features not yet covered in scrapy-zyte-api that we should also cover while adding serp support.

@Gallaecio
Contributor Author

@BurnzZ Any quick thoughts on the crawl logging changes?

zyte_spider_templates/params.py (outdated, resolved)
zyte_spider_templates/spiders/serp.py (resolved)
zyte_spider_templates/spiders/serp.py (outdated, resolved)
zyte_spider_templates/spiders/serp.py (outdated, resolved)
zyte_spider_templates/spiders/serp.py (outdated, resolved)
Member

@BurnzZ BurnzZ left a comment

Hi @Gallaecio, I just tested and reviewed the changes pertaining to the crawling logs. All looks well. 👍

Left a few minor comments as well. Aside from that, LGTM!

@Gallaecio
Contributor Author

Release notes added; I’ll handle the release as soon as we merge.

@Gallaecio Gallaecio merged commit 426c0c7 into zytedata:main Sep 17, 2024
10 checks passed

4 participants