Skip to content

add 'notes' to Request#97

Open
BurnzZ wants to merge 2 commits intomainfrom
request-notes
Open

add 'notes' to Request#97
BurnzZ wants to merge 2 commits intomainfrom
request-notes

Conversation

@BurnzZ
Copy link
Contributor

@BurnzZ BurnzZ commented Jun 20, 2024

A proposal to add some way of putting some kind of information to the Request.

One use case is adding some identifiers to a given request. For example, in https://github.com/zytedata/zyte-spider-templates/blob/main/zyte_spider_templates/pages/product_navigation_heuristics.py#L72, a current workaround would be to add some kind of url prefix to request.name.

It's not pretty but it does the job. However, it'd be hard to properly identify some information here.

It'd be nice if we have some dedicated field to put such arbitrary values. This would be quite useful when developers create page objects with some added notes to the Requests.

@BurnzZ BurnzZ requested review from Gallaecio, kmike and wRAR June 20, 2024 12:11
@Gallaecio
Copy link
Contributor

Gallaecio commented Jun 20, 2024

Something like Request.meta: Dict[str, Any]? 🙂

I wonder if, to make it be in line with the rest of the schema, it should be called metadata or additionalMetadata, and be in the same format as additionalAttributes.

@BurnzZ
Copy link
Contributor Author

BurnzZ commented Jun 21, 2024

Something like Request.meta: Dict[str, Any]? 🙂

I initially implemented it using a dict but quickly realized I don't have something to put in as values. 😅 However, I do see the point in having dict here since:

  • Faster to find something because of the keys.
  • More convenient to structure the data. For example, {"id": 312, "type": "heuristic", "type": "productNavigation"}

Updated this in a92ca86.

I wonder if, to make it be in line with the rest of the schema, it should be called metadata or additionalMetadata, and be in the same format as additionalAttributes.

Not quite sure yet with this one. I think I'm slightly leaning towards it not having to conform with the rest of the schema just to give the idea that values from this field doesn't come from Zyte API Extraction. I was thinking of calling it user_notes but that's too long.

@wRAR
Copy link
Member

wRAR commented Jun 21, 2024

I like the dict type better, I'm not sure if I like the "notes" name but I don't think "*metadata" is better.

@kmike
Copy link
Contributor

kmike commented Jun 21, 2024

+1 to using dicts.

I think it looks more like metadata, not "notes", so +1 to something along these lines.

ProbabilityRequest already have metadata attribute though. Having both "meta" and "metadata" sounds weird. For me it feels more natural to put metadata into the existing metadata attribute.

But how to do it? I think we can consider extending BaseMetadata to include this dict; it'd be something like request.metadata.non_standard or request.metadata.custom. This is a big change, because this "non_standard" metadata is going to be available for all other item types as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants