Description
Currently there are three ways to add an ItemLoader processor:
- The default_input/output_processor on the ItemLoader class
- The field_name_in/out on the ItemLoader class
- The input/output_processor on the scrapy.Field
Personally, I use the input/output_processor on the scrapy.Field together with the default_input/output_processor a lot. Often I just want to add one more processor after the default processors. Since input/output_processor on scrapy.Field overrides the defaults, this is quite hard to do.
So I propose adding another way to specify input/output processors: something like add_input/output on the scrapy.Field, which would append the specified processor to the default processor instead of replacing it.
I implemented this on my own ItemLoader class, but I think it would be useful for the Scrapy core. My implementation is as follows (original source: https://github.com/scrapy/scrapy/blob/master/scrapy/loader/__init__.py#L69). Of course, this can be added to get_output_processor in the same way.
def get_input_processor(self, field_name):
    proc = getattr(self, '%s_in' % field_name, None)
    if not proc:
        override_proc = self._get_item_field_attr(field_name, 'input_processor')
        extend_proc = self._get_item_field_attr(field_name, 'add_input')
        if override_proc and extend_proc:
            raise ValueError(f'Not allowed to define input_processor and add_input to {field_name}')
        if override_proc:
            return override_proc
        elif extend_proc:
            return Compose(self.default_input_processor, extend_proc)
        return self.default_input_processor
    return proc
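To make the intended precedence easier to discuss, here is a minimal standalone sketch of the same lookup logic without any Scrapy dependency. Everything in it is an assumption for illustration: `compose` is a simplified stand-in for Scrapy's `Compose` (it does not reproduce the real class's handling of `None` values or loader context), `resolve_input_processor` is a hypothetical helper rather than an existing API, and the field metadata is modelled as a plain dict.

```python
def compose(*functions):
    """Chain callables left to right (simplified stand-in for Scrapy's Compose)."""
    def chained(value):
        for function in functions:
            value = function(value)
        return value
    return chained


def resolve_input_processor(default_proc, field_meta, loader_proc=None):
    """Pick the input processor for one field.

    Precedence, mirroring the proposal:
    1. a <field_name>_in processor declared on the loader,
    2. 'input_processor' in the field metadata (replaces the default),
    3. 'add_input' in the field metadata (runs after the default),
    4. the loader's default input processor.
    """
    if loader_proc is not None:
        return loader_proc
    override_proc = field_meta.get('input_processor')
    extend_proc = field_meta.get('add_input')
    if override_proc and extend_proc:
        raise ValueError('Not allowed to define both input_processor and add_input')
    if override_proc:
        return override_proc
    if extend_proc:
        return compose(default_proc, extend_proc)
    return default_proc


def strip_all(values):
    """Example default processor: strip whitespace from every value."""
    return [value.strip() for value in values]


def upper_all(values):
    """Example extra processor: upper-case every value."""
    return [value.upper() for value in values]


# add_input keeps the default and runs the extra processor afterwards.
extended = resolve_input_processor(strip_all, {'add_input': upper_all})
# input_processor replaces the default entirely.
overridden = resolve_input_processor(strip_all, {'input_processor': upper_all})

print(extended([' a ', ' b']))  # ['A', 'B']
print(overridden([' a ']))      # [' A ']
```

The two printed results show the difference the proposal is about: with add_input the default stripping still happens, while input_processor skips it.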
I am not sure whether add_input is a good name; extend_input_processor is probably clearer, but it is quite a long name. I would like to hear whether more people want this feature and what you think the naming should be.