Open
Description
https://github.com/scrapinghub/frontera/blob/master/frontera/core/manager.py
I am using the 0.8.1 code base in LOCAL_MODE.
A KeyError is thrown when execution reaches to_fetch in the StatesContext class:
from line 801:
    class StatesContext(object):
        ...
        def to_fetch(self, requests):
            requests = requests if isinstance(requests, Iterable) else [requests]
            for request in requests:
                fingerprint = request.meta[b'fingerprint']  # KeyError is raised here
I think the reason is that meta[b'fingerprint'] is read before it has been set:
from line 302:
    class LocalFrontierManager(BaseContext, StrategyComponentsPipelineMixin, BaseManager):
        def page_crawled(self, response):
            ...
            self.states_context.to_fetch(response)  # b'fingerprint' is read here
            self.states_context.fetch()
            self.states_context.states.set_states(response)
            super(LocalFrontierManager, self).page_crawled(response)  # but it is only set in this call
            self.states_context.states.update_cache(response)
from line 233:
    class BaseManager(object):
        def page_crawled(self, response):
            ...
            # b'fingerprint' is set when the response goes through the pipeline here
            self._process_components(method_name='page_crawled',
                                     obj=response,
                                     return_classes=self.response_model)
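To make the ordering problem easier to see outside Frontera, here is a minimal sketch that reproduces it with stand-in objects. `FakeResponse`, `add_fingerprint`, `to_fetch`, and the two `page_crawled_*` functions are hypothetical names written for this illustration, not Frontera APIs; only the `meta[b'fingerprint']` lookup mirrors the code quoted above.

```python
import hashlib


class FakeResponse:
    """Hypothetical stand-in for a response; only url and meta are modelled."""

    def __init__(self, url):
        self.url = url
        self.meta = {}


def add_fingerprint(obj):
    """Stand-in for the pipeline step that sets meta[b'fingerprint']."""
    digest = hashlib.sha1(obj.url.encode('utf8')).hexdigest()
    obj.meta[b'fingerprint'] = digest.encode('ascii')
    return obj


def to_fetch(fingerprints, obj):
    """Mirrors the lookup in the quoted to_fetch: fails if the key is missing."""
    fingerprint = obj.meta[b'fingerprint']
    fingerprints[fingerprint] = obj


def page_crawled_states_first(obj):
    """Order described in this report: states lookup first, pipeline second."""
    fingerprints = {}
    to_fetch(fingerprints, obj)   # raises KeyError: b'fingerprint' not set yet
    add_fingerprint(obj)
    return fingerprints


def page_crawled_pipeline_first(obj):
    """Same steps with the fingerprint set before the lookup."""
    fingerprints = {}
    add_fingerprint(obj)
    to_fetch(fingerprints, obj)
    return fingerprints
```

Calling `page_crawled_states_first(FakeResponse('http://example.com/'))` raises `KeyError`, while `page_crawled_pipeline_first` does not. This only illustrates the ordering; whether the real manager can safely be reordered is a question for the maintainers.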
My current workaround is to add a check to the to_fetch method of the StatesContext class:
    def to_fetch(self, requests):
        requests = requests if isinstance(requests, Iterable) else [requests]
        for request in requests:
            if b'fingerprint' not in request.meta:
                request.meta[b'fingerprint'] = sha1(request.url)
            fingerprint = request.meta[b'fingerprint']
            self._fingerprints[fingerprint] = request
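For reference, a self-contained sketch of the same guard using only the standard library. `ensure_fingerprint` and `SimpleRequest` are hypothetical names for this example, and it assumes the fingerprint is the SHA-1 hex digest of the URL stored as bytes. If the fingerprint step in the pipeline computes the value differently, a fallback like this would produce keys that do not match, so it should use the same function the pipeline uses.

```python
import hashlib


class SimpleRequest:
    """Hypothetical stand-in for a request; only url and meta are modelled."""

    def __init__(self, url, meta=None):
        self.url = url
        self.meta = {} if meta is None else meta


def ensure_fingerprint(request):
    """Set meta[b'fingerprint'] only when it is missing, then return it.

    Assumption: the fingerprint is the SHA-1 hex digest of the URL, as bytes.
    hashlib.sha1 needs bytes input, so the URL is encoded first.
    """
    if b'fingerprint' not in request.meta:
        digest = hashlib.sha1(request.url.encode('utf8')).hexdigest()
        request.meta[b'fingerprint'] = digest.encode('ascii')
    return request.meta[b'fingerprint']


def to_fetch(fingerprints, requests):
    """Guarded version of the lookup, written as a plain function."""
    if isinstance(requests, SimpleRequest):
        requests = [requests]
    for request in requests:
        fingerprints[ensure_fingerprint(request)] = request
    return fingerprints
```

An existing fingerprint is never overwritten, so requests that have already been through the pipeline keep the value they were given.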
What is the correct way to fix this?
yujiaao