[feature request] ignore nested fields  #89

@elacuesta

As discussed with @fcanobrash, I'm only opening this here so we can keep track of it.

The AUTOUNIT_DONT_TEST_OUTPUT_FIELDS setting can only ignore top-level item fields; it cannot target fields nested inside them. We thought about using jmespath, but that only provides a way to access data, not to modify it. My current workaround is the following monkeypatch:

import operator
from contextlib import suppress
from functools import reduce

import scrapy_autounit
from scrapy_autounit.cassette import Cassette
from scrapy_autounit.player import Player


class IgnoreNestedFieldsPlayer(Player):
    """Patched player that allows nested fields to be ignored,
    given as dotted paths such as "metadata.found_date".
    """

    @classmethod
    def from_fixture(cls, path):
        """This override is only needed while https://github.com/scrapinghub/scrapy-autounit/pull/88 is not merged"""
        cassette = Cassette.from_fixture(path)
        return cls(cassette)

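    def _filter_output_fields(self, item):
        dont_test = self.spider.settings.get("AUTOUNIT_DONT_TEST_OUTPUT_FIELDS", [])
        if not dont_test:
            dont_test = self.spider.settings.get("AUTOUNIT_SKIPPED_FIELDS", [])
        for entry in dont_test:
            *first_keys, last_key = entry.split(".")
            # Resolve every entry from the top-level item. Rebinding `item`
            # here would make later entries start from the nested dict.
            with suppress(KeyError):
                # With no parent keys, reduce() returns the item itself.
                parent = reduce(operator.getitem, first_keys, item)
                # Skipped when a parent key is missing, so nothing is
                # popped from the wrong level.
                parent.pop(last_key, None)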


scrapy_autounit.player.Player = IgnoreNestedFieldsPlayer

After this I'm able to do the following in settings.py:

AUTOUNIT_DONT_TEST_OUTPUT_FIELDS = ["metadata.found_date", "metadata.updated_date"]
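
For anyone who wants to try the dotted-path handling outside of Scrapy, here is a minimal sketch on plain dicts. The helper name `drop_nested_fields` and the sample item are hypothetical and not part of scrapy-autounit. The sketch also assumes every intermediate value along a path is a dict-like object.

```python
import operator
from contextlib import suppress
from functools import reduce


def drop_nested_fields(item, dotted_paths):
    """Remove each dotted path (e.g. "metadata.found_date") from item in place.

    Hypothetical helper for illustration only; not part of scrapy-autounit.
    Assumes every intermediate value along a path is dict-like.
    """
    for path in dotted_paths:
        *parent_keys, last_key = path.split(".")
        # A missing parent key raises KeyError, which skips this path.
        with suppress(KeyError):
            # Always start from the top-level item for each path.
            parent = reduce(operator.getitem, parent_keys, item)
            parent.pop(last_key, None)
    return item


item = {
    "title": "example",
    "metadata": {
        "found_date": "2020-01-01",
        "updated_date": "2020-01-02",
        "source": "x",
    },
}
drop_nested_fields(
    item,
    ["metadata.found_date", "metadata.updated_date", "missing.key", "title"],
)
print(item)  # {'metadata': {'source': 'x'}}
```

Paths whose parent does not exist ("missing.key" here) are skipped silently, and a path without a dot ("title") behaves like the existing top-level setting.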
