fitsdiff and pytest assertion rewriting leads to large memory increases when tests fail #10489

@braingram

Description

assert diff.identical, diff.report()

is a common pattern for regression tests. By default pytest rewrites this assert to capture additional information about the failure (beyond the AssertionError and its message). This includes a reference to the diff instance, and that reference lives for the duration of the test session. Since diff holds references to the input files and other objects (this is not specific to stfitsdiff; it also applies to astropy fitsdiff):
if len(self.a) != len(self.b):

this likely contributed to the recent termination of regtest runs where:

  • many failures were introduced by a context change
  • these failures caused pytest to hold onto large amounts of memory
  • memory usage exceeded that available on the worker node
  • the worker node was terminated before information about the failures was reported

There are a few options for addressing this including:

  • disabling pytest assertion rewriting. This would have minimal or no impact on asserts like the above but would produce less useful information for other asserts.
  • modifying fitsdiff to not hold so many references.
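
For the first option, pytest documents two switches: passing `--assert=plain` on the command line turns rewriting off for the whole session, and putting the string `PYTEST_DONT_REWRITE` in a module docstring turns it off for that module only. A minimal sketch of the per-module form follows. The module contents and the helper name `check_identical` are made up for illustration, and whether either switch is enough to avoid the memory growth described in this issue has not been measured here.

```python
"""Regression tests for one step (hypothetical module).

PYTEST_DONT_REWRITE
"""
# pytest leaves the asserts in this module as plain Python asserts because
# the module docstring above contains the PYTEST_DONT_REWRITE marker.


def check_identical(identical, report):
    # Hypothetical helper: with a plain assert, a failure carries only the
    # AssertionError and the message that was passed in.
    assert identical, report
```

Asserts that already supply their own message, like the fitsdiff pattern, still report that message. Asserts without one lose the detailed introspection output.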

Taking the example from the description:

diff = FITSDiff(rtdata.output, rtdata.truth, **fitsdiff_default_kwargs)
assert diff.identical, diff.report()

fitsdiff could be modified to compute identical and report up front and then release all references to data arrays, input files, etc. This approach would be complicated by other fitsdiff usage (code that does more than access identical and report), so it may make more sense to capture this pre-computation and reference release in some regtest-specific code. Something like:

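# assumed import; a project-specific FITSDiff subclass would be used the same way
from astropy.io.fits import FITSDiff


def assert_output_matches_truth(output, truth, **kwargs):
    # compare the two files passed in, forwarding any fitsdiff options
    diff = FITSDiff(output, truth, **kwargs)
    # compute both results up front so the assert does not touch diff
    identical = diff.identical
    report = diff.report()
    del diff  # not sure if this is 100% needed
    assert identical, report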
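
One way to check that a helper like this really lets go of the diff object is to watch it through a weak reference. The sketch below uses a stand-in class rather than fitsdiff itself: `FakeDiff`, `summarize`, and `probe` are hypothetical names for illustration, and the behavior shown assumes CPython's reference counting.

```python
import gc
import weakref


class FakeDiff:
    """Hypothetical stand-in for a diff object that holds on to its inputs."""

    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.identical = a == b

    def report(self):
        return f"a has {len(self.a)} bytes, b has {len(self.b)} bytes"


def summarize(a, b, probe):
    """Return (identical, report) and store a weak reference to the diff in probe."""
    diff = FakeDiff(a, b)
    probe.append(weakref.ref(diff))  # a weak reference does not keep diff alive
    identical = diff.identical
    report = diff.report()
    del diff  # drop the only strong reference before returning
    return identical, report


probe = []
identical, report = summarize(bytearray(10), bytearray(20), probe)
gc.collect()  # not needed for refcounting, but rules out cycle leftovers
diff_released = probe[0]() is None  # True once the diff object is gone
```

Only the boolean and the report string leave `summarize`, so anything that later captures the values used in an assert sees those small objects rather than the diff and its inputs.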
