Consider additional dependencies for performance, security #29

@bollwyvl

Thanks again for graphtage!

While I haven't used XML diffing in anger yet, it would be interesting to explore some (optional) dependencies to increase the robustness and performance of that component:

  • lxml offers a largely ElementTree-compatible API (lxml.etree), but better performance, than the stdlib
  • defusedxml helps prevent well-known malicious XML attacks, and works with the stdlib or lxml
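
To make the "optional dependency" idea concrete, here is a minimal sketch of picking an XML backend at import time and falling back to the stdlib when nothing else is installed. The helper name `pick_xml_parser` and the preference order (defusedxml first, then lxml) are assumptions for illustration, not anything graphtage does today.

```python
import xml.etree.ElementTree as stdlib_etree


def pick_xml_parser():
    """Return (backend_name, fromstring) for the best available XML backend.

    Hypothetical helper: the preference order is an assumption.
    """
    try:
        # defusedxml wraps the stdlib parser and rejects known attack patterns
        from defusedxml import ElementTree as safe_etree
        return "defusedxml", safe_etree.fromstring
    except ImportError:
        pass
    try:
        # lxml.etree mirrors most of the ElementTree API
        from lxml import etree as lxml_etree
        return "lxml", lxml_etree.fromstring
    except ImportError:
        # Neither optional package is installed: use the stdlib
        return "stdlib", stdlib_etree.fromstring


name, fromstring = pick_xml_parser()
root = fromstring("<a><b>1</b></a>")
print(name, root.tag, root[0].text)
```

Whichever backend is chosen, the resulting element supports the same `tag`, indexing, and `text` access used here, which is what would make the swap cheap.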

Similarly, a number of far higher-performance JSON parsers are available, with different ease-of-installation/speed/memory tradeoffs, for which it might be hard to anticipate user preference.

If there is interest, I could probably take a stab at a PR for this:

  • change the json API to accept an optional parser
    • add extras with sensible minimum version pins
  • change the xml API to accept an optional parser
    • add defusedxml in install_requires
    • add lxml in an extras section
      • or install_requires, as "complexity of installation" is no longer really a concern once scipy enters the picture...
  • test against different combinations with tox in CI
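
As a rough sketch of the "accept an optional parser" part of the list above: the function names `parse_json` and `parse_xml`, and the extras name `"xml"`, are hypothetical and do not reflect graphtage's actual API. A third-party parser would be passed in the same way as the stdlib stand-in used below.

```python
import json
import xml.etree.ElementTree as ET
from decimal import Decimal
from functools import partial
from typing import Any, Callable, Optional


def parse_json(text: str, parser: Optional[Callable[[str], Any]] = None) -> Any:
    """Parse JSON with a caller-supplied parser, defaulting to json.loads."""
    loads = parser if parser is not None else json.loads
    return loads(text)


def parse_xml(text: str, parser: Optional[Callable[[str], Any]] = None) -> Any:
    """Parse XML with a caller-supplied parser, defaulting to the stdlib."""
    fromstring = parser if parser is not None else ET.fromstring
    return fromstring(text)


# Hypothetical setup.py fragment: the faster XML backend stays opt-in.
# Version pins are deliberately omitted; choosing them is part of the PR.
EXTRAS_REQUIRE = {"xml": ["lxml"]}

# Stand-in for a third-party JSON parser: any callable taking a str works.
decimal_loads = partial(json.loads, parse_float=Decimal)

default_doc = parse_json('{"a": 1}')
custom_doc = parse_json("1.5", parser=decimal_loads)
xml_root = parse_xml("<r><c/></r>")
```

Keeping the parser as a plain callable means the tox matrix only needs to vary which packages are installed, not which code paths are exercised.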
