Cache parsing and preprocessing results #179

@Pennycook

Description

Feature/behavior summary

Running codebasin, cbi-cov or cbi-tree can take a long time for large projects, because each file must be parsed and preprocessed each time the command is run.

We could make things significantly faster (and more easily enable new use-cases) if we cached parsing and preprocessing results between command invocations.

Request attributes

  • Would this be a refactor of existing code?
  • Does this proposal require new package dependencies?
  • Would this change break backwards compatibility?

Related issues

No response

Solution description

To cache parse results:

  • Introduce a way to serialize specialization trees
  • Cache the specialization tree, using something like a hash of the file contents as a key
  • Skip parsing if a specialization tree already exists in the cache
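A minimal sketch of the lookup described above, assuming the specialization tree can be represented as a JSON-serializable dict. The names `content_key` and `load_or_parse`, and the `parse` callable, are hypothetical stand-ins and not part of the existing code base.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def content_key(path: Path) -> str:
    """Return a SHA-256 hex digest of the file's bytes, used as the cache key."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_or_parse(path: Path, cache_dir: Path, parse: Callable[[Path], dict]) -> dict:
    """Return the cached tree for `path` if one exists; otherwise parse and cache it.

    `parse` is a hypothetical stand-in for the real parser. It is assumed to
    return a JSON-serializable representation of the specialization tree.
    """
    entry = cache_dir / f"{content_key(path)}.json"
    if entry.exists():
        # Cache hit: the file contents are unchanged, so skip parsing.
        return json.loads(entry.read_text())
    tree = parse(path)
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry.write_text(json.dumps(tree))
    return tree
```

Because the key depends only on the file contents, identical files (e.g., the same header reached through different paths) would share one cache entry, and editing a file naturally produces a new key rather than requiring explicit invalidation.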

To cache preprocessing results:

  • Cache coverage JSON files, using a combination of the specialization tree and PreprocessorConfiguration as the key
  • Skip preprocessing if coverage for a specialization tree already exists in the cache
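One way the combined key could be derived is sketched below. It assumes both the specialization tree and the preprocessor configuration can be reduced to JSON-serializable dicts; `coverage_key` and the shape of `config` (defines, include paths) are hypothetical and only for illustration.

```python
import hashlib
import json


def coverage_key(tree: dict, config: dict) -> str:
    """Derive a single cache key from a serialized tree and a configuration.

    Both arguments are hypothetical JSON-serializable stand-ins for the
    specialization tree and the PreprocessorConfiguration.
    """
    # sort_keys makes the digest independent of dict insertion order.
    # List order (e.g. include paths) is preserved, since it can matter.
    payload = json.dumps(
        {"tree": tree, "config": config},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With a key like this, the same file preprocessed under two different sets of defines would produce two separate coverage entries, while repeated runs with an unchanged configuration would hit the cache.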

Additional notes

There are a few options for where we could save these files, with different trade-offs.

Using .cbi/cache would result in a per-project cache, which may help developers to identify that there is a cache and investigate how it is being used.

Using .cache/cbi would result in a per-user cache, which might be easier to handle (we wouldn't have to worry about concurrent updates to the cache from multiple users) and would allow a single cache to store the results for common files (e.g., library headers).

I'm leaning towards .cache/cbi, with an option to allow developers to specify a different cache location.
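A rough sketch of how the location could be resolved, assuming .cache/cbi is relative to the user's home directory. The `override` argument and the `CBI_CACHE_DIR` environment variable are hypothetical names; no such option exists today.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_cache_dir(override: Optional[str] = None) -> Path:
    """Pick the cache directory: explicit override, then environment, then default."""
    if override:
        # e.g. a value passed from a (hypothetical) command-line option
        return Path(override).expanduser()
    env = os.environ.get("CBI_CACHE_DIR")  # hypothetical variable name
    if env:
        return Path(env).expanduser()
    # Assumed per-user default, relative to the home directory.
    return Path.home() / ".cache" / "cbi"
```

A developer who prefers a per-project cache could then pass something like `.cbi/cache` as the override without changing the default behavior for everyone else.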

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)
