Available filters

SPADE includes a set of filters to manipulate provenance metadata before it is committed to storage. They are described below.

Blacklist

This filter excludes files from persistent storage based on their names. The regular expression used to match filenames is specified in cfg/blacklist.filter.config.
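The idea of name-based exclusion can be sketched in a few lines of Python. This is only an illustration, not SPADE's implementation: the expression below is hypothetical (the real one lives in cfg/blacklist.filter.config), and the sketch assumes the whole filename must match, which the text above does not state.

```python
import re

# Hypothetical expression; SPADE would read the real one from
# cfg/blacklist.filter.config.
BLACKLIST = re.compile(r".*\.(tmp|swp)")


def is_blacklisted(filename):
    # Assumption: the entire filename has to match the expression.
    return BLACKLIST.fullmatch(filename) is not None


def filter_files(filenames):
    # Keep only the files whose names are not blacklisted.
    return [name for name in filenames if not is_blacklisted(name)]


kept = filter_files([
    "/home/u/report.txt",
    "/tmp/.report.txt.swp",
    "/var/cache/x.tmp",
])
```

With this pattern, only /home/u/report.txt would be passed on to storage.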

OPM2PROV

This filter translates OPM vertex and edge elements into corresponding W3C PROV ones.
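The translation can be pictured as a pair of lookup tables. The tables below follow the commonly cited correspondence between OPM and W3C PROV terms; the source does not list the filter's exact mapping, so treat both tables and the translate helper as assumptions made for illustration.

```python
# Assumed correspondence between OPM and PROV vertex types.
OPM_TO_PROV_VERTEX = {
    "Process": "Activity",
    "Artifact": "Entity",
    "Agent": "Agent",
}

# Assumed correspondence between OPM and PROV edge types.
OPM_TO_PROV_EDGE = {
    "Used": "Used",
    "WasGeneratedBy": "WasGeneratedBy",
    "WasTriggeredBy": "WasInformedBy",
    "WasDerivedFrom": "WasDerivedFrom",
    "WasControlledBy": "WasAssociatedWith",
}


def translate(element_type, mapping):
    # Types missing from the table pass through unchanged
    # (a choice made for this sketch only).
    return mapping.get(element_type, element_type)


vertex_type = translate("Artifact", OPM_TO_PROV_VERTEX)
edge_type = translate("WasTriggeredBy", OPM_TO_PROV_EDGE)
```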

Fusion

The Fusion filter can be used to merge vertices from related provenance streams. The configuration for this filter is stored in cfg/fusion.config and has the following format:

-- BEGIN FILE --
<1st reporter>
<2nd reporter>
<1st reporter>.<annotation>=<2nd reporter>.<annotation>
...
-- END FILE --

To merge two streams, specify the names of both reporters on the first two lines of the config file.

The remaining lines specify the merging rules. Each rule pairs an annotation from the first reporter with an annotation from the second, in the form <1st reporter>.<annotation>=<2nd reporter>.<annotation>.

The Fusion filter checks whether incoming vertices satisfy the merging rules. Vertices that match the criteria are fused into a single vertex.
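To make the config format and the matching step concrete, here is a minimal Python sketch. It is not the filter's actual code. The reporter names ReporterA and ReporterB and the pid annotation are hypothetical, and the sketch assumes that a rule is satisfied when the two named annotations have equal values, that every rule must hold, and that the fused vertex is the union of both annotation sets.

```python
# Hypothetical config contents in the format of cfg/fusion.config.
SAMPLE_CONFIG = """\
ReporterA
ReporterB
ReporterA.pid=ReporterB.pid
"""


def parse_fusion_config(text):
    # First two non-empty lines name the reporters; the rest are rules.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    first, second = lines[0], lines[1]
    rules = []
    for line in lines[2:]:
        left, right = line.split("=", 1)
        left_reporter, left_annotation = left.split(".", 1)
        right_reporter, right_annotation = right.split(".", 1)
        if (left_reporter, right_reporter) != (first, second):
            raise ValueError("rule does not match reporters: " + line)
        rules.append((left_annotation, right_annotation))
    return first, second, rules


def satisfies_rules(rules, first_vertex, second_vertex):
    # Assumption: all rules must hold, and a rule holds when the
    # two annotations are present and have equal values.
    return all(
        a in first_vertex and b in second_vertex
        and first_vertex[a] == second_vertex[b]
        for a, b in rules
    )


def fuse(first_vertex, second_vertex):
    # Assumption: on a key conflict the first reporter's value wins.
    merged = dict(second_vertex)
    merged.update(first_vertex)
    return merged


first, second, rules = parse_fusion_config(SAMPLE_CONFIG)
v1 = {"pid": "42", "name": "cat"}       # vertex from the first reporter
v2 = {"pid": "42", "function": "main"}  # vertex from the second reporter
fused = fuse(v1, v2) if satisfies_rules(rules, v1, v2) else None
```

Here the two vertices share the same pid value, so they would be fused into one vertex carrying the annotations of both.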

IORuns

Reads and writes in an operating system often occur as runs of one type or the other. For example, a single function call that reads a file may result in multiple read system calls. This can produce a high volume of provenance metadata, especially when reading or writing large files. The IORuns filter can be used to fuse consecutive edges of the same type of I/O operation (i.e., either read or write) into a single edge.
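Collapsing runs can be sketched as follows. The edge representation (a dict with process, file, and operation keys) and the count annotation on the fused edge are assumptions for this example; the source does not describe how the filter represents edges internally.

```python
from itertools import groupby


def fuse_io_runs(edges):
    # Group consecutive edges that share process, file, and operation,
    # and emit one edge per run.
    fused = []
    run_key = lambda e: (e["process"], e["file"], e["operation"])
    for (process, file, operation), run in groupby(edges, key=run_key):
        fused.append({
            "process": process,
            "file": file,
            "operation": operation,
            "count": len(list(run)),  # hypothetical annotation
        })
    return fused


operations = ["read", "read", "read", "write", "write", "read"]
edges = [
    {"process": "p1", "file": "data.bin", "operation": op}
    for op in operations
]
fused_edges = fuse_io_runs(edges)
```

The six input edges form three runs (read, write, read), so three edges remain after fusion.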

CycleAvoidance

This filter tracks the ancestors of a file and creates a new version each time a new ancestor is encountered.
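A rough Python sketch of this bookkeeping is shown below. The class and method names are hypothetical, and the sketch assumes that an ancestor is identified by a simple string (for example, the process that wrote to the file) and that versions are counted from zero.

```python
class AncestorVersioner:
    """Hypothetical helper: one version bump per newly seen ancestor."""

    def __init__(self):
        self.ancestors = {}  # file name -> set of ancestors seen so far
        self.versions = {}   # file name -> current version number

    def observe(self, file, ancestor):
        seen = self.ancestors.setdefault(file, set())
        if ancestor not in seen:
            # A new ancestor was encountered, so start a new version.
            seen.add(ancestor)
            self.versions[file] = self.versions.get(file, 0) + 1
        return self.versions[file]


versioner = AncestorVersioner()
first_write = versioner.observe("a.txt", "p1")   # new ancestor
repeat_write = versioner.observe("a.txt", "p1")  # already known
other_write = versioner.observe("a.txt", "p2")   # new ancestor
```

Repeated writes from a known ancestor leave the version unchanged, while each previously unseen ancestor produces a new one.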

GraphFinesse

This filter tracks the entire lineage graph of a file and creates a new version if a new edge would have created a cycle.
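The cycle check can be illustrated with a small graph sketch. The names here are hypothetical and the details are assumptions: nodes are (name, version) pairs, an edge points from a node to the node it depends on, and the source endpoint of the offending edge is the one that receives a new version.

```python
class LineageGraph:
    """Hypothetical sketch: version a node only when a cycle would form."""

    def __init__(self):
        self.edges = {}    # (name, version) -> set of nodes it depends on
        self.version = {}  # name -> current version number

    def _current(self, name):
        return (name, self.version.setdefault(name, 0))

    def _reaches(self, start, target):
        # Depth-first search along dependency edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.edges.get(node, ()))
        return False

    def add_edge(self, source, destination):
        src = self._current(source)
        dst = self._current(destination)
        if self._reaches(dst, src):
            # The edge would close a cycle, so use a fresh version.
            self.version[source] += 1
            src = self._current(source)
        self.edges.setdefault(src, set()).add(dst)
        return src, dst


graph = LineageGraph()
read_edge = graph.add_edge("proc", "file")   # proc used file
write_edge = graph.add_edge("file", "proc")  # file generated by proc
other_edge = graph.add_edge("out", "proc")   # no cycle, no new version
```

In this example the write back to file would close a cycle through proc, so the edge is attached to a new version of file, while the unrelated write to out keeps its original version.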

This material is based upon work supported by the National Science Foundation under Grants OCI-0722068, IIS-1116414, and ACI-1547467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
