ENH: PyDarshan Report changes to enable name filtering #1017
Enable filtering of PyDarshan report record data to exclude/include record names matching some given filter patterns.
This functionality is integrated into DarshanReport objects directly, with relevant routines (e.g., constructor, `open()`, `read_all()`, other log read routines) all taking `filter_patterns` and `filter_mode` arguments. `filter_patterns` is a list of Python regex strings to match against and `filter_mode` is either "exclude" (don't load any records that match strings in `filter_patterns`) or "include" (only load records that match strings in `filter_patterns`).
This functionality is exposed to the PyDarshan report summary tool via `--exclude_names` and `--include_names` command line arguments. Only one of these options may be provided. In either case, the strings supplied to these command line arguments are used to form the regex list to filter (exclude or include) against. For example, the following could be used to generate a summary report containing only file record names starting with `/file_dir/` or ending in `.txt`:
`python -m darshan summary --include_names="^/file_dir/" --include_names="\.txt$" logfile.darshan`
Special logic was integrated into the job summary tool and lower-level aggregation/plotting routines to properly handle cases when all records for a given module have been filtered out (i.e., the Darshan log metadata indicates a module has data, but filtering has resulted in no records being stored in memory for the module).
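The include/exclude semantics described above can be illustrated with a small standalone sketch. This is not the PyDarshan implementation: `filter_names` is a hypothetical helper written only for illustration, and it assumes patterns are applied with `re.search` (so anchors such as `^` and `$` behave as in the command line example). It also shows how a module can end up with no records after filtering.

```python
import re


def filter_names(names, filter_patterns, filter_mode="exclude"):
    """Return the record names that survive filtering.

    Hypothetical helper for illustration only; not part of PyDarshan.
    Assumes a name "matches" if any pattern is found via re.search.
    """
    if filter_mode not in ("exclude", "include"):
        raise ValueError("filter_mode must be 'exclude' or 'include'")
    compiled = [re.compile(pattern) for pattern in filter_patterns]
    kept = []
    for name in names:
        matched = any(rx.search(name) for rx in compiled)
        # "include" keeps matching names; "exclude" keeps non-matching names.
        if (filter_mode == "include") == matched:
            kept.append(name)
    return kept


# Made-up record names, used only to exercise the helper.
names = ["/file_dir/data.bin", "/home/user/notes.txt", "/tmp/scratch.dat"]
patterns = [r"^/file_dir/", r"\.txt$"]

included = filter_names(names, patterns, "include")
excluded = filter_names(names, patterns, "exclude")

# A pattern that matches every name leaves nothing behind in "exclude" mode,
# which is the "all records filtered out" case the summary tool must handle.
all_filtered = filter_names(names, [r"^/"], "exclude")
```

With these inputs, "include" mode keeps the two names that match either pattern, "exclude" mode keeps only the remaining name, and the last call returns an empty list.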
Testing changes included for DarshanReport objects and lower-level plot routines.