@shanedsnyder shanedsnyder commented Mar 11, 2025

Enable filtering of PyDarshan report record data so that records whose names match a given set of filter patterns can be excluded or included.

This functionality is integrated directly into DarshanReport objects: the relevant routines (e.g., the constructor, open(), read_all(), and other log read routines) all take filter_patterns and filter_mode arguments. filter_patterns is a list of Python regex strings to match record names against, and filter_mode is either "exclude" (don't load any records that match a pattern in filter_patterns) or "include" (only load records that match a pattern in filter_patterns).
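The exclude/include semantics described above can be sketched in a few lines of standalone Python. This is an illustration only, not the PyDarshan implementation: keep_record is a hypothetical helper, and the use of re.search (rather than re.match or re.fullmatch) is an assumption chosen because it is consistent with the anchored patterns used in the command line example in this description.

```python
import re

def keep_record(name, filter_patterns, filter_mode):
    """Hypothetical helper: decide whether a record name survives filtering."""
    if not filter_patterns:
        # No patterns supplied: nothing is filtered out.
        return True
    if filter_mode not in ("exclude", "include"):
        raise ValueError("filter_mode must be 'exclude' or 'include'")
    # Assumption: a record "matches" if ANY pattern is found in its name.
    matched = any(re.search(pattern, name) for pattern in filter_patterns)
    # "include" keeps only matches; "exclude" keeps only non-matches.
    return matched if filter_mode == "include" else not matched

names = ["/file_dir/a.dat", "/tmp/notes.txt", "/tmp/b.dat"]
patterns = [r"^/file_dir/", r"\.txt$"]

included = [n for n in names if keep_record(n, patterns, "include")]
excluded = [n for n in names if keep_record(n, patterns, "exclude")]
print(included)
print(excluded)
```

With the same pattern list, the two modes partition the record names: every name lands in exactly one of the two results.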

This functionality is exposed to the PyDarshan report summary tool via the --exclude_names and --include_names command line arguments. Only one of these two options may be used in a given invocation, though the chosen option may be repeated. In either case, the strings supplied to the option form the list of regexes to filter (exclude or include) against. For example, the following could be used to generate a summary report containing only file records whose names start with /file_dir/ or end in .txt:

python -m darshan summary --include_names="^/file_dir/" --include_names="\.txt$" logfile.darshan
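One way a command line like the one above could be mapped onto a (filter_patterns, filter_mode) pair is sketched below with argparse. This is not the summary tool's actual argument parser: parse_filter_args, the "summary-sketch" program name, and the choice to return None when no filter is given are all assumptions made for illustration.

```python
import argparse

def parse_filter_args(argv):
    """Hypothetical sketch: turn CLI options into (patterns, mode, log_path)."""
    parser = argparse.ArgumentParser(prog="summary-sketch")
    # The two options are mutually exclusive, but each may be repeated;
    # action="append" collects repeated values into one list.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--exclude_names", action="append", default=None)
    group.add_argument("--include_names", action="append", default=None)
    parser.add_argument("log_path")
    args = parser.parse_args(argv)

    if args.include_names:
        return args.include_names, "include", args.log_path
    if args.exclude_names:
        return args.exclude_names, "exclude", args.log_path
    # Assumption: no filtering is represented as (None, None).
    return None, None, args.log_path

patterns, mode, log_path = parse_filter_args(
    ["--include_names=^/file_dir/", r"--include_names=\.txt$", "logfile.darshan"]
)
print(patterns, mode, log_path)
```

Passing both --include_names and --exclude_names to this sketch makes argparse exit with a usage error, which mirrors the "only one of these options" rule.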

Special logic was integrated into the job summary tool and lower-level aggregation/plotting routines to properly handle cases when all records for a given module have been filtered out (i.e., the Darshan log metadata indicates a module has data, but filtering has resulted in no records being stored in memory for the module).
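The situation described above (a module listed in the log metadata but with no records left in memory) can be illustrated with a small guard. The sketch uses plain Python containers rather than real DarshanReport structures; split_modules, modules_in_log, and records_by_module are hypothetical names, and the actual summary tool may detect and handle this case differently.

```python
def split_modules(modules_in_log, records_by_module):
    """Hypothetical guard: separate modules that still have records
    from modules whose records were all filtered out."""
    with_data, emptied = [], []
    for mod in modules_in_log:
        # A module may be named in the metadata yet have no loaded records.
        if len(records_by_module.get(mod, [])) > 0:
            with_data.append(mod)
        else:
            emptied.append(mod)
    return with_data, emptied

# Metadata says three modules have data, but filtering removed all STDIO records.
modules_in_log = ["POSIX", "MPI-IO", "STDIO"]
records_by_module = {"POSIX": ["rec0", "rec1"], "MPI-IO": ["rec2"], "STDIO": []}

with_data, emptied = split_modules(modules_in_log, records_by_module)
print(with_data, emptied)
```

Aggregation and plotting code would then operate only on the first list and skip the second or report it as empty, instead of failing on an empty record set.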

Tests are included for DarshanReport objects and the lower-level plot routines.

@shanedsnyder shanedsnyder force-pushed the snyder/pydarshan-name-filters branch from 388fdeb to a4aef23 Compare April 25, 2025 20:20
@shanedsnyder shanedsnyder changed the title WIP: PyDarshan Report changes to enable name filtering ENH: PyDarshan Report changes to enable name filtering Apr 26, 2025
@shanedsnyder shanedsnyder merged commit 4147a4a into main Apr 30, 2025
19 checks passed
3 participants