Feature add batchwise filtered reading by smoosbau · Pull Request #32 · seemoo-lab/pyshimmer

smoosbau · 2025-12-04T22:55:55Z

As I needed a possibility to (a) filter the data read from the binary files and (b) read the data batchwise due to the size (my measurements take about 8h (sampling rate 1024Hz)) I added this functionalities.

Filtering is added to the complete BinaryReader class, while batchwise reading is realized using a new subclass.

This Pull Request is currently a draft, as I'm not sure wether the maintainers prefere another realization of that features or discard it completet.
Thus, I'd like to ask @lumagi to review this draft and share his thoughts with me. Afterwards, I can either proceed with documentation and tests, change the design / implementation, or delete this PR.

Any feedback is appreciated. :)

lumagi · 2025-12-06T17:27:48Z

Hi, thank you for the PR and for the effort. Could you please format the code with Black? It's not really mentioned anywhere yet, but I recently formatted the entire code base with it. It looks like there are some formatting changes in the diff.

lumagi · 2025-12-06T18:27:31Z

@smoosbau I integrated the black code check into the pipeline. Can you merge main into your PR? Then it should automatically pick up the check as part of the pipeline.

Basti-iTAS-PLMS · 2025-12-10T08:54:44Z

Hi @lumagi,
sorry, I was busy the last days.
I'll make the requested changes asap! Thanks for checking my PR. :)

smoosbau · 2025-12-15T09:51:02Z

@lumagi, formatting is done.
Please let me know what to do next. :)

lumagi · 2025-12-21T12:51:06Z

Hi, I took a thorough look through the code. Do I understand correctly that the reason behind your batched approach is to avoid loading all the 8h data into memory? I like the batched approach that you are using. I even think it is possible to still use a single Reader class that offers both a batched and a none-batched approach. If all data should be read, the Reader class internally uses the batched approach to read all data into a pre-created buffer. If the user wants to use a batched approach, they can directly use the public method for reading the data batched. I specifically like the yield semantics in Python for this. What do you think about this? Then there is no algorithmic duplication in the code.

lumagi · 2025-12-21T12:52:03Z

I didn't do this when I first wrote the code, but the data array for the non-batched read approach can even be preallocated since the file size is known.

lumagi · 2025-12-21T12:57:53Z

I have yet another question: Why did you always reset the sample counter for the synchronization offset in your batched read operation to 0? Would it not make more sense to keep the sample counter for proper global synchronization?

smoosbau · 2026-01-12T10:46:38Z

Hi @lumagi,
sorry, for the late response. I've been on vacation.
I like the idea of fusing the batched-wise and complete reading approach into one. Didn't do it, because I wanted to keep code changes small (according to my result, achievement of this goal can be discussed :D).

According to the counter, I need to re-check that. I'm currently not sure which counter you mean.

Futhermore, I have some additional ideas for improvement we could / should discuss.
Are you in for a chat elswhere (i.e. MS Teams or Discord)?
I would like to boost reading time by not seeking so much through the files and avoid reloading the complete header everytime, as most of the data in there is static.
If wee agree on common goals, we could add issues and I could work on them. What do you think about that?

Sebastian Moosbauer and others added 7 commits December 3, 2025 18:04

introduce filtering

17bd708

implement batchwise reading

c197031

remove duplicate code

f04c67d

remove dead code

afbc6c6

fix wrong indentation

5035df7

further improve code base

4cde6ef

implement filtering and streaming

3903adf

Sebastian Moosbauer added 3 commits December 15, 2025 10:40

fix filter bug

632e2c9

Merge branch 'main' into feature_add_batchwise_filtered_reading

8d62241

format code using black

4a1cc07

smoosbau marked this pull request as ready for review December 15, 2025 09:50

lumagi mentioned this pull request Feb 10, 2026

New Shimmer hardware revision 3R #24

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature add batchwise filtered reading#32

Feature add batchwise filtered reading#32
smoosbau wants to merge 10 commits intoseemoo-lab:mainfrom
smoosbau:feature_add_batchwise_filtered_reading

smoosbau commented Dec 4, 2025

Uh oh!

lumagi commented Dec 6, 2025

Uh oh!

lumagi commented Dec 6, 2025

Uh oh!

Basti-iTAS-PLMS commented Dec 10, 2025

Uh oh!

smoosbau commented Dec 15, 2025

Uh oh!

lumagi commented Dec 21, 2025

Uh oh!

lumagi commented Dec 21, 2025

Uh oh!

lumagi commented Dec 21, 2025

Uh oh!

smoosbau commented Jan 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

smoosbau commented Dec 4, 2025

Uh oh!

lumagi commented Dec 6, 2025

Uh oh!

lumagi commented Dec 6, 2025

Uh oh!

Basti-iTAS-PLMS commented Dec 10, 2025

Uh oh!

smoosbau commented Dec 15, 2025

Uh oh!

lumagi commented Dec 21, 2025

Uh oh!

lumagi commented Dec 21, 2025

Uh oh!

lumagi commented Dec 21, 2025

Uh oh!

smoosbau commented Jan 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants