Skip to content

Add pytest tests for Pod 5 _read_raw_record fix#50

Open
alexuser wants to merge 3 commits into
throwaway31265:mainfrom
alexuser:fix/pod5-cbor2-buffering-biometrics
Open

Add pytest tests for Pod 5 _read_raw_record fix#50
alexuser wants to merge 3 commits into
throwaway31265:mainfrom
alexuser:fix/pod5-cbor2-buffering-biometrics

Conversation

@alexuser
Copy link
Copy Markdown

@alexuser alexuser commented May 14, 2026

This PR adds a test file for and fixes a bug in the manual CBOR parser where tiny uint sequence numbers were raising .

The original PR #46 introduced the manual reader to avoid cbor2's 4096-byte buffering. The initial implementation only handled (4-byte uint) and (8-byte uint) seq encodings. In practice cbor2 usually encodes small ints as tiny uints (-), so would throw on any realistic file.

I updated to handle all CBOR uint additional-info cases: tiny uints, 8-bit, 16-bit, 32-bit, and 64-bit. I also added with 17 pytest cases covering:

  • normal records (small, >4k, exactly 4096 bytes, two back-to-back)
  • empty placeholder records returning
  • malformed input ()
  • truncated/EOF input ()
  • existing helpers still work (, , )

I had to mock out the logger and sentry initialization so the tests can import without trying to create .

Run them with:

pip install numpy scipy pandas cbor2 pytest
cd biometrics && python -m pytest __tests__/test_raw_record_parser.py -v

17 tests pass on my end.

Sean Passino and others added 3 commits March 14, 2026 09:49
…empty placeholder records break loop

Two bugs combine to produce zero biometric data on Pod 5 (and degrade
accuracy on Pod 3/4):

Bug 1: cbor2 C extension reads in 4096-byte chunks

When the _cbor2 C extension is installed, cbor2.load(f) advances f.tell()
by 4096 bytes per call regardless of actual record size. RAW file records
are 17-5000 bytes each, so only one record per 4096-byte block is ever
decoded — the rest are silently skipped. On Pod 5 with ~2700-byte
piezo-dual records, this skips the majority of data.

Fix: replace cbor2.load(f) with _read_raw_record(f), a manual parser that
uses f.read() byte-by-byte to parse the outer {seq: uint, data: bytes}
CBOR wrapper, keeping f.tell() accurate after every record.

Bug 2: Empty placeholder records raise EOFError mid-file

Pod 5 firmware writes placeholder records with data=b'' (CBOR 0x40 =
empty byte string) as sequence number markers between real records.
cbor2.loads(b'') raises CBORDecodeEOF, which is a subclass of EOFError.
Since both files catch EOFError to detect end-of-file and break the loop,
any placeholder record mid-file terminates the entire read early — even
with megabytes of valid data remaining.

Fix: _read_raw_record() returns None for empty data; callers continue.

Additionally, the original stream.py error handler called break on any
exception, leaving the file position at an unknown offset. Changed to
seek back to last_pos before breaking so recovery is possible.

Tested on Pod 5, cbor2 5.6.5 with _cbor2 C extension, Python 3.10.
Result: 846 vitals rows (HR, HRV, breathing rate) and 400 movement rows
recorded in a single night. Stream runs stably with no OOM kills.
…d tests

The manual CBOR parser only supported 0x1a and 0x1b seq headers, so it broke
when cbor2 emitted tiny uints (0x00-0x17) for small sequence numbers. Fix it to
handle all CBOR uint additional-info encodings.

Also add  covering normal
records, empty placeholders, malformed/truncated input, EOF propagation, and
that existing helpers still work (17 tests total).

Tested:  passes.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 14, 2026

@alexuser is attempting to deploy a commit to the david's projects Team on Vercel.

A member of the Team first needs to authorize it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant