
Sync 2.x into master: pipeline refactor (PR-A through PR-E2) and changelog #40

Open
ramonski wants to merge 15 commits into master from sync-refactor-to-master

Conversation

@ramonski

Contributor

Heads-up: This is the last PR that will be merged into master.
After this merge, 2.x will diverge from master to implement new
transport layers and broader architectural changes. Future work will
live on 2.x (and successor branches), not on master.

Summary

Brings master up to date with the pipeline refactor series merged on
2.x:

The head branch is pinned at the last refactor commit, so the HL7 stack and the disk-capture / transport-split work that landed on 2.x afterwards (#33-#39) are intentionally excluded from this merge.

Test plan

  • bin/test --package senaite.astm passes
  • Replay corpus tests pass against the bundled fixtures
  • Spot-check that downstream consumers of Wrapper.to_dict() still
    work with the typed envelope

ramonski added 15 commits May 8, 2026 22:56
The package declares Python 3.8+ in setup.py, so the legacy compat
layer is dead code. Cleaning it up reduces noise for future readers
and removes a layer of indirection over plain str / int / bytes.

- Delete src/senaite/astm/compat.py (basestring, unicode, long,
  make_string, b, u, buffer)
- Replace compat imports with direct str / int / bytes usage
- Inline make_string into Field._set_value (decode bytes as utf-8,
  str() everything else)
- Replace try/except izip_longest with a direct
  'from itertools import zip_longest' in utils.py and mapping.py
- Replace try/except 'from collections import Iterable' with
  'from collections.abc import Iterable' in codec.py
- Replace deprecated logger.warn with logger.warning in utils.py
  and lims.py
- Drop the u() helper in tests/test_fields.py — Python 3 string
  literals are already unicode
Drop Python 2 compatibility shims
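
For illustration, the inlined conversion described above (decode bytes as UTF-8, str() everything else) could look roughly like the sketch below. The helper name `to_text` is hypothetical; in the package the logic sits inside Field._set_value, whose exact body is not shown in this PR description.

```python
def to_text(value):
    """Coerce a field value to str.

    Hypothetical stand-in for the logic inlined into Field._set_value:
    bytes are decoded as UTF-8, everything else goes through str().
    """
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


print(to_text(b"glucose"))  # bytes are decoded
print(to_text(42))          # non-bytes are stringified
```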
Bumps the package to 2.0.0 — the LIMS push API is intentionally
incompatible with the 1.x line.

The Session class now creates a single requests.Session in __init__
and reuses it across all calls, so the TLS handshake is amortised
across the connection rather than repeated for every request. auth(),
get() and post() raise typed exceptions (SenaiteAuthError,
SenaiteHTTPError, SenaiteUnreachableError) instead of swallowing
every Exception into an empty dict, so the caller can react to the
specific failure mode.

post_to_senaite() now authenticates once per call. Retries on push
failure only re-call session.post(); auth is no longer re-run on every
attempt, as it was before. The function returns a PushResult dataclass
(success, attempts, last_error) so the server can act on the result
instead of fire-and-forget.

The top-level senaite.astm.lims module is removed in the same PR;
server.py, sender.py and the tests import from senaite.astm.core.lims
directly. We own all callers, so we don't keep a compat shim.
Lift LIMS push into core/ with typed errors and PushResult
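
A minimal sketch of the authenticate-once, retry-only-the-POST flow described above. The names PushResult and SenaiteHTTPError and the fields success, attempts and last_error come from the commit message. The field types, the `retries` parameter, `push_with_retries` and the `FlakySession` test double are assumptions made for illustration, not the actual senaite.astm.core.lims code.

```python
from dataclasses import dataclass
from typing import Optional


class SenaiteHTTPError(Exception):
    """Stand-in for the typed HTTP error named in the commit message."""


@dataclass
class PushResult:
    # Field names follow the commit message; the types are assumed.
    success: bool
    attempts: int
    last_error: Optional[Exception] = None


def push_with_retries(session, payload, retries=3):
    """Authenticate once, then retry only the POST on failure."""
    session.auth()  # not repeated per attempt; auth errors propagate
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            session.post(payload)
            return PushResult(True, attempt)
        except SenaiteHTTPError as exc:
            last_error = exc
    return PushResult(False, retries, last_error)


class FlakySession:
    """Hypothetical test double: the first `failures` posts raise."""

    def __init__(self, failures):
        self.failures = failures
        self.auth_calls = 0

    def auth(self):
        self.auth_calls += 1

    def post(self, payload):
        if self.failures > 0:
            self.failures -= 1
            raise SenaiteHTTPError("simulated push failure")
        return {"ok": True}
```

With a session that fails twice, the call succeeds on the third attempt while auth() has run only once, so the caller can inspect attempts and last_error instead of guessing what happened.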
Adds senaite.astm.core.envelope with a pydantic-based Envelope and
Metadata model. The envelope is now a pinned contract:

- ENVELOPE_VERSION = '1.0' is exposed in metadata.envelope_version
  on every output, so consumers can detect schema changes.
- Metadata declares the required keys (envelope_version, astm,
  lis2a) and accepts vendor extras (e.g. Roche c111's parsed
  sender component) via extra='allow'.
- The per-record buckets (H, P, O, R, C, M, L, Q) default to empty
  lists so the top-level shape is stable regardless of which
  record types a given instrument emits.
- Per-record dict shapes are intentionally left loose — that lives
  in the per-instrument record classes.

Wrapper now exposes to_envelope() returning the typed model and
keeps to_dict()/to_json() as JSON-serialisable convenience wrappers
around it. The 11 golden snapshots are regenerated to include the
envelope_version field and the empty-list defaults.

pydantic>=2 is added to install_requires.
Define a typed Envelope schema for Wrapper.to_dict()
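
On the consumer side, the pinned envelope_version can be used to guard against schema changes. The sketch below works on the plain dict returned by Wrapper.to_dict(); the policy of gating on the major version, the `check_envelope` helper and the trimmed sample payload are assumptions for illustration (real output carries more metadata keys and record buckets).

```python
SUPPORTED_MAJOR = "1"  # assumption: consumers gate on the major version


def check_envelope(data):
    """Return the envelope version, or raise if it is missing or unsupported."""
    version = data.get("metadata", {}).get("envelope_version")
    if version is None:
        raise ValueError("missing metadata.envelope_version")
    if version.split(".")[0] != SUPPORTED_MAJOR:
        raise ValueError("unsupported envelope version: %s" % version)
    return version


# Trimmed, hypothetical payload: only the keys this check needs.
sample = {"metadata": {"envelope_version": "1.0"}, "H": [], "R": []}
print(check_envelope(sample))
```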
Three behaviours bit us in production and made the test output
hard to read. This PR fixes the symptoms without rewriting the
descriptor framework.

NotUsedField no longer warns on assignment. The cobas_c311 fixture
alone produced ~78 UserWarning entries per parse, drowning out
real warnings without giving the operator anything actionable. The
field now silently drops the assigned value.

SetField accepts unknown values by default and logs them at debug
level. A device firmware update that introduces a new status code
should not crash parsing of every message that contains it. Pass
strict=True to restore the legacy raise-on-unknown behaviour.
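
The tolerant-by-default behaviour can be sketched as a plain function. This is not the SetField descriptor itself: `validate_choice` and its signature are hypothetical, and only the strict flag and the debug-level logging are taken from the description above.

```python
import logging

logger = logging.getLogger(__name__)


def validate_choice(value, allowed, strict=False):
    """Accept unknown values unless strict=True (hypothetical helper)."""
    if value in allowed:
        return value
    if strict:
        # Legacy behaviour: unknown values are an error.
        raise ValueError("unknown value: %r" % (value,))
    # Tolerant default: keep the value, leave a trace at debug level.
    logger.debug("accepting unknown value %r", value)
    return value
```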

DateField, TimeField and DateTimeField now accept a tuple of
parse_formats in addition to the canonical format. Subclasses can
extend it to handle vendor-specific date strings without rewriting
parse logic. The canonical format is always tried first and is
still used for serialisation, so existing snapshots are unchanged.
Make field descriptors quiet and tolerant
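
The canonical-first parsing order for the date fields could be sketched as follows. The canonical format string, the `parse_datetime` helper and the vendor format in the usage line are assumptions; the actual fields are descriptors with their own per-class formats.

```python
from datetime import datetime

CANONICAL = "%Y%m%d%H%M%S"  # assumed canonical format (YYYYMMDDHHMMSS)


def parse_datetime(text, parse_formats=()):
    """Try the canonical format first, then any extra vendor formats."""
    for fmt in (CANONICAL,) + tuple(parse_formats):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("no format matched: %r" % (text,))


# A hypothetical vendor-specific format is only tried after the canonical one.
value = parse_datetime("08/05/2026 22:56", parse_formats=("%d/%m/%Y %H:%M",))
# Serialisation always uses the canonical format.
print(value.strftime(CANONICAL))
```

Because serialisation never uses the extra formats, adding them does not change existing output.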
First step of unifying senaite.astm.instruments and
senaite.astm.adapters into a single mechanism.

- New core.instrument module:
  - Instrument base class (name, header_regex, record_map,
    can_handle, preparse, get_metadata)
  - register_instrument decorator with shape validation
  - find_instrument resolver that raises
    AmbiguousInstrumentError instead of silently picking one
    match when two regexes overlap
- Wrapper.get_mapping() now consults the registry first and
  falls back to today's pkgutil-based discovery. Instrument-
  specific metadata is merged via either path.

No instrument has been migrated yet, so behaviour is unchanged
for all existing analyzers. PR-E2 will migrate them and remove
the legacy discovery path together with senaite.astm.adapters.

269 tests pass (+10 new); flake8 clean.
Introduce the instrument registry (PR-E1)
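
A simplified sketch of the registry idea: a decorator that validates and registers an Instrument subclass, and a resolver that refuses to pick between overlapping matches. The class and function names follow the commit message, but the bodies, the `_REGISTRY` list, the validation rules and the `DemoAnalyzer` class are assumptions, not the code in core.instrument.

```python
import re


class AmbiguousInstrumentError(Exception):
    """Raised when more than one registered instrument matches."""


_REGISTRY = []  # hypothetical module-level store


class Instrument:
    name = ""
    header_regex = None
    record_map = {}

    def can_handle(self, message):
        return bool(self.header_regex.search(message))


def register_instrument(cls):
    """Validate the class shape, then register an instance."""
    if not cls.name or not hasattr(cls.header_regex, "search"):
        raise TypeError("%s needs a name and a compiled header_regex"
                        % cls.__name__)
    _REGISTRY.append(cls())
    return cls


def find_instrument(message):
    """Return the single matching instrument, or None if nothing matches."""
    matches = [inst for inst in _REGISTRY if inst.can_handle(message)]
    if len(matches) > 1:
        names = ", ".join(inst.name for inst in matches)
        raise AmbiguousInstrumentError("multiple matches: " + names)
    return matches[0] if matches else None


@register_instrument
class DemoAnalyzer(Instrument):
    # Hypothetical instrument used only for this example.
    name = "demo"
    header_regex = re.compile(rb"^H\|.*DemoAnalyzer")
```

Raising on ambiguity rather than returning the first match surfaces overlapping regexes at resolution time instead of silently routing messages to the wrong record map.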
Second and final step of the instrument unification:

- Every senaite.astm.instruments.* module now declares an
  Instrument subclass at the bottom and registers itself via
  @register_instrument. Module-level HEADER_RX is a compiled
  bytes regex; the old get_mapping()/get_metadata() helpers are
  gone. Each module also exposes INSTRUMENT for direct test
  access.
- The two zope-adapter data handlers (mini_vidas, spotchem se1520)
  have moved onto their corresponding Instrument via a new
  raw_data_regex attribute and a handle_raw_data(protocol, data)
  hook. ASTMProtocol.handle_data now dispatches via
  find_raw_data_handler instead of a Components registry.
- Wrapper.get_mapping resolves entirely through the registry and
  falls back to DEFAULT_MAPPING for unknown headers. The pkgutil
  iter_modules path is gone, along with self.module.
- senaite.astm.adapters, senaite.astm.interfaces, the
  adapter_registry global, and the zope.interface dependency are
  removed.
- instruments/__init__.py imports every submodule so the
  decorators run at import time.
- The Instrument base class also gained get_metadata; the wrapper
  now actually calls it (the old wrapper had latent code that
  never fired for any module). Envelope snapshots regenerated
  accordingly to include version + header_rx in metadata.
- New test_replay_corpus.py walks $ASTM_REPLAY_DIR (~50k CERMEL
  captures) through Wrapper and asserts the parse failure ratio
  stays under 5%. Pre-/post-migration ratios are identical at
  1549/50382 (~3.07%), confirming no real-traffic regression.
- Existing per-instrument tests updated to access
  <module>.INSTRUMENT.record_map.

270 tests pass (269 + replay). Existing pre-migration parser
quirks (mostly truncated c111 captures) are explicitly tolerated
by the replay threshold.
Migrate every instrument to the registry (PR-E2)
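
The replay threshold check can be sketched as below. The 5% limit and the 1549/50382 figure are from the commit message; `failure_ratio`, `toy_parse` and the directory walk are assumptions standing in for test_replay_corpus.py, which runs captures through Wrapper.

```python
import os

MAX_FAILURE_RATIO = 0.05  # threshold stated in the commit message


def failure_ratio(directory, parse):
    """Run parse() over every file under directory; return failed / total."""
    total = failed = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            with open(os.path.join(root, name), "rb") as fh:
                data = fh.read()
            total += 1
            try:
                parse(data)
            except Exception:
                failed += 1
    return failed / total if total else 0.0


def toy_parse(data):
    """Hypothetical parser: accept anything that starts with an H record."""
    if not data.startswith(b"H|"):
        raise ValueError("not an ASTM header")


# The ratio reported for the corpus stays under the threshold.
assert 1549 / 50382 < MAX_FAILURE_RATIO
```

A ratio-based assertion tolerates known-bad captures while still failing if a change causes a broad parsing regression.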
ramonski added the Enhancement ✨ (Improvement to existing functionality) and Cleanup 🧹 (Code cleanup and refactoring) labels May 28, 2026
@ramonski ramonski requested a review from xispa May 28, 2026 20:35

Labels

Cleanup 🧹 (Code cleanup and refactoring), Enhancement ✨ (Improvement to existing functionality)


1 participant