Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

Changes

Deprecated

Removed

Fixed

Security

[3.0.0] - 2026-06-19

Added

  • BREAKING: Model update v5.0: add optional fields to entity type models

Changes

[2.1.0] - 2026-06-17

Added

  • added function search_preview_items in backend_api connector for advanced reference filtering

Changes

Fixed

  • fix fuzzing tests

[2.0.1] - 2026-05-26

[2.0.0] - 2026-05-26

Added

  • BREAKING: Added workflow rule to all rule sets
    • it defines forbidden targets for publishing of merged items
    • This change affects what kind of data is stored in database and might therefore have unexpected side effects
    • If your repo depends on mex-common AND on mex-backend, make sure to update both to versions that include the workflow rule
  • new "is_item_publishable" function

Changes

Removed

  • Post endpoint for preview_merged_item

[1.19.0] - 2026-04-17

Added

  • add get_preview_item by stableTargetId to BackendAPIConnector
  • BREAKING: add configuration parameter ops_dir. Settings are now read from ops_dir/config/.env and ops_dir/config/secrets/*. Make sure your environment variable MEX_OPS_DIR points to your local mex-ops directory. For dependent repositories: change the type of your Settings parameters that point to migrated files (e.g. certificates) to OpsPath.
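
The directory layout described in the ops_dir entry can be pictured with a small path helper. This is only a sketch: `resolve_ops_config` is a hypothetical name and not part of mex-common, and the fallback directory `mex-ops` is an assumption made for illustration.

```python
import os
from pathlib import Path


def resolve_ops_config(ops_dir: Path) -> tuple[Path, Path]:
    """Return the .env file and the secrets directory below ops_dir/config."""
    config_dir = ops_dir / "config"
    return config_dir / ".env", config_dir / "secrets"


# MEX_OPS_DIR is the variable named in the entry above; the fallback is assumed.
ops_dir = Path(os.environ.get("MEX_OPS_DIR", "mex-ops"))
env_file, secrets_dir = resolve_ops_config(ops_dir)
print(env_file, secrets_dir)
```

If the printed paths do not exist on your machine, MEX_OPS_DIR most likely does not point at your local mex-ops checkout.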

[1.18.1] - 2026-04-01

Changes

[1.18.0] - 2026-03-25

Added

  • fuzzing test for PreviewItem generation using mex-artificial

Changes

  • verify ldap server certificate

Fixed

  • adding multiple OptionalValues via an AdditiveRule no longer breaks creation of PreviewItem

[1.17.0] - 2026-03-17

Changes

[1.16.1] - 2026-03-02

Changes

  • expand ALL_REFERENCE_FIELD_NAMES to include additive reference fields (supersededBy)

[1.16.0] - 2026-02-19

Added

  • expose nested level implementations on the higher level module
  • added additional model lookups
    • RULE_MODEL_CLASSES_BY_TYPE_BY_NAME
    • EXTRACTED_AND_RULE_MODEL_CLASSES_BY_NAME
    • SEARCHABLE_FIELDS
    • SEARCHABLE_CLASSES
    • NESTED_ENTITY_TYPES_BY_FIELD_BY_CLASS_NAME
    • NESTED_ENTITY_TYPES_BY_CLASS_NAME
    • REFERENCED_ENTITY_TYPES_BY_FIELD_BY_CLASS_NAME
    • STRINGIFIED_TYPES_BY_FIELD_BY_CLASS_NAME
    • REFERENCED_ENTITY_TYPES_BY_CLASS_NAME
    • REFERENCED_FIELD_REFERENCING_TUPLES
    • INBOUND_REFERENCE_FIELDS_BY_CLASS_NAME
    • ALL_REFERENCE_FIELD_NAMES
  • added utils camelcase_to_title and clean_dict
  • added missing methods to backend_api connector
    • system_status
    • flush_graph
    • delete_rule_set
    • delete_merged_item
    • match_item
  • add mex-editor identifiers: identifier, stableTargetId and identifierInPrimarySource

Fixed

  • explicit export of mex-editor identifiers

[1.15.0] - 2026-02-18

Added

  • add mex-editor identifiers: identifier, stableTargetId and identifierInPrimarySource

Changes

  • replace pandas Series with plain dict in parse_csv

[1.14.0] - 2026-02-13

Added

  • support for python 3.14
  • settings classes log table representation in model validation hook
  • PathWrapper can be compared to regular pathlib Path objects

Changes

Deprecated

  • deprecated settings_cls keyword for entrypoint decorator

Removed

  • removed all cli arguments for entrypoints, except --pdb

[1.13.2] - 2026-01-30

Added

  • added LDAPBackendAPIConnector to use 'merged-person-from-login'

[1.13.1] - 2026-01-23

Changes

Fixed

  • updated mex-model to 4.9.1

[1.13.0] - 2026-01-22

Added

  • support for python 3.11 - 3.13

Changes

[1.12.1] - 2026-01-20

Added

  • transform_ldap_person_and_unit_ids_to_extracted_person convenience function for ldap extractors

[1.12.0] - 2026-01-06

Added

  • added field descriptions according to mex.model
  • added two new string helper functions split_to_camel and camel_to_split
  • added merged model verification to test_model_schemas
  • add 'supersededBy' to merged and preventive items, and to additive rules

Changes

  • update mex-model to version 4.8 featuring translated entity types and repo maintenance
  • updated template to https://github.com/robert-koch-institut/mex-template/commit/2039340
  • updated template to https://github.com/robert-koch-institut/mex-template/commit/c5ff3e
  • BREAKING: expects organization ID instead of organization in:
    • transform_organigram_unit_to_extracted_organizational_unit
    • transform_organigram_units_to_organizational_units
  • update mex-model to version 4.7 where "supersededBy" is added to merged items
  • use model_title_generator instead of hardcoding model titles
  • move hadPrimarySource and identifierInPrimarySource from concrete mapping models to the base mapping class
  • update mex-model to version 4.6 where extracted and merged schemas are split
  • update test_model_schemas to be more readable and remove unnecessary special cases
  • move settings for ldap search base

[1.11.0] - 2025-12-04

Added

  • add merged_person_from_login method to BackendApiConnector
  • RestrictedTextLanguage allowing only EN or DE if confidence >=0.75

Changes

[1.10.0] - 2025-11-20

Changes

[1.9.0] - 2025-11-03

Added

  • add organigram convenience function to get extracted units

Changes

[1.8.2] - 2025-10-28

Fixed

  • fix TemporalEntity equality check
  • fix DistributionRuleSetResponse entity type

[1.8.1] - 2025-10-22

Changes

  • update mex-model dependency to 4.4

[1.8.0] - 2025-10-22

Changes

  • move topological_sort from extractors to common
  • better align mex.common.models with mex-model
  • update mex-model dependency to 4.3
  • parse_csv now summarizes error logs in batches with amount of successful validations.

[1.7.0] - 2025-10-17

Changes

  • BREAKING: expects primary source id instead of primary source in:
    • get_merged_ids_by_employee_ids
    • get_merged_ids_by_query_string
    • transform_ldap_persons_to_extracted_persons
    • transform_ldap_functional_accounts_to_extracted_contact_points
    • transform_ldap_persons_with_query_to_extracted_persons
    • transform_ldap_person_to_extracted_person
    • transform_ldap_functional_account_to_extracted_contact_point
    • transform_any_ldap_actor_to_extracted_persons_or_contact_points
    • transform_orcid_person_to_mex_person
    • get_extracted_organizational_unit_with_parents
    • transform_organigram_unit_to_extracted_organizational_unit
    • transform_organigram_units_to_organizational_units
    • transform_wikidata_organizations_to_extracted_organizations
    • transform_wikidata_organization_to_extracted_organization
  • BREAKING: returns primary source id dict instead of primary source dict in extracted_primary_sources
  • BREAKING: email changed typing to annotated str for ContactPoint, OrgUnit and Person

Removed

  • BREAKING: get_primary_sources_by_name
  • BREAKING: EMAIL_PATTERN and Email removed from mex.common.types
  • BREAKING: EMAIL_FIELDS_BY_CLASS_NAME removed from mex.common.fields

[1.6.0] - 2025-10-16

Changes

  • updated pydantic to 2.12.2 and pydantic settings to 2.11

[1.5.1] - 2025-10-07

Added

  • Settings validator to prevent attributes that inherit from BaseSettings

Changes

  • don't require env settings for ldap and wiki tests
  • update mex-model to 4.2.1

[1.5.0] - 2025-09-30

Added

  • bumped pytz to newest version >=2025
  • BREAKING: add RKI organization as affiliation to ldap persons

Fixed

  • fixed temporal entity tests for newest pytz version

[1.4.0] - 2025-09-11

Added

  • added fr, es and ru as selectable languages to Link and Text

Changes

  • make LDAPConnector filter options snake_case and keyword-only

Removed

  • removed unused filter options from LDAPConnector

Fixed

  • fix ldap functional account and user search
  • fix error for getting unknown wiki item

[1.3.0] - 2025-09-02

Added

  • add connector function and transformer to get persons and contact points in one go
  • add deprecated util to simplify renaming of functions
  • add REQUIRED_FIELDS_BY_CLASS_NAME and TEMPORAL_PRECISIONS_BY_FIELD_BY_CLASS_NAMES
  • add models for Status and VersionStatus

Changes

  • change BackendApiConnector.search_person_in_ldap to allow contact points

Deprecated

  • deprecated multiple ldap functions to be replaced by more precise naming

Removed

  • remove redundant function get_ldap_persons to use connector directly instead
  • remove unused functions get_merged_ids_by_email and get_ldap_fields

[1.2.0] - 2025-08-28

Added

  • add helper function to find organigram unit descendants
  • have ndjson sink handle special characters correctly

[1.1.0] - 2025-08-25

Added

  • add ensure_list utility function
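
A utility of this kind typically normalizes "zero, one, or many" values into a list. The sketch below is an assumption about the general idea, not the mex-common implementation; the name `ensure_list_sketch` and the handling of `None` are hypothetical.

```python
from typing import Any


def ensure_list_sketch(value: Any) -> list[Any]:
    """Wrap a single value in a list; pass lists through unchanged."""
    if value is None:
        # Assumption: a missing value becomes an empty list.
        return []
    if isinstance(value, list):
        return value
    return [value]
```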

Changes

  • BREAKING: change the behavior of merge previews to include blocked values

[1.0.0] - 2025-08-21

Added

  • add Validation enum to mex.common.types

Changes

  • BREAKING: change validation argument of create_merged_item helper
  • BREAKING: change model hash and string computation to faster blake2/pickle

Removed

  • BREAKING: remove unused BaseModel.checksum method

[0.65.0] - 2025-07-25

Added

  • add identifier filter to merged and preview fetch endpoints in BackendApiConnector
  • add fetch_all_merged_items and get_extracted_item methods to BackendApiConnector

Changes

  • BREAKING: replace hadPrimarySource with reference field filter in BackendApiConnector
  • BREAKING: BackendApiConnector fetch endpoint methods now expect keyword arguments
  • bump cookiecutter template to e886ec

[0.64.0] - 2025-07-24

Changes

  • use default wiki label should there be neither a german nor english label

[0.63.0] - 2025-07-10

Added

  • BREAKING: add RKI organization as unitOf to organigram units

[0.62.2] - 2025-07-08

Fixed

  • fix ldap connection resetting

[0.62.1] - 2025-07-07

Added

  • moved contains_any_types over from mex-backend

Changes

  • ensure extracted items are merged in predictable way
  • improve a batch of doc-strings with args, raises and return sections

Fixed

  • fix ldap error if ldap connector is >1h old

[0.62.0] - 2025-06-17

Added

  • running github release action publishes to pypi

Changes

  • use mex-model from pypi instead of github

[0.61.2] - 2025-06-13

Changes

  • bump mex-model dependency
  • get vocabulary model examples dynamically from models

[0.61.1] - 2025-05-19

Changes

  • improve logging for backend API sink connector

Fixed

  • fix response type of BackendApiConnector.preview_merged_item to be PreviewModel

[0.61.0] - 2025-05-16

Added

  • add proportional backoff to too_many_requests responses and configure minimum time

Changes

  • move hash function from ExtractedData to BaseModel
  • added fullname attribute to transforming orcid person to mex person

Removed

  • removed BaseEntity class to reduce inheritance hierarchy
  • remove http retry on forbidden responses

[0.60.0] - 2025-05-13

Added

  • added MEX_BACKEND_API_PARALLELIZATION and MEX_BACKEND_API_CHUNK_SIZE settings
  • added support for sending batches of data to the backend in parallel

Changes

  • bump cookiecutter template to ed5deb

[0.59.2] - 2025-05-12

Changes

  • update pyarrow and click dependencies
  • update mex-model to 3.6.1

[0.59.1] - 2025-04-29

Changes

  • update mex-model to 3.5.7

Fixed

  • fix ldap connector method signatures

[0.59.0] - 2025-04-29

Added

  • added BackendApiConnector methods to create and update rule-sets
  • added BackendApiConnector methods to search in wiki, ldap and orcid
  • added BackendApiConnector methods to fetch and assign identities
  • added limit parameter to ldap connector and helper functions
  • added pre-configured type adapters to all models with an entityType
  • added MEX_LDAP_SEARCH_BASE setting to configure the search domain
  • added metrics method to connectors to collect cache hits and misses
  • added get_extracted_organizational_unit_with_parents organigram helper
  • added transform_organigram_unit_to_extracted_organizational_unit transformer

Changes

  • BREAKING: move MergedModel and RuleSetResponse type adapters to models module
  • de-coupled BackendApiIdentityProvider from BackendApiConnector
  • moved memory and backend identity provider registration to package init file
  • BREAKING: convert ldap connector and related functions from generator to list returns
  • reconfigure all cached functions to have a maxsize setting
  • moved get_alias_lookup, get_list_field_names and get_field_names_allowing_none from BaseModel class to utils module
  • BREAKING: convert organigram functions from generator to list returns
  • BREAKING: convert primary-source functions from generator to list return
  • BREAKING: moved split_to_caps from types to transform module
  • BREAKING: moved normalize from utils to transform module
  • BREAKING: renamed get_persons_by_name to get_ldap_persons

Removed

  • BREAKING: remove return value from BackendApiConnector.ingest
  • BREAKING: remove unused LDAPConnector.get_unit and get_units methods
  • BREAKING: remove get_count_of_found_persons_by_name to avoid duplicate queries
  • BREAKING: removed member_of validation for ldap persons
  • BREAKING: removed get_all_extracted_primary_sources helper

[0.58.3] - 2025-04-24

Added

  • Also allow raw request of HTTPConnector

[0.58.2] - 2025-04-22

Changes

  • increase read timeout limit for BackendApiSink

[0.58.1] - 2025-04-17

Changes

  • log model info on sink errors

[0.58.0] - 2025-04-11

Changes

  • move ingest timeout configuration from BackendAPIConnector to BackendApiSink
  • wrap read time out errors of http connector in a custom timed error
  • add proportional backoff to http connector: the longer it took, the longer we chill
  • use watch decorator on sinks to only log once every 1000 write ops
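
The proportional backoff idea ("the longer it took, the longer we chill") could look roughly like the sketch below. All names, the `factor` parameter, and the choice to pause after every call are assumptions for illustration, not the HTTP connector's actual code.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


def backoff_delay(elapsed: float, factor: float = 1.0) -> float:
    """Compute a pause that grows linearly with the time a request took."""
    return max(0.0, elapsed * factor)


def call_with_proportional_backoff(
    func: Callable[[], T],
    factor: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run func, then pause in proportion to how long it ran."""
    started = time.monotonic()
    try:
        return func()
    finally:
        # Slow responses hint at a busy server, so we wait longer after them.
        sleep(backoff_delay(time.monotonic() - started, factor))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.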

[0.57.0] - 2025-04-09

Added

  • add pagination to orcid connector search method
  • add caching to orcid single item lookup
  • add support for multiple emails to orcid transform

Changes

  • BREAKING: dissolve aux-extractor model folders into single files
  • BREAKING: clean-up orcid connector method and extract function names
  • BREAKING: require orcid primary source as parameter to transform function
  • make orcid family and given names optional to validate all data
  • rename private _get_organization_details to public get_wikidata_organization

Removed

  • drop stale DataType type
  • remove unused LDAPConnector.PAGE_SIZE
  • BREAKING: removed unused MEX_WIKI_QUERY_SERVICE_URL setting
  • BREAKING: removed unused WikidataQueryServiceConnector class
  • BREAKING: removed unused search_organization_by_label extract function
  • BREAKING: removed unused get_count_of_found_organizations_by_label extract function
  • BREAKING: removed unused search_organizations_by_label extract function
  • BREAKING: removed unused get_extracted_organization_from_wikidata helper function

[0.56.1] - 2025-03-31

Added

  • ALL_TYPES_BY_FIELDS_BY_CLASS_NAMES and VOCABULARIES_BY_FIELDS_BY_CLASS_NAMES lookups

Changes

  • update ruff and apply TC006 fixes

[0.56.0] - 2025-03-27

Added

  • extract method for orcid can now obtain multiple results (orcid records)

[0.55.0] - 2025-03-21

Changes

  • BREAKING: wrap function around watch decorator accepting log_interval parameter
  • increase parse_csv default chunksize to 10000 and log chunks instead of rows

[0.54.4] - 2025-03-20

Changes

  • reduce chunk size for backend api sink to avoid timeouts

[0.54.3] - 2025-03-18

Fixed

  • stop stringifying backend identity provider url parameters

[0.54.2] - 2025-03-05

Fixed

  • remove unsafe pytest import from testing plugin

[0.54.1] - 2025-02-28

Changes

  • BREAKING: mex-model to 3.5.6: make doi pattern more lenient

[0.54.0] - 2025-02-27

Changes

  • BREAKING: mex-model to 3.5.5: update patterns and examples for URL fields

[0.53.0] - 2025-02-26

Added

  • BREAKING: filter for had_primary_source to backend api connector

[0.52.2] - 2025-02-24

Fixed

  • temporarily pin pydantic settings version to bypass error in settings update.

[0.52.1] - 2025-02-24

[0.52.0] - 2025-02-19

Added

  • Connector class for retrieving ORCID data by ID or name
  • methods for extracting data from orcid
  • methods to transform from OrcidPerson to mex person
  • model class for orcid data
  • unit tests for orcid connector

[0.51.1] - 2025-02-13

Fixed

  • expand the pattern for DOI Urls as these can also contain lowercase letters

[0.51.0] - 2025-02-11

Added

  • add entry for s3 to sink settings enum

Changes

  • add AnyMergedModel to the allowed types for Sink.load methods
  • but let BackendApiSink throw an error, when merged items are loaded
  • make local typevars private and give them speaking names

[0.50.0] - 2025-02-06

Changes

  • BREAKING: move ItemsContainer and PaginatedItemsContainer to mex.common.models
  • BREAKING: replace post_extracted_items with ingest and allow AnyRuleSetResponses
  • allow AnyRuleSetResponses as arguments to sinks
  • BREAKING: sinks now yield the models they loaded, instead of just their identifiers

[0.49.3] - 2025-01-29

Changes

  • update mex-model to 3.5.1

Fixed

  • fix regex pattern for GndIdStr in organization models

[0.49.2] - 2025-01-29

Fixed

  • do not wrap field types in setValues in mapping rules in another list

[0.49.1] - 2025-01-29

Fixed

  • reduce Filter classes to a single list field of FilterField items

[0.49.0] - 2025-01-29

Added

  • new (partially generic) classes for defining Mapping and Filter fields and rules

Changes

  • BREAKING: replaced dynamic Mapping and Filter classes with static ones

Deprecated

  • use FILTER_MODEL_CLASSES_BY_NAME instead of FILTER_MODEL_BY_EXTRACTED_CLASS_NAME
  • use MAPPING_MODEL_CLASSES_BY_NAME instead of MAPPING_MODEL_BY_EXTRACTED_CLASS_NAME

[0.48.0] - 2025-01-28

Added

  • add a sink registry with register_sink and get_sink functions
  • add a multi-sink implementation, akin to mex.extractors.load

Changes

  • BREAKING: convert post_to_backend_api to BackendApiSink
  • BREAKING: convert write_ndjson to NdjsonSink
  • backend and ndjson sinks log progress only in batches
  • increase timeout and decrease chunk size for backend API sink
  • port backend identity provider implementation from editor/extractors to common
  • allow backend and graph as identity provider setting to simplify setting subclasses, even though graph is not implemented in mex-common
  • BREAKING: make backend api connector response models generic, to keep DRY

[0.47.1] - 2025-01-24

Fixed

  • skip None values when merging extracted and rule items

[0.47.0] - 2025-01-23

Added

  • merging logic to mex-common

[0.46.0] - 2025-01-09

Added

  • BREAKING: add nested models (Text, Link) to all lookups in mex.common.fields

Changes

  • BREAKING: move GenericFieldInfo from models.base.field_info to utils
  • BREAKING: move get_all_fields from BaseModel to utils to support all base models

[0.45.0] - 2024-12-18

Changes

  • BREAKING: change type of distribution.title to an array of texts

[0.44.0] - 2024-12-12

Changes

  • updated ldap search from name and familyname to one single attribute "displayname"

[0.43.0] - 2024-12-10

Added

  • add preview models for merged items without cardinality validation
  • BREAKING: preview models are now part of all mex.common.fields lookups
  • add BackendApiConnector.fetch_preview_items for fetching previews

Deprecated

  • stop using ExtractedData, use AnyExtractedModel instead
  • stop using MergedItem, use AnyMergedModel instead
  • stop using AdditiveRule, use AnyAdditiveRule instead
  • stop using SubtractiveRule, use AnySubtractiveRule instead
  • stop using PreventiveRule, use AnyPreventiveRule instead
  • stop using BaseEntity, use a concrete union instead

Removed

  • removed deprecated BulkInsertResponse as alias for IdentifiersResponse
  • removed unused module export of mex.common.models.generate_entity_filter_schema
  • removed unused module export of mex.common.models.generate_mapping_schema
  • drop export models.ExtractedPrimarySourceIdentifier, import from types instead
  • drop export models.MergedPrimarySourceIdentifier, import from types instead

[0.42.0] - 2024-12-02

Added

  • add vocabulary and temporal unions and lookups to mex.common.types
  • add mex.common.fields with field type by class name lookups

Changes

  • wikidata helper now optionally accepts wikidata primary source
  • set default empty rules to all of the rule-set models
  • pin pydantic to sub 2.10 (for now) because of breaking changes

Fixed

  • switch HTTP method for preview endpoint to POST
  • add optional values to variadic values for distribution models
  • make endpointDescription optional for variadic access platform models

[0.41.0] - 2024-11-18

Added

  • organigram extraction checks for duplicate emails/labels in different organigram units

Changes

  • upgrade mex-model dependency to version 3.2

[0.40.0] - 2024-10-28

Changes

  • upgrade mex-model dependency to version 3.1

Fixed

  • fix typo in repositoryURL of bibliographic resources
  • make identifier and stableTargetId of ExtractedBibliographicResource computed fields

[0.39.0] - 2024-10-28

Added

  • added new consent and bibliography reference models and vocabs
  • added doi field to resource models
  • helper function for primary source look up

Changes

  • upgrade mex-model dependency to version 3
  • make ruff linter config opt-out, instead of opt-in
  • make instances of extracted data hashable
  • BREAKING: Wikidata convenience function refactored and renamed to 'helper'
  • wikidata helper function split between mex-common and mex-extractors
  • code de-duplication: fixture extracted_primary_sources uses function-part of helper
  • split up YearMonth and Year temporal types and improved patterns
  • applied all changes to model fields according to model v3
  • update LOINC pattern

Fixed

  • fix temporal entity schemas

[0.38.0] - 2024-10-11

Added

  • add pattern constants for vocabs, emails, urls and ids to types module
  • add regex pattern to json schema of identifier fields
  • automatically add examples and useScheme to json schema of enum fields

Changes

  • BREAKING: use identifier instead of stableTargetId to get merged item from backend
  • ensure identifier unions are typed to generic Identifier instead of the first match to signal that we don't actually know which of the union types is correct
  • unify pydantic schema configuration for all types
  • consistently parse emails, identifiers and temporals in models to their type, not str
  • consistently serialize emails, ids and temporals in models to str, not their type
  • make instances of Link type hashable, to harmonize them with Text models

Removed

  • drop manual examples from enum fields, because they are autogenerated now
  • BREAKING: remove MEX_ID_PATTERN from types, in favor of IDENTIFIER_PATTERN
  • BREAKING: make public MEX_ID_ALPHABET constant from identifier module private
  • BREAKING: remove __str__ methods from Text and Link classes
  • BREAKING: drop support for parsing UUIDs as Identifiers, this was unused
  • BREAKING: drop support for parsing Links from markdown syntax, this was unused
  • BREAKING: remove pydantic1-style validate methods from all type models
  • BREAKING: BackendApiConnector.post_models in favor of post_extracted_items

[0.37.0] - 2024-10-01

Added

  • added methods for extracting persons by name or ID from ldap
  • contains_only_types to check if fields are annotated as desired
  • group_fields_by_class_name utility to simplify filtered model/field lookups
  • new parameters to get_inner_types to customize what to unpack

[0.36.1] - 2024-09-16

Fixed

  • pin pytz to 2024.1, as stopgap for MX-1703

[0.36.0] - 2024-09-09

Added

  • added BackendApiConnector methods to cover all current (and near future) endpoints: fetch_extracted_items, fetch_merged_items, get_merged_item, preview_merged_item and get_rule_set
  • complete the list of exported names in models and types modules

Deprecated

  • deprecated BackendApiConnector.post_models in favor of post_extracted_items

Removed

  • containerize section from release pipeline

Fixed

  • added the rki/mex user-agent to all requests of the HTTPConnector

[0.35.0] - 2024-08-20

Changes

  • update cruft and loosen up pyproject dependencies
  • harmonize signatures/docs of pydantic core/json schema manipulating methods

Fixed

  • fix schema tests not starting with diverging model names in common and mex-model
  • fix serialization for temporal entity instances within pydantic models

[0.34.0] - 2024-08-12

Added

  • wikidata fixtures to pytest plugin: wikidata_organization_raw, wikidata_organization, mocked_wikidata
  • convenience function get_merged_organization_id_by_query_with_extract_transform_and_load for getting the stableTargetId of an organization, while transforming and loading the organization using the provided load function
  • models for rule-set requests and responses along with typing and lookups
  • add BaseT models to the exported names of mex.common.models
  • add MEX_ID_PATTERN to the exported names of mex.common.types

Changes

  • move all base models and pydantic scaffolding into mex.common.models.base for a cleaner structure within the growing models module

[0.33.0] - 2024-07-31

Added

  • HTTP connector backoff for 10 retries on 403 from server
  • rki/mex user agent is sent with query requests via wikidata connector

Changes

  • changed backend api connector payload to "items"

  • update the wikidata organization search query: an optional language parameter can refine the search, with EN as the default language

[0.32.0] - 2024-07-23

Changes

  • move log timestamp and coloring into the formatter

Deprecated

  • mex.common.logging.echo is deprecated in favor of logging.info

Fixed

  • add missing listyness-fix support for computed-fields

[0.31.0] - 2024-07-17

Removed

  • BREAKING: ability to store different settings instances at the same time. Dependent repositories now must bundle all settings in a single class.

[0.30.0] - 2024-07-16

Added

  • get count of found wikidata organizations

[0.29.1] - 2024-07-15

[0.29.0] - 2024-07-12

Added

  • add validator to base model that verifies computed fields can be set but not altered
  • new class hierarchy for identifiers: ExtractedIdentifier and MergedIdentifier

Changes

  • improve typing for methods using Self
  • make local type variables private
  • use json instead of pickle to calculate checksum of models
  • replace set_identifiers validator with computed fields on each extracted model

Removed

  • removed custom stringify method on base entities that included the identifier field

Fixed

  • fix typing for __eq__ arguments

[0.28.0] - 2024-07-08

Added

  • extract multiple organizations from wikidata

[0.27.1] - 2024-06-14

Changes

  • update mex-model to version 2.5

[0.27.0] - 2024-06-10

Added

  • add static class attribute stemType to models, containing an unprefixed entityType
  • add AnyRuleModel, RULE_MODEL_CLASSES, RULE_MODEL_CLASSES_BY_NAME to models
  • add type aliases AnyPrimitiveType and LiteralStringType to types
  • add new utility function ensure_postfix for adding postfixes to strings
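
A postfix helper of this kind usually appends the postfix only when it is missing. The sketch below illustrates that assumed behavior; `ensure_postfix_sketch` is a hypothetical name and its signature is not taken from mex-common.

```python
def ensure_postfix_sketch(string: str, postfix: str) -> str:
    """Return string unchanged if it already ends with postfix, else append it."""
    if string.endswith(postfix):
        return string
    return f"{string}{postfix}"
```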

Changes

  • clean-up and unify mapping and filter class generation

[0.26.1] - 2024-05-29

Fixed

  • fix memory identity provider seeding

[0.26.0] - 2024-05-29

Added

  • add classes for Additive, Preventive and Subtractive rules for all entity types
  • add types, lists and lookups for all three rule types to mex.common.models

Changes

  • move aux-extractor documentation from readme to __init__ to have it in sphinx
  • move BaseModel specific descriptions from class to model to avoid duplication
  • BREAKING: move FILTER_MODEL_BY_EXTRACTED_CLASS_NAME to mex.common.models
  • BREAKING: move MAPPING_MODEL_BY_EXTRACTED_CLASS_NAME to mex.common.models
  • BREAKING: change MEX_PRIMARY_SOURCE_IDENTIFIER to end in 1, so that it differs from MEX_PRIMARY_SOURCE_STABLE_TARGET_ID

[0.25.1] - 2024-05-21

Fixed

  • isolate settings context before first test

[0.25.0] - 2024-05-14

Added

  • add precision keyword to TemporalEntity constructor
  • add transform function for single wikidata organization to extracted organization

Changes

  • add tests for ldap.extract

Fixed

  • fix ldap.extract.get_merged_ids_by_email

[0.24.0] - 2024-04-12

Added

  • synchronize changes to fields in BaseSettings to all active settings subclasses
  • added github action for renovatebot

Changes

  • make memory identity provider deterministic (the same input args result in the same stableTargetId and Identifier)
  • rework ContextStore into SingletonStore with more intuitive API
  • phase out ambiguous "context" naming in favor of more descriptive "singleton store"
  • rename SettingsContext to SETTINGS_STORE and allow multiple active subclasses
  • rename ConnectorContext to CONNECTOR_STORE removing its context manager functions
  • replace reset_connector_context() with more consistent CONNECTOR_STORE.reset()

Removed

  • removed types IdentifierT, SettingsType, ConnectorType in favor of typing.Self
  • remove github dependabot configuration

[0.23.0] - 2024-04-08

Changes

  • return only one org from wikidata; return None if multiple or no orgs are found
  • filter quotation marks (") from requested wikidata label
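
The two rules in this release can be sketched as follows. Both helper names are hypothetical and the code is an illustration of the described behavior, not the wikidata extractor itself.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def clean_label(label: str) -> str:
    """Drop double quotation marks from a label before it is sent in a query."""
    return label.replace('"', "")


def single_or_none(results: list[T]) -> Optional[T]:
    """Return the only result, or None when there are zero or several."""
    return results[0] if len(results) == 1 else None
```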

[0.22.0] - 2024-03-19

Added

  • port get_inner_types from mex-backend to mex.common.utils

Changes

  • rename Timestamp class to TemporalEntity
  • added subclasses with specific resolution YearMonth, YearMonthDay, YearMonthDayTime
  • modernize typing with syntactic sugar
  • simplify BaseModel._get_list_field_names using get_inner_types
  • switch from poetry to pdm
  • use vocabulary JSON files from mex-model

Removed

  • remove vocabulary JSON files

Fixed

  • date and time validation working and harmonized with mex-model

[0.21.0] - 2024-03-04

Added

  • add entityType type hint to MExModel (now BaseEntity)
  • add types for AnyBaseModel, AnyExtractedModel and AnyMergedModel
  • create more specific subclasses of Identifier (for extracted and merged)
  • expose unions, lists and lookups for Identifier subclasses in mex.common.types

Changes

  • swap contextvars.ContextVar for mex.common.context.ContextStore
  • move stableTargetId property from base models to extracted models
  • update typing of identifiers to specific subclasses
  • use Annotated[..., Field(...)] notation for pydantic field configs
  • split up mex.common.models.base and move out MExModel and JsonSchemaGenerator
  • rename MExModel to BaseEntity with only type hints and model config
  • declare hadPrimarySource, identifier and identifierInPrimarySource as frozen

Removed

  • absorb unused BaseExtractedData into ExtractedData
  • remove stableTargetId property from merged models
  • drop support for sinks to accept merged items (now only for extracted data)

[0.20.0] - 2024-02-22

Changes

  • update cruft and dev dependencies
  • randomize test order by default

Removed

  • remove mex.common.public_api module and the correlating sinks
  • remove PathWrapper.resolve and PathWrapper.raw methods

Fixed

  • remove pytest.mark from fixture in mex.common.testing.plugin

[0.19.4] - 2024-02-15

Changes

  • update cruft and minor dependencies

Removed

  • date-time format validation for mapping model generation

[0.19.3] - 2024-02-06

Changes

  • update cruft to apply new workflow trigger config
  • update poetry and pre-commit dependencies

Fixed

  • fix mex mapping model name

[0.19.2] - 2024-02-02

Added

  • pytest plugins for random order and parallelized test execution
  • move dynamic mapping model generation from mex-assets

Changes

  • mex.bat test uses random order and xdist plugins by default

[0.19.1] - 2024-01-19

Added

  • cruft template link
  • workflow that syncs main branch to openCoDE
  • constant for MEX_PRIMARY_SOURCE_IDENTIFIER

Changes

  • harmonized boilerplate

Fixed

  • ExtractedData raises proper ValidationError when parsing wrong base type

[0.19.0] - 2024-01-12

Added

  • add entityType field in all extracted and merged models

Fixed

  • wikidata test

[0.18.2] - 2024-01-11

Added

  • CHANGELOG.md documenting notable changes to this project
  • a template for pull requests
  • language french in language vocabulary

[0.18.1] - 2024-01-03

Added

  • tests for mex.common.types.PathWrapper
  • method is_relative to mex.common.types.PathWrapper to check whether the path is relative

Changes

  • resolve base paths of work/assets path fields in settings

Fixed

  • nesting of mex.common.types.PathWrapper on instantiation

[0.18.0] - 2023-12-20

Changes

  • move Sink and IdentityProvider to mex.common.types

Deprecated

  • deprecate MExModel.get_entity_type, use cls.__name__ instead
  • deprecate mex.common.models.MODEL_CLASSES[_BY_ENTITY_TYPE], use the more precise lists or dicts like EXTRACTED_MODEL_CLASSES_BY_NAME instead

[0.17.1] - 2023-12-20

Added

  • use dmypy for pre-commit type checking

Fixed

  • fix previously undetected typing issue

Changes

  • configure CI linting to install poetry
  • update versions