feat: integrate relation extraction #4010

Draft
wants to merge 4 commits into
base: master

SoniaBadene
Member

@SoniaBadene SoniaBadene commented May 20, 2025

Proposed changes

  • Updated the webservice call to use the `/extract_entities_relations` endpoint with the `with_relations=true` parameter.
  • Extended the parsing logic to handle `relationships` returned in the ML model response.
  • Tracked a text-to-STIX-ID mapping so that predicted relationships can be linked to the extracted entities.
  • Added support for Identity types (Sector, Organization, Individual) and the custom object Channel.
  • Created STIX relationships dynamically from the predicted data, but only when valid source and target IDs are found.
  • Preserved the original fallback logic for known relationship rules, and simplified its code.
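
To make the linking step concrete, here is a minimal sketch of how a text-to-STIX-ID mapping can drive relationship creation. It is not the code from this PR: the response field names other than `relationships` (`entities`, `text`, `type`, `source`, `target`), the label table, and the helpers `make_id` and `build_relationships` are all assumptions for illustration, and plain dicts stand in for real STIX objects.

```python
import uuid
from typing import Any

# Hypothetical mapping from model entity labels to STIX object types.
# Sector, Organization and Individual all map to the STIX Identity type.
LABEL_TO_STIX_TYPE = {
    "Intrusion-Set": "intrusion-set",
    "Sector": "identity",
    "Organization": "identity",
    "Individual": "identity",
    "Channel": "channel",
}


def make_id(stix_type: str, seed: str) -> str:
    """Build a deterministic identifier of the form '<type>--<uuid>'."""
    return f"{stix_type}--{uuid.uuid5(uuid.NAMESPACE_URL, f'{stix_type}:{seed}')}"


def build_relationships(response: dict[str, Any]) -> list[dict[str, str]]:
    """Create relationship dicts only when both endpoints resolve to an ID."""
    # Remember which STIX ID was produced for each extracted text span.
    text_to_id: dict[str, str] = {}
    for entity in response.get("entities", []):
        stix_type = LABEL_TO_STIX_TYPE.get(entity["type"])
        if stix_type is not None:
            text_to_id[entity["text"]] = make_id(stix_type, entity["text"])

    relationships = []
    for rel in response.get("relationships", []):
        src_id = text_to_id.get(rel["source"])
        dst_id = text_to_id.get(rel["target"])
        if src_id is None or dst_id is None:
            continue  # skip predictions that point at unknown entities
        relationships.append(
            {
                "type": "relationship",
                "id": make_id("relationship", f"{rel['type']}:{src_id}:{dst_id}"),
                "relationship_type": rel["type"],
                "source_ref": src_id,
                "target_ref": dst_id,
            }
        )
    return relationships


# Made-up payload; the real response format is not shown in this PR.
SAMPLE_RESPONSE = {
    "entities": [
        {"text": "APT28", "type": "Intrusion-Set"},
        {"text": "Energy", "type": "Sector"},
    ],
    "relationships": [
        {"source": "APT28", "target": "Energy", "type": "targets"},
        {"source": "APT28", "target": "Unknown Corp", "type": "targets"},
    ],
}

relationships = build_relationships(SAMPLE_RESPONSE)
```

In this sketch the second predicted relationship is dropped because "Unknown Corp" was never extracted as an entity, which is the "valid source/target IDs" condition described above.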

TODO:

  • For performance, consider not creating duplicate Relationship objects for the same (rel_type, src_id, dst_id) triple, so that fewer Python objects are instantiated. If every textual mention is needed later (e.g. for text highlights), keep the mentions by enriching the bundle with Note or Observed-Data objects.
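
One possible shape for this TODO, as a rough sketch only: group predictions by their (rel_type, src_id, dst_id) key and keep the textual mentions alongside, so a single Relationship object can be created per key later. The function name `group_predictions` and the tuple layout of the input are hypothetical.

```python
from collections import defaultdict


def group_predictions(
    predictions: list[tuple[str, str, str, str]],
) -> dict[tuple[str, str, str], list[str]]:
    """Group (rel_type, src_id, dst_id, mention) tuples by relationship key.

    Each key appears once in the result; its value lists every textual
    mention that produced it, in the order the mentions were seen.
    """
    mentions_by_key: dict[tuple[str, str, str], list[str]] = defaultdict(list)
    for rel_type, src_id, dst_id, mention in predictions:
        mentions_by_key[(rel_type, src_id, dst_id)].append(mention)
    return dict(mentions_by_key)


# Made-up identifiers and sentences, for illustration only.
predictions = [
    ("targets", "intrusion-set--a", "identity--b", "APT28 targeted the energy sector."),
    ("targets", "intrusion-set--a", "identity--b", "The group again hit energy firms."),
    ("uses", "intrusion-set--a", "malware--c", "APT28 deployed the implant."),
]

grouped = group_predictions(predictions)
```

Here three predictions collapse to two keys, and the retained mention lists could later feed Note or Observed-Data objects if highlights are needed.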

Related issues

@SoniaBadene SoniaBadene linked an issue May 20, 2025 that may be closed by this pull request
@SoniaBadene SoniaBadene self-assigned this May 20, 2025
@SoniaBadene SoniaBadene added this to the Release 6.7.0 milestone May 20, 2025
@SoniaBadene SoniaBadene marked this pull request as draft May 20, 2025 10:31
@gregoirelafay gregoirelafay force-pushed the feature/3986-integrate-relationship-prediction branch 2 times, most recently from d7fcf8c to dfcb1c1 Compare May 20, 2025 11:52
@gregoirelafay
Member

There is a typing issue caused by self.file being implicitly typed as None in __init__. We should initialize it there with an explicit annotation: self.file: dict[str, Any] | None = None
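
As a small illustration of the suggestion, assuming a hypothetical class `DocumentImporter` and a hypothetical `"name"` key (neither is taken from the PR), the explicit annotation lets a type checker accept later dict assignments and forces callers to handle the None case:

```python
from typing import Any


class DocumentImporter:  # hypothetical class name, not the one in the PR
    def __init__(self) -> None:
        # Without the annotation, a type checker may infer the attribute's
        # type from the None assignment alone and reject a later dict value.
        self.file: dict[str, Any] | None = None

    def file_name(self) -> str:
        # The None check narrows the type to dict[str, Any] below.
        if self.file is None:
            raise ValueError("no file has been loaded yet")
        return str(self.file["name"])


importer = DocumentImporter()
importer.file = {"name": "report.pdf"}  # "name" is an assumed key
loaded_name = importer.file_name()
```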

@SoniaBadene SoniaBadene force-pushed the feature/3986-integrate-relationship-prediction branch 2 times, most recently from 87067ad to c4d2947 Compare May 22, 2025 09:40
@gregoirelafay gregoirelafay self-requested a review May 23, 2025 08:18
@SoniaBadene SoniaBadene force-pushed the feature/3986-integrate-relationship-prediction branch from 0db7708 to d637328 Compare May 23, 2025 14:16
SoniaBadene and others added 4 commits May 26, 2025 17:03
- Updated webservice call to use `/extract_entities_relations` endpoint
with `with_relations=true` param.
- Extended parsing logic to handle `relationships` from ML model
response.
- Tracked text-to-STIX ID mapping for relationship linking.
- Added support for Identity types (Sector, Organization, Individual)
and custom object Channel.
- Created STIX relationships dynamically from predicted data when valid
source/target IDs are found.
- Preserved original fallback logic for known relationship rules.
@SoniaBadene SoniaBadene force-pushed the feature/3986-integrate-relationship-prediction branch from d637328 to 593148c Compare May 26, 2025 15:04
Development

Successfully merging this pull request may close these issues.

[ImportDocumentAI] Add model-based relationship prediction
2 participants