Add include_spans option to find_iocs (#291) #367

Open
fhightower wants to merge 1 commit into main from fhightower-ioc-finder-291-feature-request-give-option-to-return-found-start-and-end-index-of-ioc-claude

@fhightower

Owner

Summary

  • New include_spans=True flag on find_iocs returns dict[ioc_type, dict[value, list[(start, end)]]], where each (start, end) pair is an offset into the original input text as supplied (pre-fanging).
  • Spans for IOCs that are fanged during parsing are recovered via a SequenceMatcher-based offset map, so example[.]com surfaces as example.com with a span covering the original bracketed substring.
  • Scope is limited to DEFAULT_IOC_TYPES for this PR; passing any other type with include_spans=True raises ValueError.
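
To illustrate the offset-map idea from the summary, here is a minimal standalone sketch. It is not the code in this PR: the helper names `build_offset_map` and `map_span` are hypothetical, and the fanging step is simulated with plain `str.replace` calls. It only assumes that the cleaned text is aligned back to the original with `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def build_offset_map(original: str, cleaned: str) -> list[tuple[int, int]]:
    """For each character index in `cleaned`, record the (start, end)
    range it corresponds to in `original`. Hypothetical helper."""
    matcher = SequenceMatcher(None, original, cleaned, autojunk=False)
    char_map = [(0, 0)] * len(cleaned)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        for j in range(j1, j2):
            if tag == "equal":
                # Unchanged characters map one-to-one.
                i = i1 + (j - j1)
                char_map[j] = (i, i + 1)
            else:
                # Replaced or inserted characters map to the whole
                # original range that the opcode covers.
                char_map[j] = (i1, i2)
    return char_map


def map_span(char_map: list[tuple[int, int]], start: int, end: int) -> tuple[int, int]:
    """Translate a non-empty [start, end) span in the cleaned text
    into a span in the original text."""
    return char_map[start][0], char_map[end - 1][1]


original = "visit hxxp://example[.]com now"
# Stand-in for the library's fanging step (assumption for this sketch).
cleaned = original.replace("hxxp", "http").replace("[.]", ".")

char_map = build_offset_map(original, cleaned)
start = cleaned.index("example.com")
span = map_span(char_map, start, start + len("example.com"))
print(span, original[span[0]:span[1]])  # (13, 26) example[.]com
```

Characters deleted by fanging (the brackets here) have no counterpart in the cleaned text, so a span that starts before them and ends after them widens to cover them in the original.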

Notes

Test plan

  • make lint (ruff + mypy) clean
  • Full pytest suite: 242 passed, 99.73% coverage (above 99% threshold)
  • New tests/test_find_iocs_spans.py covers each default IOC type, fanged input ([.], hxxp), multi-occurrence dedup-with-spans, URL trim via _clean_url, every non-default boolean branch, and the validation error path

🤖 Generated with Claude Code

Returns dict[ioc_type, dict[value, list[(start, end)]]] with offsets
into the original (pre-fanging) input. Spans are recovered through a
SequenceMatcher-based offset map, so e.g. example[.]com surfaces as
example.com pointing at the original bracketed substring.

Limited to DEFAULT_IOC_TYPES in this PR; non-default types passed
alongside include_spans=True raise ValueError. Percent-decoding inside
URL paths is skipped in span mode (would shift downstream offsets).
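
The offset-shift concern behind skipping percent-decoding can be seen in a small standalone sketch. This is an illustration only, using the standard library's `urllib.parse.unquote` as a stand-in for whatever decoding the library performs; the sample text is made up.

```python
from urllib.parse import unquote

text = "see http://example.com/a%20b/c.php and 1.2.3.4"
decoded = unquote(text)

ip = "1.2.3.4"
# Each %XX escape collapses three characters into one, so everything
# after the escape moves two positions to the left in the decoded text.
shift = text.index(ip) - decoded.index(ip)
print(shift)  # 2

# A span computed against the decoded text no longer points at the IP
# when applied to the original input.
start = decoded.index(ip)
print(repr(text[start:start + len(ip)]))
```

Skipping the decode keeps every downstream match aligned with the input, at the cost of returning URL paths in their percent-encoded form.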

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Feature Request: Give option to return found start and end index of IoC

1 participant