[BUG] filter_words emits RuntimeWarning instead of raising ValueError on empty input #629

@Ishiezz

Describe the bug

filter_words in pyaptamer/utils/_base.py emits a RuntimeWarning and returns an
empty dict when called with an empty dictionary, instead of raising a clear ValueError.

This is dangerous because filter_words is called directly in production code
at pyaptamer/aptatrans/_pipeline.py line 143:

prot_words = filter_words(prot_words)

If a user passes an empty prot_words dict to the AptaTrans pipeline, execution
continues with an empty vocabulary. Nothing fails at the point of bad input,
so the problem only surfaces later as a hard-to-trace failure downstream.

To Reproduce

from pyaptamer.utils._base import filter_words

filter_words({})
# RuntimeWarning: Mean of empty slice
# RuntimeWarning: invalid value encountered in scalar divide
# silently returns {} — no error raised

Expected behavior

An empty words dict should raise a ValueError with a clear message,
consistent with how other functions in the codebase handle invalid inputs
(e.g., rna2vec, MCTS, PSeAAC all raise ValueError for bad inputs).

filter_words({})
# ValueError: `words` must not be empty.
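
A minimal sketch of the proposed guard, for discussion. The function body after the check is an assumption (a stand-in that keeps words above the mean frequency and assigns integer IDs), not a copy of the current code in pyaptamer/utils/_base.py. Only the empty check and the Raises section are the actual proposal.

```python
def filter_words(words: dict[str, float]) -> dict[str, int]:
    """Keep words with above-average frequency and map them to integer IDs.

    Raises
    ------
    ValueError
        If `words` is empty.
    """
    # Proposed guard: fail at the point of bad input instead of letting
    # a mean over an empty collection produce a warning and an empty result.
    if not words:
        raise ValueError("`words` must not be empty.")

    # Stand-in body (assumed for illustration; the real implementation
    # may differ in its threshold and ID scheme).
    mean_freq = sum(words.values()) / len(words)
    kept = [word for word, freq in words.items() if freq > mean_freq]
    return {word: i + 1 for i, word in enumerate(kept)}
```

With this check in place, the call at pyaptamer/aptatrans/_pipeline.py line 143 would fail immediately on an empty prot_words instead of continuing with an empty vocabulary.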

Additional context

  • Affected file: pyaptamer/utils/_base.py
  • Also called in: pyaptamer/aptatrans/_pipeline.py line 143
  • Fix is straightforward: add an empty check + Raises section to docstring
  • A test test_filter_words_empty_dict should be added to
    pyaptamer/utils/tests/test_base.py
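
A possible shape for the test, sketched under assumptions. The local filter_words below is a hypothetical stand-in containing only the proposed guard, so the snippet runs on its own. In pyaptamer/utils/tests/test_base.py the function would be imported from pyaptamer.utils._base instead.

```python
def filter_words(words):
    # Hypothetical stand-in with only the proposed guard; the real test
    # would import filter_words from pyaptamer.utils._base.
    if not words:
        raise ValueError("`words` must not be empty.")
    return dict(words)


def test_filter_words_empty_dict():
    """An empty dict should raise ValueError with the proposed message."""
    try:
        filter_words({})
    except ValueError as err:
        # The message should tell the user what was wrong with the input.
        assert "must not be empty" in str(err)
    else:
        raise AssertionError("filter_words({}) did not raise ValueError")
```

In the actual test suite, `pytest.raises(ValueError, match="must not be empty")` would be the more idiomatic way to write the same assertion.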
