filter_pos bug #13

@aflyax

Describe the bug
Running ops.text.clean.filter_pos("NOUN", keep_matching_tokens=True) fails with an AttributeError: module 'jange.ops.text.clean' has no attribute 'filter_pos'.
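
One way to check which names a module really exposes when an attribute lookup like this fails is to list its public attributes. The sketch below is only an illustration: `list_public_names` is a hypothetical helper, not part of jange, and the `demo` module is a stand-in so the snippet runs without jange installed.

```python
import types


def list_public_names(module, contains=""):
    """Return the sorted public attribute names of `module` that contain `contains`."""
    return sorted(
        name
        for name in dir(module)
        if not name.startswith("_") and contains in name
    )


# Stand-in module for illustration only. With jange installed, one would
# presumably pass jange.ops.text.clean here instead (assumption).
demo = types.ModuleType("demo_clean")
demo.pos_filter = lambda *args, **kwargs: None
demo.lowercase = lambda *args, **kwargs: None

print(list_public_names(demo, contains="pos"))
```

Filtering on a substring such as "pos" makes a renamed function (here pos_filter rather than filter_pos) easy to spot.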

After changing the call to ops.text.clean.pos_filter("NOUN", keep_matching_tokens=True), it fails instead with: OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
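
The E050 error suggests that the spaCy model en_core_web_sm is not installed in the current environment. A minimal sketch for checking this before running the pipeline is shown below. It assumes the model would be installed as a regular Python package (so it does not cover shortcut links or data directory paths), and `spacy_model_install_command` is a hypothetical helper name, not part of spaCy or jange.

```python
import importlib.util
import sys


def spacy_model_install_command(model_name="en_core_web_sm"):
    """Return the spaCy download command if `model_name` is not importable, else None."""
    if importlib.util.find_spec(model_name) is not None:
        return None
    # `python -m spacy download <model>` is spaCy's documented CLI for fetching models.
    return [sys.executable, "-m", "spacy", "download", model_name]


cmd = spacy_model_install_command()
if cmd is None:
    print("en_core_web_sm is importable as a package")
else:
    print("Model package not found; try running:", " ".join(cmd))
```

The function only builds the command rather than running it, so the download stays an explicit step.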

To Reproduce

From examples:

clusters_ds = ds.apply(
    ops.text.clean.pos_filter("NOUN", keep_matching_tokens=True),
    ops.text.encode.tfidf(max_features=5000, name="tfidf"),
    ops.cluster.minibatch_kmeans(n_clusters=5),
    result_collector=result_collector,
)
