Accessing Document vectors and computing similarity is too slow without batching #18

Hi,

I use this library and other spaCy models to create Doc objects.
I use the pipe() method to process a large corpus of text. The main bottleneck is that accessing the vector of each document is too slow.
Is there a way to get only the vector when applying the model to the text, or to extract the vectors in batches as well?
This came up while computing similarity: I could only call similarity() on one pair of documents at a time, not on batches. Is there a way to compute similarity in batches?
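
To make the question concrete, here is a minimal sketch of what batched similarity could look like, assuming the document vectors have already been stacked into a NumPy array (for example from doc.vector of each Doc returned by pipe()). The helper name batch_cosine_similarity and the toy vectors are made up for illustration and are not part of this library or of spaCy.

```python
import numpy as np

def batch_cosine_similarity(a, b):
    """Cosine similarity between every row of `a` and every row of `b`.

    Hypothetical helper, not a spaCy or library API.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    a_norm = np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = np.linalg.norm(b, axis=1, keepdims=True)
    # Avoid division by zero for all-zero vectors (e.g. empty documents).
    a_unit = a / np.where(a_norm == 0, 1.0, a_norm)
    b_unit = b / np.where(b_norm == 0, 1.0, b_norm)
    # A single matrix product yields all pairwise similarities at once.
    return a_unit @ b_unit.T

# With real documents the matrix might be built like this (assumption):
#   vectors = np.stack([doc.vector for doc in nlp.pipe(texts)])
# Toy vectors are used here so the sketch runs without a model.
vectors = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
sims = batch_cosine_similarity(vectors, vectors)
```

Here sims[i, j] is the cosine similarity between document i and document j, so the whole corpus is compared in one call instead of one similarity() call per pair.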

I'm using a 4-core CPU.

I hope my question is clear.

Thanks.
