Hi @lea-33,
I was wondering how we could speed up the DALIA export. Am I right in assuming that the slow part of the conversion is the language detection below?
```python
from transformers import pipeline

model_ckpt = "papluca/xlm-roberta-base-language-detection"
pipe = pipeline("text-classification", model=model_ckpt)
...
```
Do you think it would be possible to move this part to a separate notebook that we run manually from time to time? It could add the language to entries in our yml file where the language is not yet defined. That way the language is cached in the file, and we do not have to run the detection again on every export.
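To make the idea a bit more concrete, here is a rough sketch of what the caching step in such a notebook might look like. It is only an assumption about how it could work, not how the export is currently implemented: the function name `fill_missing_languages` and the `description` field are hypothetical, and the detector is passed in as a plain function so the logic does not depend on the model being loaded.

```python
from typing import Callable, Dict, List


def fill_missing_languages(
    entries: List[Dict],
    detect: Callable[[str], str],
    text_key: str = "description",  # hypothetical field holding the text to classify
    lang_key: str = "language",
) -> int:
    """Add a language to entries that do not have one yet.

    Entries that already carry a language are skipped, so the expensive
    detector only runs for new or incomplete entries.
    Returns the number of entries that were updated.
    """
    updated = 0
    for entry in entries:
        if entry.get(lang_key):
            continue  # language already cached in the file, nothing to do
        text = entry.get(text_key)
        if not text:
            continue  # no text to classify, leave the entry untouched
        entry[lang_key] = detect(text)
        updated += 1
    return updated
```

In the notebook, `detect` could wrap the existing pipeline, for example `lambda text: pipe(text)[0]["label"]`, and the updated entries would then be written back to the yml file. The export itself would only read the stored `language` value.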
Let me know what you think!