Description
Context
- PyTorch version: 1.13.0
- Operating System and version: Ubuntu 20.04.6 LTS
Expected Behavior
The README in the examples/language_translation directory states that any language can be used for translation and specifically mentions English-to-French translation (python3 main.py --src en --tgt fr).
Current Behavior
However, running the training for English-to-French translation raises an AssertionError indicating that the language pair must be either ('de', 'en') or ('en', 'de').
Steps to Reproduce
- Run the command:
python3 -m spacy download fr
- Run the training command:
python3 main.py --src en --tgt fr
- Observe the AssertionError indicating the supported language pairs are only English and German.
...
Failure Logs [if any]
$ python3 main.py --src en --tgt fr
python3 main.py --src en --tgt fr
Translation task: en -> fr
Using device: cpu
Traceback (most recent call last):
  File "main.py", line 306, in <module>
    main(args)
  File "main.py", line 201, in main
    train_dl, valid_dl, src_vocab, tgt_vocab, _, _, special_symbols = get_data(opts)
  File "....../language_translation/src/data.py", line 36, in get_data
    train_iterator = Multi30k(split="train", language_pair=(src_lang, tgt_lang))
  File "....../anaconda3/envs/torch_v1_13_0/lib/python3.8/site-packages/torchtext/data/datasets_utils.py", line 193, in wrapper
    return fn(root=new_root, *args, **kwargs)
  File "....../anaconda3/envs/torch_v1_13_0/lib/python3.8/site-packages/torchtext/data/datasets_utils.py", line 155, in new_fn
    result.append(fn(root, item, **kwargs))
  File "....../anaconda3/envs/torch_v1_13_0/lib/python3.8/site-packages/torchtext/datasets/multi30k.py", line 83, in Multi30k
    assert tuple(sorted(language_pair)) == (
AssertionError: language_pair must be either ('de','en') or ('en', 'de')
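For reference, here is a minimal sketch of the check that appears to be failing, written without torchtext so it can be run on its own. The assert line in the traceback is truncated, so the comparison against ("de", "en") is inferred from the error message rather than copied from the torchtext source. The names SUPPORTED_PAIR, is_supported_pair, and check_language_pair are hypothetical and exist in neither torchtext nor the example; they only illustrate how main.py might reject an unsupported pair with a clearer message before building the dataset.

```python
# Assumption: the supported pair is inferred from the error message in the
# traceback above; this is not copied from the torchtext source.
SUPPORTED_PAIR = ("de", "en")


def is_supported_pair(src_lang, tgt_lang):
    """Return True if the pair passes a sorted-tuple check like the one in the traceback."""
    return tuple(sorted((src_lang, tgt_lang))) == SUPPORTED_PAIR


def check_language_pair(src_lang, tgt_lang):
    """Hypothetical guard that could run before the dataset is constructed."""
    if not is_supported_pair(src_lang, tgt_lang):
        raise ValueError(
            f"Unsupported language pair ({src_lang!r}, {tgt_lang!r}): "
            "the error message lists only ('de', 'en') or ('en', 'de')"
        )


if __name__ == "__main__":
    # Show which of the pairs discussed in this issue would pass the check.
    for pair in [("en", "de"), ("de", "en"), ("en", "fr")]:
        print(pair, is_supported_pair(*pair))
```

Because the pair is sorted before comparison, the translation direction does not matter, which matches the two pairs listed in the error message. Any pair that includes fr fails the check regardless of whether the French spaCy model is installed.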