RuntimeError: UNSUPPORTED DTYPE (sockeye-translate --use-cpu --dtype bfloat16) #1084

@SamuelLarkin

Description

Hi,
following #1083 (comment), I could not translate on CPU with bfloat16 using pytorch-1.11.0. With pytorch-1.13.1, the same translation succeeds.

It could be something else, but based on these two simple tests, it looks like pytorch-1.11.0 is not sufficient. If so, requirements.txt should reflect that.
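As a stopgap until the requirement is settled, a small version check can flag the problem before translation starts. The following is only a sketch: the helper names `parse_version` and `cpu_bfloat16_status` are hypothetical and not part of Sockeye, and the only data points are the two from this report (1.11.0 fails, 1.13.1 works). Treating older versions as failing and newer ones as working is an assumption, and anything in between is reported as untested.

```python
import re

# The only two data points come from this report; everything else is a guess.
KNOWN_BAD = (1, 11, 0)   # translation failed with UNSUPPORTED DTYPE
KNOWN_GOOD = (1, 13, 1)  # translation succeeded


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings such as '1.13.1+cu116'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def cpu_bfloat16_status(version: str) -> str:
    """Classify a PyTorch version for CPU bfloat16 translation.

    Assumption: versions at or above the known-good one work, and versions
    at or below the known-bad one fail. Versions in between are untested.
    """
    parsed = parse_version(version)
    if parsed >= KNOWN_GOOD:
        return "expected_ok"
    if parsed <= KNOWN_BAD:
        return "expected_fail"
    return "untested"
```

In an environment with PyTorch installed, this could be called as `cpu_bfloat16_status(torch.__version__)` before passing `--use-cpu --dtype bfloat16`.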

Command

python -m sockeye.translate --output-type json --batch-size 32 --models ../model --input source.en --use-cpu --dtype bfloat16

Error Message

[INFO:sockeye.utils] Sockeye: 3.1.31, commit 13c63be5e6999102cd8f76065dab618667d54c8d, path /gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/__init__.py
[INFO:sockeye.utils] PyTorch: 1.11.0 (/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/torch/__init__.py)
[INFO:sockeye.utils] Command: /gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py --output-type json --batch-size 32 --models ../model --input source.en --use-cpu --dtype bfloat16
[INFO:sockeye.utils] Arguments: Namespace(batch_size=32, beam_search_stop='all', beam_size=5, brevity_penalty_constant_length_ratio=0.0, brevity_penalty_type='none', brevity_penalty_weight=1.0, bucket_width=10, checkpoints=None, chunk_size=None, clamp_to_dtype=False, config=None, device_id=0, dtype='bfloat16', ensemble_mode='linear', env=None, greedy=False, input='source.en', input_factors=None, json_input=False, knn_index=None, knn_lambda=0.8, length_penalty_alpha=1.0, length_penalty_beta=0.0, loglevel='INFO', loglevel_secondary_workers='INFO', max_input_length=None, max_output_length=None, max_output_length_num_stds=2, models=['../model'], nbest_size=1, no_logfile=False, nvs_thresh=0.5, output=None, output_type='json', prevent_unk=False, quiet=False, quiet_secondary_workers=False, restrict_lexicon=None, restrict_lexicon_topk=None, sample=None, seed=None, skip_nvs=False, strip_unknown_words=False, tf32=True, use_cpu=True)
[INFO:__main__] Translate Device: cpu
[INFO:sockeye.model] Loading 1 model(s) from ['../model'] ...
[INFO:sockeye.vocab] Vocabulary (32170 words) loaded from "../model/vocab.src.0.json"
[INFO:sockeye.vocab] Vocabulary (32170 words) loaded from "../model/vocab.trg.0.json"
[INFO:sockeye.model] Model version: 3.1.27
[INFO:sockeye.model] Loaded model config from "../model/config"
[INFO:sockeye.model] Disabling dropout layers for performance reasons
[INFO:sockeye.model] ModelConfig(config_data=DataConfig(data_statistics=DataStatistics(num_sents=18792562, num_discarded=4514, num_tokens_source=396805440, num_tokens_target=452828161, num_unks_source=151, num_unks_target=150, max_observed_len_source=201, max_observed_len_target=201, size_vocab_source=32170, size_vocab_target=32170, length_ratio_mean=1.149165240213179, length_ratio_std=0.3331394866848643, buckets=[(8, 8), (16, 16), (24, 24), (32, 32), (40, 40), (48, 48), (56, 56), (64, 64), (72, 72), (80, 80), (88, 88), (96, 96), (104, 104), (112, 112), (120, 120), (128, 128), (136, 136), (144, 144), (152, 152), (160, 160), (168, 168), (176, 176), (184, 184), (192, 192), (200, 200), (201, 201)], num_sents_per_bucket=[2488902, 5093385, 3839506, 2546445, 1811528, 1189585, 737992, 443043, 261406, 152182, 89417, 52498, 31055, 18857, 11863, 7560, 4930, 3414, 2367, 1796, 1381, 1081, 892, 762, 633, 82], average_len_target_per_bucket=[4.876037945069864, 12.773357231261594, 19.580236753332052, 27.41336983211651, 35.28823864603507, 43.238106587956544, 51.26582055583915, 59.238620628568825, 67.207513428303, 75.18162933867328, 83.15075619854574, 91.11991730707263, 99.07102421459187, 106.99094438055295, 114.94170231821508, 122.88471587159029, 130.96489273583816, 138.7659185760993, 146.50905591051549, 154.6460870395282, 162.2368031213424, 170.14799935605743, 178.83465301964753, 186.0052493281894, 193.85046425851448, 199.38909318975016], length_ratio_stats_per_bucket=[(1.0693756944877173, 0.2734448342497526), (1.0857894201553209, 0.28019690452817625), (1.1544188404375997, 0.3868604549199259), (1.1861185841999833, 0.3074059313735606), (1.2060657841545896, 0.2981587515456939), (1.226324650666722, 0.30493095494070144), (1.2444125341378565, 0.3242223370686047), (1.2611266481183327, 0.3709455795374948), (1.274644337588064, 0.4302137928506163), (1.2860484970016222, 0.48695434193951453), (1.302799569788393, 0.5653045419192184), (1.3120006329314209, 0.6142970487431451), (1.3295351237968134, 0.7814292394252162), (1.3384637458091257, 0.8116763474141028), (1.351862242960138, 0.9642116646873813), (1.3368067683991776, 0.7653903732034699), (1.3670752245352829, 0.9719727938185959), (1.3805636470652694, 1.0975590088160094), (1.3476927572822692, 0.696317634165507), (1.3496332871268524, 0.7573960914043955), (1.3046733705213736, 0.6783789455528596), (1.3753328246704346, 1.6470598091351123), (1.3040674746204497, 1.059827519965373), (1.253641535651391, 0.5013375061317442), (1.2480487830675664, 0.3590853095382778), (1.2543975958596236, 0.31963245113954747)]), max_seq_len_source=201, max_seq_len_target=201, num_source_factors=1, num_target_factors=1), vocab_source_size=32170, vocab_target_size=32170, config_embed_source=EmbeddingConfig(vocab_size=32170, num_embed=1024, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_embed_target=EmbeddingConfig(vocab_size=32170, num_embed=1024, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_encoder=TransformerConfig(model_size=1024, attention_heads=16, feed_forward_num_hidden=4096, act_type='relu', num_layers=6, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=201, max_seq_len_target=201, decoder_type='transformer', use_lhuc=False, depth_key_value=1024, use_glu=False), config_decoder=TransformerConfig(model_size=1024, attention_heads=16, feed_forward_num_hidden=4096, act_type='relu', num_layers=6, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=201, max_seq_len_target=201, decoder_type='transformer', use_lhuc=False, depth_key_value=1024, use_glu=False), config_length_task=None, weight_tying_type='src_trg_softmax', lhuc=False, dtype='float32', neural_vocab_selection=None, neural_vocab_selection_block_loss=False)
[INFO:sockeye.model] Loaded params from "../model/params.best" to "cpu"
[INFO:sockeye.model] Casting SockeyeModel to dtype torch.bfloat16
[INFO:sockeye.model] Model dtype: overridden to bfloat16
[INFO:sockeye.model] 1 model(s) loaded in 7.1540s
[INFO:sockeye.inference] Translator (1 model(s) beam_size=5 algorithm=BeamSearch, beam_search_stop=all max_input_length=200 nbest_size=1 ensemble_mode=None max_batch_size=32 dtype=torch.bfloat16 skip_nvs=False nvs_thresh=0.5)
[INFO:__main__] Translating...
/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/torch/jit/_trace.py:958: TracerWarning: Encountering a list at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for `list`, use a `tuple` instead. for `dict`, use a `NamedTuple` instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.
  module._c._create_method_from_trace(
[ERROR:root] Uncaught exception
Traceback (most recent call last):
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 264, in <module>
    main()
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 42, in main
    run_translate(args)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 146, in run_translate
    read_and_translate(translator=translator,
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 232, in read_and_translate
    chunk_time = translate(output_handler, chunk, translator)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/translate.py", line 255, in translate
    trans_outputs = translator.translate(trans_inputs)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/inference.py", line 943, in translate
    batch_translations = self._translate_np(*self._get_inference_input(translator_inputs))
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/inference.py", line 1184, in _translate_np
    return self._get_best_translations(self._search(source,
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/sockeye/beam_search.py", line 1047, in forward
    lengths, estimated_reference_lengths = self._traced_sort_norm_and_update_finished(*_sort_inputs)
  File "/gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
RuntimeError: UNSUPPORTED DTYPE

Conda Env Export

name: sockeye-3.1.31
channels:
  - pytorch
  - nvidia
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - blas=1.0=mkl
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2023.01.10=h06a4308_0
  - certifi=2022.12.7=py310h06a4308_0
  - cuda=11.6.1=0
  - cuda-cccl=11.6.55=hf6102b2_0
  - cuda-command-line-tools=11.6.2=0
  - cuda-compiler=11.6.2=0
  - cuda-cudart=11.6.55=he381448_0
  - cuda-cudart-dev=11.6.55=h42ad0f4_0
  - cuda-cuobjdump=11.6.124=h2eeebcb_0
  - cuda-cupti=11.6.124=h86345e5_0
  - cuda-cuxxfilt=11.6.124=hecbf4f6_0
  - cuda-driver-dev=11.6.55=0
  - cuda-gdb=12.0.90=0
  - cuda-libraries=11.6.1=0
  - cuda-libraries-dev=11.6.1=0
  - cuda-memcheck=11.8.86=0
  - cuda-nsight=12.0.78=0
  - cuda-nsight-compute=12.0.0=0
  - cuda-nvcc=11.6.124=hbba6d2d_0
  - cuda-nvdisasm=12.0.76=0
  - cuda-nvml-dev=11.6.55=haa9ef22_0
  - cuda-nvprof=12.0.90=0
  - cuda-nvprune=11.6.124=he22ec0a_0
  - cuda-nvrtc=11.6.124=h020bade_0
  - cuda-nvrtc-dev=11.6.124=h249d397_0
  - cuda-nvtx=11.6.124=h0630a44_0
  - cuda-nvvp=12.0.90=0
  - cuda-runtime=11.6.1=0
  - cuda-samples=11.6.101=h8efea70_0
  - cuda-sanitizer-api=12.0.90=0
  - cuda-toolkit=11.6.1=0
  - cuda-tools=11.6.1=0
  - cuda-visual-tools=11.6.1=0
  - flit-core=3.6.0=pyhd3eb1b0_0
  - gds-tools=1.5.0.59=0
  - intel-openmp=2022.1.0=h9e868ea_3769
  - ld_impl_linux-64=2.38=h1181459_1
  - libcublas=11.9.2.110=h5e84587_0
  - libcublas-dev=11.9.2.110=h5c901ab_0
  - libcufft=10.7.1.112=hf425ae0_0
  - libcufft-dev=10.7.1.112=ha5ce4c0_0
  - libcufile=1.5.0.59=0
  - libcufile-dev=1.5.0.59=0
  - libcurand=10.3.1.50=0
  - libcurand-dev=10.3.1.50=0
  - libcusolver=11.3.4.124=h33c3c4e_0
  - libcusparse=11.7.2.124=h7538f96_0
  - libcusparse-dev=11.7.2.124=hbbe9722_0
  - libffi=3.4.2=h6a678d5_6
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libnpp=11.6.3.124=hd2722f0_0
  - libnpp-dev=11.6.3.124=h3c42840_0
  - libnvjpeg=11.6.2.124=hd473ad6_0
  - libnvjpeg-dev=11.6.2.124=hb5906b9_0
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - mkl=2022.1.0=hc2b9512_224
  - ncurses=6.3=h5eee18b_3
  - nsight-compute=2022.4.0.15=0
  - openssl=1.1.1s=h7f8727e_0
  - pip=22.3.1=py310h06a4308_0
  - python=3.10.9=h7a1cb2a_0
  - pytorch=1.13.1=py3.10_cuda11.6_cudnn8.3.2_0
  - pytorch-cuda=11.6=h867d48c_1
  - pytorch-mutex=1.0=cuda
  - readline=8.2=h5eee18b_0
  - setuptools=65.6.3=py310h06a4308_0
  - sqlite=3.40.1=h5082296_0
  - tk=8.6.12=h1ccaba5_0
  - typing_extensions=4.4.0=py310h06a4308_0
  - tzdata=2022g=h04d1e81_0
  - wheel=0.37.1=pyhd3eb1b0_0
  - xz=5.2.10=h5eee18b_1
  - zlib=1.2.13=h5eee18b_0
  - pip:
      - aiohttp==3.8.3
      - aiosignal==1.3.1
      - async-timeout==4.0.2
      - attrs==22.2.0
      - charset-normalizer==2.1.1
      - codetiming==1.4.0
      - colorama==0.4.6
      - datasets==2.8.0
      - dill==0.3.6
      - filelock==3.9.0
      - frozenlist==1.3.3
      - fsspec==2023.1.0
      - huggingface-hub==0.11.1
      - idna==3.4
      - joblib==1.2.0
      - lxml==4.9.2
      - multidict==6.0.4
      - multiprocess==0.70.14
      - numpy==1.24.1
      - packaging==23.0
      - pandas==1.5.3
      - portalocker==2.7.0
      - py-spy==0.3.14
      - pyarrow==10.0.1
      - python-dateutil==2.8.2
      - pytz==2022.7.1
      - pyyaml==6.0
      - regex==2022.10.31
      - requests==2.28.2
      - responses==0.18.0
      - sacrebleu==2.3.1
      - scikit-learn==1.2.0
      - scipy==1.10.0
      - six==1.16.0
      - sockeye==3.1.31
      - tabulate==0.9.0
      - threadpoolctl==3.1.0
      - tokenizers==0.12.1
      - tqdm==4.64.1
      - transformers==4.20.1
      - urllib3==1.26.14
      - xxhash==3.2.0
      - yarl==1.8.2
prefix: /gpfs/projects/DT/mtp/WMT20/opt/miniconda3/envs/sockeye-3.1.31
