Turning off XLA? Getting error messages after fresh install (recently reinstalled Ubuntu OS) #899

Chaztikov · 2022-09-08T15:55:45Z


Hi, I'm not sure why I get the error messages below. I very recently installed Ubuntu 22.04 along with Python 3.10, TensorFlow, PyTorch, and DeepXDE. PyTorch and TensorFlow both seem to work and detect CUDA and my GPU.

The script I am trying to run below worked on my previous installation. Any idea what this output indicates? It seems to point to some issue with whether DeepXDE can find the GPU. Thanks.

chaztikov@priority:~/git/PINNs/main/examples/IBPINN/projects/forward_models/FOSLS/LinearElasticityFOSLS/HE_neohookean_planestress$ python3 singlerun_neohookean_hyperelastic_FOSLS_2D.py
Using backend: tensorflow.compat.v1

WARNING:tensorflow:From /home/chaztikov/.local/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Enable just-in-time compilation with XLA.

WARNING:tensorflow:From /home/chaztikov/.local/lib/python3.10/site-packages/deepxde/nn/initializers.py:118: The name tf.keras.initializers.he_normal is deprecated. Please use tf.compat.v1.keras.initializers.he_normal instead.

Set the default float type to float32
current_file_name
singlerun_neohookean_hyperelastic_FOSLS_2D
/home/chaztikov/git/PINNs/main/examples/IBPINN/projects/forward_models/FOSLS/LinearElasticityFOSLS/HE_neohookean_planestress/saved/
81 28 32
289 60 64
1024 124 124
3969 248 248
15876 500 500
num_points,num_boundary_points 1681 160
/home/chaztikov/git/PINNs/main/examples/IBPINN/projects/forward_models/FOSLS/LinearElasticityFOSLS/HE_neohookean_planestress/saved/
Warning: 10 points required, but 16 points sampled.
mexclusions
[]
Compiling model...
Building feed-forward neural network...
/home/chaztikov/.local/lib/python3.10/site-packages/deepxde/nn/tensorflow_compat_v1/fnn.py:103: UserWarning: tf.layers.dense is deprecated and will be removed in a future version. Please use tf.keras.layers.Dense instead.
return tf.layers.dense(
'build' took 0.281838 s

2022-09-08 10:46:34.216905: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-08 10:46:34.765001: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2022-09-08 10:46:34.765048: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9921 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
'compile' took 2.370059 s

Warning: epochs is deprecated and will be removed in a future version. Use iterations instead.
Initializing variables...
2022-09-08 10:46:36.332096: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
Training model...

2022-09-08 10:46:36.637194: I tensorflow/compiler/xla/service/service.cc:170] XLA service 0x7fa368009950 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-09-08 10:46:36.637232: I tensorflow/compiler/xla/service/service.cc:178] StreamExecutor device (0): NVIDIA GeForce GTX 1080 Ti, Compute Capability 6.1
2022-09-08 10:46:36.662312: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:263] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2022-09-08 10:46:37.251477: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 10:46:37.252078: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 10:46:37.252130: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2022-09-08 10:46:37.253146: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 10:46:37.253219: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
2022-09-08 10:46:37.255705: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 10:46:37.255731: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2022-09-08 10:46:37.256383: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 10:46:37.256444: W tensorflow/compiler/xla/service/gpu/buffer_comparator.cc:640] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda or modifying $PATH can be used to set the location of ptxas
This message will only be logged once.
2022-09-08 10:46:37.387348: W tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
/usr/local/cuda-11.2
/usr/local/cuda
.
You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2022-09-08 10:46:37.388380: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:330] libdevice is required by this HLO module but was not found at ./libdevice.10.bc
2022-09-08 10:46:37.391976: I tensorflow/compiler/jit/xla_compilation_cache.cc:478] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
2022-09-08 10:46:37.392913: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at xla_ops.cc:462 : INTERNAL: libdevice not found at ./libdevice.10.bc
Traceback (most recent call last):
File "/home/chaztikov/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1377, in _do_call
return fn(*args)
File "/home/chaztikov/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1360, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/home/chaztikov/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1453, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) INTERNAL: libdevice not found at ./libdevice.10.bc
[[{{node cluster_0_1/xla_compile}}]]
[[cluster_0_1/merge_oidx_1/_3]]
(1) INTERNAL: libdevice not found at ./libdevice.10.bc
[[{{node cluster_0_1/xla_compile}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/chaztikov/git/PINNs/main/examples/IBPINN/projects/forward_models/FOSLS/LinearElasticityFOSLS/HE_neohookean_planestress/singlerun_neohookean_hyperelastic_FOSLS_2D.py", line 3190, in <module>
losshistory, train_state = model.train(epochs=mepochs + 1, batch_size=mbatch_size,display_every=mdisp, disregard_previous_best=mdisregard_previous_best, callbacks=mcallback_list, model_save_path=save_path)
File "/home/chaztikov/.local/lib/python3.10/site-packages/deepxde/utils/internal.py", line 22, in wrapper
result = f(*args, **kwargs)
File "/home/chaztikov/.local/lib/python3.10/site-packages/deepxde/model.py", line 561, in train
self._test()
File "/home/chaztikov/.local/lib/python3.10/site-packages/deepxde/model.py", line 693, in _test
) = self._outputs_losses(
File "/home/chaztikov/.local/lib/python3.10/site-packages/deepxde/model.py", line 471, in _outputs_losses
return self.sess.run(outputs_losses, feed_dict=feed_dict)
File "/home/chaztikov/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/home/chaztikov/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1190, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/home/chaztikov/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1370, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/home/chaztikov/.local/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1396, in _do_call
raise type(e)(node_def, op, message) # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InternalError: Graph execution error:

2 root error(s) found.
(0) INTERNAL: libdevice not found at ./libdevice.10.bc
[[{{node cluster_0_1/xla_compile}}]]
[[cluster_0_1/merge_oidx_1/_3]]
(1) INTERNAL: libdevice not found at ./libdevice.10.bc
[[{{node cluster_0_1/xla_compile}}]]
0 successful operations.
0 derived errors ignored.
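
For anyone reading along: the warnings above name two things XLA could not locate, the ptxas executable and the file libdevice.10.bc under a CUDA directory's nvvm/libdevice folder. A minimal sketch for checking whether either exists on the machine follows. It uses only the Python standard library. The directory list is copied from the log; CUDA_HOME and CUDA_PATH are conventional variable names assumed here and may not be set on a given system, and the function names are made up for illustration.

```python
import os
import shutil
from pathlib import Path

# Directories that XLA reported searching in the log above.
CANDIDATE_CUDA_DIRS = ["./cuda_sdk_lib", "/usr/local/cuda-11.2", "/usr/local/cuda", "."]


def find_libdevice(cuda_dirs):
    """Return the first existing <dir>/nvvm/libdevice/libdevice.10.bc, or None."""
    for d in cuda_dirs:
        candidate = Path(d) / "nvvm" / "libdevice" / "libdevice.10.bc"
        if candidate.is_file():
            return candidate
    return None


def report(cuda_dirs=CANDIDATE_CUDA_DIRS):
    """Look up ptxas on PATH and libdevice in the candidate directories."""
    # Assumption: CUDA_HOME / CUDA_PATH, if set, point at a CUDA toolkit root.
    extra = [os.environ[v] for v in ("CUDA_HOME", "CUDA_PATH") if os.environ.get(v)]
    return {
        "ptxas": shutil.which("ptxas"),
        "libdevice": find_libdevice(list(cuda_dirs) + extra),
    }


if __name__ == "__main__":
    for name, location in report().items():
        print(f"{name}: {location or 'NOT FOUND'}")
```

If both come back as NOT FOUND, that would be consistent with the log: TensorFlow sees the GPU through the driver, but no CUDA toolkit files are visible in the places XLA looks.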

Chaztikov (Author) · 2022-09-08T16:43:21Z

I tried changing the backend (not sure why the default was "tensorflow.compat.v1").

(base) chaztikov@priority:~$ python3
Python 3.9.12 (main, Apr 5 2022, 06:56:58)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import deepxde as dde
Using backend: tensorflow.compat.v1

WARNING:tensorflow:From /home/chaztikov/anaconda3/lib/python3.9/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Enable just-in-time compilation with XLA.

WARNING:tensorflow:From /home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/nn/initializers.py:118: The name tf.keras.initializers.he_normal is deprecated. Please use tf.compat.v1.keras.initializers.he_normal instead.

>>> dde.backend.s
dde.backend.selu( dde.backend.sin(
dde.backend.set_default_backend( dde.backend.square(
dde.backend.shape( dde.backend.sum(
dde.backend.sigmoid( dde.backend.sys
dde.backend.silu(
>>> dde.backend.set_default_backend("tensorflow")
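
As an aside, the backend can also be chosen per script rather than globally. As far as I know, DeepXDE reads the DDE_BACKEND environment variable when it is imported, so a minimal sketch looks like the following. The helper name select_backend is made up for illustration, and the import is left commented out so the snippet runs even where DeepXDE is not installed.

```python
import os


def select_backend(name="tensorflow"):
    """Set DDE_BACKEND for this process; must run before `import deepxde`."""
    os.environ["DDE_BACKEND"] = name
    return os.environ["DDE_BACKEND"]


select_backend("tensorflow")
# import deepxde as dde  # uncomment in a real script; the import reports the backend in use
```

This only switches the backend. Both TensorFlow backends go through XLA in the logs in this thread, so it is not expected to change the libdevice error by itself.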


Chaztikov (Author) · 2022-09-08T16:45:04Z

Changing the backend to "tensorflow" did not seem to resolve my issue.
I will see if the examples work.

Using backend: tensorflow

Enable just-in-time compilation with XLA.

Set the default float type to float64
current_file_name
singlerun_neohookean_hyperelastic_FOSLS_2D_tf2
/home/chaztikov/git/PINNs/main/examples/IBPINN/projects/forward_models/FOSLS/LinearElasticityFOSLS/HE_neohookean_planestress/saved/
81 28 32
289 60 64
1024 124 124
3969 248 248
15876 500 500
num_points,num_boundary_points 1681 160
/home/chaztikov/git/PINNs/main/examples/IBPINN/projects/forward_models/FOSLS/LinearElasticityFOSLS/HE_neohookean_planestress/saved/
Warning: 10 points required, but 16 points sampled.
2022-09-08 12:44:15.458984: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-08 12:44:16.000539: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2022-09-08 12:44:16.000595: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10018 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
mexclusions
[]
Compiling model...
'compile' took 0.000376 s

Warning: epochs is deprecated and will be removed in a future version. Use iterations instead.
Training model...

2022-09-08 12:44:19.775379: I tensorflow/compiler/xla/service/service.cc:170] XLA service 0x564f3badd8a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-09-08 12:44:19.775415: I tensorflow/compiler/xla/service/service.cc:178] StreamExecutor device (0): NVIDIA GeForce GTX 1080 Ti, Compute Capability 6.1
2022-09-08 12:44:19.838850: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:263] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2022-09-08 12:44:21.142977: W tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
/usr/local/cuda-11.2
/usr/local/cuda
.
You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2022-09-08 12:44:21.144031: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:330] libdevice is required by this HLO module but was not found at ./libdevice.10.bc
2022-09-08 12:44:21.147951: I tensorflow/compiler/jit/xla_compilation_cache.cc:478] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
2022-09-08 12:44:21.150484: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at xla_ops.cc:296 : INTERNAL: libdevice not found at ./libdevice.10.bc
Traceback (most recent call last):
File "/home/chaztikov/git/PINNs/main/examples/IBPINN/projects/forward_models/FOSLS/LinearElasticityFOSLS/HE_neohookean_planestress/singlerun_neohookean_hyperelastic_FOSLS_2D_tf2.py", line 3190, in <module>
losshistory, train_state = model.train(epochs=mepochs + 1, batch_size=mbatch_size,display_every=mdisp, disregard_previous_best=mdisregard_previous_best, callbacks=mcallback_list, model_save_path=save_path)
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/utils/internal.py", line 22, in wrapper
result = f(*args, **kwargs)
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/model.py", line 561, in train
self._test()
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/model.py", line 693, in _test
) = self._outputs_losses(
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/model.py", line 473, in _outputs_losses
outs = outputs_losses(inputs, targets, auxiliary_vars)
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: libdevice not found at ./libdevice.10.bc [Op:__inference_outputs_losses_train_2499]
[Finished in 10.5s]


Chaztikov (Author) · 2022-09-08T16:46:29Z

I tried the example Burgers_RAR.py. It did not run to completion, and I am not sure what is wrong. Here is the output from that example:

Using backend: tensorflow

Enable just-in-time compilation with XLA.

2022-09-08 12:45:38.861163: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-08 12:45:39.412512: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2022-09-08 12:45:39.412581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9985 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
Compiling model...
'compile' took 0.000645 s

Training model...

WARNING:tensorflow:AutoGraph could not transform <function <lambda> at 0x7f3b7e31fb80> and will run it as-is.
Cause: could not parse the source code of <function <lambda> at 0x7f3b7e31fb80>: no matching AST found among candidates:
# coding=utf-8
lambda x, on: np.array([on_boundary(x[i], on[i]) for i in range(len(x))])
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function <lambda> at 0x7f3b7e31fdc0> and will run it as-is.
Cause: could not parse the source code of <function <lambda> at 0x7f3b7e31fdc0>: no matching AST found among candidates:
# coding=utf-8
lambda x, on: np.array([on_boundary(x[i], on[i]) for i in range(len(x))])
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
2022-09-08 12:45:41.046812: I tensorflow/compiler/xla/service/service.cc:170] XLA service 0x55bde895b9c0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-09-08 12:45:41.046843: I tensorflow/compiler/xla/service/service.cc:178] StreamExecutor device (0): NVIDIA GeForce GTX 1080 Ti, Compute Capability 6.1
2022-09-08 12:45:41.059139: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:263] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2022-09-08 12:45:41.624323: W tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
/usr/local/cuda-11.2
/usr/local/cuda
.
You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2022-09-08 12:45:41.916009: I tensorflow/compiler/jit/xla_compilation_cache.cc:478] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
Step Train loss Test loss Test metric
0 [2.20e-02, 2.59e-02, 4.29e-01] [2.20e-02, 2.59e-02, 4.29e-01] []
2022-09-08 12:45:42.934218: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:330] libdevice is required by this HLO module but was not found at ./libdevice.10.bc
2022-09-08 12:45:42.937782: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at xla_ops.cc:296 : INTERNAL: libdevice not found at ./libdevice.10.bc
Traceback (most recent call last):
File "/home/chaztikov/git/deepxde/examples/pinn_forward/Burgers_RAR.py", line 38, in <module>
model.train(iterations=10000)
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/utils/internal.py", line 22, in wrapper
result = f(*args, **kwargs)
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/model.py", line 573, in train
self._train_sgd(iterations, display_every)
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/model.py", line 590, in _train_sgd
self._train_step(
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/deepxde/model.py", line 491, in _train_step
self.train_step(inputs, targets, auxiliary_vars)
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/chaztikov/anaconda3/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: libdevice not found at ./libdevice.10.bc [Op:__inference_train_step_1221]
[Finished in 8.4s]


Chaztikov (Author) · 2022-09-08T17:30:05Z

If I deactivate conda and remove it, along with its associated lines in .bashrc, I get a different error for the Burgers_RAR example:

chaztikov@priority:~/git/deepxde/examples/pinn_forward$ python3 Burgers_RAR.py
Using backend: tensorflow

Enable just-in-time compilation with XLA.

2022-09-08 13:29:07.863662: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-08 13:29:08.565378: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2022-09-08 13:29:08.565437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9800 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
Compiling model...
'compile' took 0.000428 s

Training model...

WARNING:tensorflow:AutoGraph could not transform <function <lambda> at 0x7f63ec563250> and will run it as-is.
Cause: could not parse the source code of <function <lambda> at 0x7f63ec563250>: no matching AST found among candidates:
# coding=utf-8
lambda x, on: np.array([on_boundary(x[i], on[i]) for i in range(len(x))])
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function <lambda> at 0x7f63ec563490> and will run it as-is.
Cause: could not parse the source code of <function <lambda> at 0x7f63ec563490>: no matching AST found among candidates:
# coding=utf-8
lambda x, on: np.array([on_boundary(x[i], on[i]) for i in range(len(x))])
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
2022-09-08 13:29:10.141606: I tensorflow/compiler/xla/service/service.cc:170] XLA service 0x56496ae81400 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-09-08 13:29:10.141650: I tensorflow/compiler/xla/service/service.cc:178] StreamExecutor device (0): NVIDIA GeForce GTX 1080 Ti, Compute Capability 6.1
2022-09-08 13:29:10.169979: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:263] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2022-09-08 13:29:10.802734: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 13:29:10.803682: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 13:29:10.803711: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2022-09-08 13:29:10.804477: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 13:29:10.804548: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
2022-09-08 13:29:10.808809: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 13:29:10.808838: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2022-09-08 13:29:10.809669: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 13:29:10.809744: W tensorflow/compiler/xla/service/gpu/buffer_comparator.cc:640] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda or modifying $PATH can be used to set the location of ptxas
This message will only be logged once.
2022-09-08 13:29:10.905538: W tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
/usr/local/cuda-11.2
/usr/local/cuda
.
You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2022-09-08 13:29:11.088883: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 13:29:11.088926: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2022-09-08 13:29:11.089862: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-08 13:29:11.090310: F tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:456] ptxas returned an error during compilation of ptx to sass: 'INTERNAL: Failed to launch ptxas' If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
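
The log itself suggests two remedies: setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda, and modifying $PATH so that ptxas can be found. A minimal sketch of doing both from inside a script, before TensorFlow or DeepXDE is imported, is below. The helper names are made up for illustration, and /usr/local/cuda is only a placeholder for whichever directory actually contains nvvm/libdevice and bin/ptxas on the machine. Whether this resolves the failure depends on a CUDA toolkit being installed there at all.

```python
import os


def set_cuda_data_dir(cuda_dir, env=None):
    """Point XLA at cuda_dir via XLA_FLAGS, replacing any earlier value of the flag."""
    env = os.environ if env is None else env
    prefix = "--xla_gpu_cuda_data_dir="
    kept = [f for f in env.get("XLA_FLAGS", "").split() if not f.startswith(prefix)]
    env["XLA_FLAGS"] = " ".join(kept + [prefix + cuda_dir])
    return env["XLA_FLAGS"]


def prepend_cuda_bin(cuda_dir, env=None):
    """Put <cuda_dir>/bin first on PATH so that ptxas can be found."""
    env = os.environ if env is None else env
    bin_dir = os.path.join(cuda_dir, "bin")
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p and p != bin_dir]
    env["PATH"] = os.pathsep.join([bin_dir] + parts)
    return env["PATH"]


if __name__ == "__main__":
    CUDA_DIR = "/usr/local/cuda"  # placeholder, not a known-good path for this machine
    print(set_cuda_data_dir(CUDA_DIR))
    prepend_cuda_bin(CUDA_DIR)
    # import deepxde as dde  # import TensorFlow/DeepXDE only after the environment is set
```

Exporting the same two variables in the shell before running python3 has the same effect and avoids any dependence on import order.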


Chaztikov (Author) · 2022-09-09T14:19:57Z

Still not sure what's wrong here; any advice is appreciated. It could be something very obvious. I didn't have this trouble on my last install, with Ubuntu 18.04 and Python 3.6.


Chaztikov (Author) · 2022-09-11T03:12:04Z

Is there a more appropriate place to ask this question? Perhaps the tensorflow repo?

1 reply

Chaztikov Sep 11, 2022
Author

Also seeing
W tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice.

lululxvi (Maintainer) · 2022-09-15T18:32:16Z

It is most likely due to an installation/configuration issue with TensorFlow. You can try disabling XLA: https://deepxde.readthedocs.io/en/latest/modules/deepxde.html#deepxde.config.disable_xla_jit
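
A minimal sketch of what that suggestion could look like in a script, assuming the installed DeepXDE version provides deepxde.config.disable_xla_jit as documented at the link above. The wrapper name disable_xla_if_available is made up for illustration; it only exists so the snippet degrades gracefully where DeepXDE is not installed.

```python
import importlib


def disable_xla_if_available(module_name="deepxde"):
    """Import DeepXDE and turn off XLA JIT; return False if it is not installed."""
    try:
        dde = importlib.import_module(module_name)
    except ImportError:
        return False
    # Documented DeepXDE call; best made before any model is built or compiled.
    dde.config.disable_xla_jit()
    return True


if __name__ == "__main__":
    print("XLA JIT disabled:", disable_xla_if_available())
```

In an existing script the wrapper is unnecessary: calling dde.config.disable_xla_jit() right after import deepxde as dde, and before the model is created, should be enough. Training may run slower without XLA, but it no longer depends on ptxas or libdevice being found.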
