This repository was archived by the owner on Feb 25, 2022. It is now read-only.
Cannot Connect To Local TPU-VM #323
Open
Labels
bug: Something isn't working.
Description
Describe the bug
When I try to connect to the TPU to finetune, it gives me this error:
Traceback (most recent call last):
  File "main.py", line 257, in <module>
    main(args)
  File "main.py", line 251, in main
    estimator.train(input_fn=partial(input_fn, global_step=current_step, eval=False), max_steps=params["train_steps"])
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3110, in train
    rendezvous.raise_errors()
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 150, in raise_errors
    six.reraise(typ, value, traceback)
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3100, in train
    return super(TPUEstimator, self).train(
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 346, in train
    hooks.extend(self._convert_train_steps_to_hooks(steps, max_steps))
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2973, in _convert_train_steps_to_hooks
    if ctx.is_running_on_cpu():
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_context.py", line 531, in is_running_on_cpu
    self._validate_tpu_configuration()
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_context.py", line 699, in _validate_tpu_configuration
    num_cores = self.num_cores
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_context.py", line 429, in num_cores
    metadata = self._get_tpu_system_metadata()
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_context.py", line 333, in _get_tpu_system_metadata
    tpu_system_metadata_lib._query_tpu_system_metadata(
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow/python/tpu/tpu_system_metadata.py", line 135, in _query_tpu_system_metadata
    raise RuntimeError(
RuntimeError: Cannot find any TPU cores in the system (master address ). This usually means the master address is incorrect or the TPU worker has some problems. Available devices: [_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456, -3188567715276368833)]
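For anyone triaging this, the final RuntimeError line carries the two useful facts: the master address that was used and the devices TensorFlow could see. The sketch below pulls both out of the message text using only the standard library. The helper name `summarize_tpu_error` is hypothetical and is not part of TensorFlow or this repository; it assumes the message keeps the wording shown in the traceback.

```python
import re

# Error text copied from the last line of the traceback in this report.
ERROR = (
    "Cannot find any TPU cores in the system (master address ). "
    "This usually means the master address is incorrect or the TPU worker "
    "has some problems. Available devices: "
    "[_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, "
    "CPU, 268435456, -3188567715276368833)]"
)


def summarize_tpu_error(message):
    """Hypothetical helper: return (master_address, device_types) from the message."""
    # The master address sits between "(master address " and the closing ")".
    match = re.search(r"\(master address (.*?)\)", message)
    master_address = match.group(1).strip() if match else None
    # Device names look like /job:.../device:CPU:0; keep only the type part.
    device_types = re.findall(r"/device:([A-Z_]+):\d+", message)
    return master_address, device_types


master_address, device_types = summarize_tpu_error(ERROR)
print("master address:", repr(master_address))
print("device types:", device_types)
print("TPU visible:", "TPU" in device_types)
```

For this report the master address comes back empty and the only device type is CPU, which matches the message's own hint that either the address is wrong or the TPU worker is not reachable from this process.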
To Reproduce
Steps to reproduce the behavior:
I followed the instructions for fine-tuning on this GitHub page.
Expected behavior
The fine-tuning program should fine-tune on my dataset without errors.
Proposed solution
N/A
Environment (please complete the following information):
- TPU Version: v2-alpha
- TPU Type: v3-8
- Architecture: TPU-VM