This repository was archived by the owner on Feb 25, 2022. It is now read-only.

Cannot Connect To Local TPU-VM #323

@nikhilanayak


Describe the bug
When I try to connect to the TPU to finetune, it gives me this error:

Traceback (most recent call last):
  File "main.py", line 257, in <module>
    main(args)
  File "main.py", line 251, in main
    estimator.train(input_fn=partial(input_fn, global_step=current_step, eval=False), max_steps=params["train_steps"])
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3110, in train
    rendezvous.raise_errors()
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/error_handling.py", line 150, in raise_errors
    six.reraise(typ, value, traceback)
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 3100, in train
    return super(TPUEstimator, self).train(
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 346, in train
    hooks.extend(self._convert_train_steps_to_hooks(steps, max_steps))
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py", line 2973, in _convert_train_steps_to_hooks
    if ctx.is_running_on_cpu():
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_context.py", line 531, in is_running_on_cpu
    self._validate_tpu_configuration()
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_context.py", line 699, in _validate_tpu_configuration
    num_cores = self.num_cores
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_context.py", line 429, in num_cores
    metadata = self._get_tpu_system_metadata()
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow_estimator/python/estimator/tpu/tpu_context.py", line 333, in _get_tpu_system_metadata
    tpu_system_metadata_lib._query_tpu_system_metadata(
  File "/home/nikhilnayak/.local/lib/python3.8/site-packages/tensorflow/python/tpu/tpu_system_metadata.py", line 135, in _query_tpu_system_metadata
    raise RuntimeError(
RuntimeError: Cannot find any TPU cores in the system (master address ). This usually means the master address is incorrect or the TPU worker has some problems. Available devices: [_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 268435456, -3188567715276368833)]
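The two details in this message that matter for diagnosis are the empty master address and a device list containing only a CPU. A minimal sketch for pulling those out of the message text follows; `summarize_tpu_error` is a hypothetical helper written for illustration, not part of TensorFlow or this repository, and the regular expressions assume the message format shown above.

```python
import re


def summarize_tpu_error(message):
    """Extract the master address and device types from the RuntimeError text.

    Hypothetical helper: it assumes the message layout seen in this traceback.
    """
    master = re.search(r"\(master address (.*?)\)", message)
    # Each device appears as _DeviceAttributes(<name>, <type>, <memory>, <incarnation>)
    devices = re.findall(r"_DeviceAttributes\(([^,]+), (\w+),", message)
    device_types = [dev_type for _, dev_type in devices]
    return {
        "master_address": master.group(1).strip() if master else None,
        "device_types": device_types,
        "has_tpu": "TPU" in device_types,
    }


# The message copied from the traceback above.
message = (
    "Cannot find any TPU cores in the system (master address ). "
    "This usually means the master address is incorrect or the TPU worker "
    "has some problems. Available devices: "
    "[_DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, "
    "CPU, 268435456, -3188567715276368833)]"
)
summary = summarize_tpu_error(message)
print(summary)
```

An empty `master_address` together with `has_tpu` being false indicates that no TPU address reached the estimator and that TensorFlow only saw the local CPU.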

To Reproduce
Steps to reproduce the behavior:
I followed the finetuning instructions on this GitHub page.

Expected behavior
The finetuning program should finetune on my dataset without errors.

Proposed solution
N/A

Environment (please complete the following information):

  • TPU Version: v2-alpha
  • TPU Type: v3-8
  • Architecture: TPU-VM
