Notes for variable_scope, name_scope, and weight sharing #43

@taehoonlee

TensorNets provides seamless integration with regular TensorFlow APIs. You can define any model under tf.variable_scope and tf.name_scope to integrate it into your existing scripts. This document shows basic examples for tf.variable_scope, tf.name_scope, and weight sharing. First, import the two libraries:

import tensorflow as tf
import tensornets as nets

Let's get started with the basic TensorFlow APIs. You can manage the prefixes of variable and tensor names with tf.variable_scope and tf.name_scope. The difference is that a variable created with tf.get_variable is named by only tf.variable_scope, while a tf.Tensor is named by both tf.variable_scope and tf.name_scope (a tf.variable_scope also opens a corresponding tf.name_scope). Also, a second tf.get_variable('w', [1]) will raise a ValueError because it tries to create an already existing variable if tf.variable_scope(reuse=None) (the default), or return a handle to the existing variable otherwise (reuse=True or reuse=tf.AUTO_REUSE). Here is an example:

with tf.name_scope('foo'):
  with tf.variable_scope('goo'):
    with tf.name_scope('hoo'):

      # A variable from `tf.get_variable` is affected by only `tf.variable_scope`.
      w = tf.get_variable('w', [1])
      assert w.name == 'goo/w:0'

      # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
      s = tf.constant(-1.0)
      y = s * w
      assert s.name == 'foo/goo/hoo/Const:0'
      assert y.name == 'foo/goo/hoo/mul:0'

      # `tf.get_variable` will try to create the same variable again
      # if `tf.variable_scope(reuse=None)` (default).
      try:
        w2 = tf.get_variable('w', [1])
      except ValueError as e:
        print(e)  # Variable goo/w already exists, disallowed.
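
Conversely, reopening the same variable scope with reuse=True returns the existing variable instead of raising an error. A minimal sketch, continuing from the snippet above:

with tf.variable_scope('goo', reuse=True):
  # The existing `goo/w` is returned instead of a new variable being created.
  w3 = tf.get_variable('w', [1])

assert w3 == w
assert w3.name == 'goo/w:0'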

The principle extends directly to TensorNets. The weights returned by get_weights are tf.Variable, and the outputs from get_outputs and get_middles are tf.Tensor. Thus, the weights will be affected by only tf.variable_scope, while the outputs and the middles by both tf.variable_scope and tf.name_scope. Note that a second model function call cannot be performed without reuse=True or reuse=tf.AUTO_REUSE because it will try to create the same variables again.

with tf.name_scope('xoo'):
  with tf.variable_scope('yoo'):
    with tf.name_scope('zoo'):

      # The weights returned by `get_weights` are `tf.Variable`,
      # and the outputs from `get_outputs` and `get_middles` are `tf.Tensor`
      x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
      model1 = nets.ResNet50(x1)

      # The weights (`tf.Variable`) will be affected by only `tf.variable_scope`.
      assert model1.get_weights()[-1].name == 'yoo/resnet50/logits/biases:0'

      # `tf.Tensor` will be affected by both `tf.variable_scope` and `tf.name_scope`.
      assert model1.name == 'xoo/yoo/zoo/resnet50/probs:0'
      assert model1.get_outputs()[-1].name == 'xoo/yoo/zoo/resnet50/probs:0'
      assert model1.get_middles()[-1].name == 'xoo/yoo/zoo/resnet50/conv5/block3/out:0'

      # `tf.get_variable` will try to create the same variable again
      # if `tf.variable_scope(reuse=None)` (default).
      try:
        x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
        model2 = nets.ResNet50(x2)
      except ValueError as e:
        print(e)  # Variable yoo/resnet50/conv1/conv/weights already exists, disallowed.
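
The failure above can be avoided by reopening the variable scope and passing reuse=tf.AUTO_REUSE to the model function, the same argument shown in the Summary below. A minimal sketch, continuing from the snippet above:

with tf.variable_scope('yoo'):
  # `reuse=tf.AUTO_REUSE` returns handles to the existing
  # `yoo/resnet50` variables instead of raising a `ValueError`.
  model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

for (a, b) in zip(model1.get_weights(), model2.get_weights()):
  assert a == b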

We can implement weight sharing simply by using tf.variable_scope(reuse=tf.AUTO_REUSE). An example is as follows:

with tf.variable_scope('boo', reuse=tf.AUTO_REUSE):
  # Both calls return the same variable.
  w1 = tf.get_variable('w', [1])
  w2 = tf.get_variable('w', [1])
  assert w1 == w2
  assert w1.name == 'boo/w:0'
  # The derived tensors are still distinct operations.
  s = tf.constant(-1.0)
  y1 = s * w1
  y2 = s * w2
  assert y1 != y2
  assert y1.name == 'boo/mul:0'
  assert y2.name == 'boo/mul_1:0'
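
Because w1 and w2 are the same variable, an update made through one handle is visible through the other. A minimal sketch, continuing from the snippet above:

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  # An assignment through `w1` is visible through `w2`
  # because both handles point to the same variable.
  sess.run(w1.assign([2.0]))
  assert sess.run(w2) == 2.0
  assert sess.run(y1) == -2.0
  assert sess.run(y2) == -2.0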

TensorNets can also be easily integrated with tf.variable_scope:

with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
  x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
  x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
  model1 = nets.ResNet50(x1)
  model2 = nets.ResNet50(x2)
  # All the weights are shared between the two models.
  for (a, b) in zip(model1.get_weights(), model2.get_weights()):
    assert a == b
  assert model1.get_weights()[-1].name == 'koo/resnet50/logits/biases:0'
  # The output tensors are still distinct operations.
  assert model1 != model2
  assert model1.name == 'koo/resnet50/probs:0'
  assert model2.name == 'koo/resnet50_1/probs:0'
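
Weight sharing can also be verified at run time: because the two models read the very same variables, they produce identical outputs for the same input. A minimal sketch, continuing from the snippet above (NumPy is assumed to be imported as np):

import numpy as np

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  img = np.random.rand(1, 224, 224, 3).astype(np.float32)
  # The same input fed to both models yields the same probabilities
  # because all the weights are shared.
  assert np.allclose(sess.run(model1, {x1: img}),
                     sess.run(model2, {x2: img}))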

Summary

To summarize, there are two basic patterns to implement weight sharing (the third form below is equivalent to the second):

1. With tf.variable_scope:

with tf.variable_scope('koo', reuse=tf.AUTO_REUSE):
  model1 = nets.ResNet50(x1)
  model2 = nets.ResNet50(x2)

2. Without tf.variable_scope:

model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

3. Equivalent to 2, with functools.partial:

import functools
resnet = functools.partial(nets.ResNet50, reuse=tf.AUTO_REUSE)
model1 = resnet(x1)
model2 = resnet(x2)

I recommend the following pattern, which is the one used for deploying multiple clones in tf.slim:

with tf.name_scope('clone0'):
  model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
with tf.name_scope('clone1'):
  model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)

# All the weights are shared across the clones.
for (a, b) in zip(model1.get_weights(), model2.get_weights()):
  assert a == b

assert model1.name == 'clone0/resnet50/probs:0'
assert model2.name == 'clone1/resnet50/probs:0'
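
Note that the weights carry no clone prefix in this pattern, because tf.get_variable ignores tf.name_scope; only the tensors are grouped per clone. A minimal check, assuming the snippet above runs in a fresh graph:

# The shared weights live under a single unprefixed variable scope.
assert model1.get_weights()[-1].name == 'resnet50/logits/biases:0'
assert model2.get_weights()[-1].name == 'resnet50/logits/biases:0'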

Without tf.name_scope, tensors are automatically named with a postfix style (resnet50, resnet50_1, ...), which can make tensor names difficult to manage.
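
For comparison, here is a minimal sketch of the postfix-style naming in a fresh graph without tf.name_scope:

x1 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x1')
x2 = tf.placeholder(tf.float32, [None, 224, 224, 3], name='x2')
model1 = nets.ResNet50(x1, reuse=tf.AUTO_REUSE)
model2 = nets.ResNet50(x2, reuse=tf.AUTO_REUSE)
# The second model's tensors are automatically renamed with a `_1` postfix.
assert model1.name == 'resnet50/probs:0'
assert model2.name == 'resnet50_1/probs:0'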
