Error caused by SigmoidFocalCrossEntropy with kernel regularizer

**System information**
- OS: Linux Ubuntu 16.04
- TensorFlow: tensorflow-gpu 2.2.0 installed via Anaconda (`conda install`), binary
- (Anaconda repository currently does not support a newer TensorFlow)
- TensorFlow-Addons: tensorflow-addons 0.11.2 via pip (`pip install tensorflow-addons==0.11.2`)
- pip was installed in this conda environment; newer `tfa` requires newer `tf`
- Python version: 3.8
- Is GPU used? (yes/no): yes
- using `tf.keras`, not standalone `keras`

**Describe the bug**
I have an L2 kernel regularizer set on some of the Keras layers, and `tfa.losses.SigmoidFocalCrossEntropy()` is used as the loss function.
After the model was built and compiled, calling `model.fit` raised the following exception:

> ValueError: Shapes must be equal rank, but are 1 and 0
>    	From merging shape 0 with other shapes. for '{{node AddN}} = AddN[N=2, T=DT_FLOAT](sigmoid_focal_crossentropy/weighted_loss/Mul, d1_7/kernel/Regularizer/add)' with input shapes: [?], [].

The full stack trace is long, so it is appended at the end of this report.

**Code to reproduce the issue**

Run the following code:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense
import tensorflow_addons as tfa

model = keras.Sequential([
    Dense(5, activation='relu', kernel_regularizer='l2', name='d1', input_shape=(12,)),
    Dense(5, activation='softmax', name='dout')
])
model.compile(optimizer='adam', loss=tfa.losses.SigmoidFocalCrossEntropy(), metrics=['accuracy'])
model.summary()
# Random data of the right shape is enough to reproduce the error quickly.
model.fit(np.random.randn(64, 12), tf.one_hot(np.random.randint(0,5,64),5))
```

This raises the exception quoted above.
After removing `kernel_regularizer='l2', ` the exception is gone and the training progress bar appears as expected.
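
The shapes in the error message (`[?]` versus `[]`) suggest that `AddN` received a per-sample loss vector from the focal loss and a scalar regularization term. The NumPy sketch below only illustrates that reading; `strict_add_n` is a hypothetical stand-in for the shape check, not TensorFlow's implementation, and the values are made up.

```python
import numpy as np

def strict_add_n(tensors):
    """Hypothetical stand-in for AddN: all inputs must share one shape."""
    arrays = [np.asarray(t, dtype=np.float32) for t in tensors]
    shapes = {a.shape for a in arrays}
    if len(shapes) != 1:
        raise ValueError(f"Shapes must be equal, got {sorted(shapes)}")
    return sum(arrays[1:], arrays[0])

batch_size = 32
per_sample_loss = np.random.rand(batch_size)  # shape (32,), like an unreduced loss
l2_penalty = np.float32(0.01)                 # shape (), like the regularizer term

# Mixing a rank-1 loss with a rank-0 penalty fails the shape check.
try:
    strict_add_n([per_sample_loss, l2_penalty])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Reducing the per-sample loss to a scalar first makes the shapes agree.
total_loss = strict_add_n([per_sample_loss.mean(), l2_penalty])
print(mismatch_raised, np.ndim(total_loss))
```

If this reading is right, constructing the loss with a reducing `reduction` argument, for example `tfa.losses.SigmoidFocalCrossEntropy(reduction=tf.keras.losses.Reduction.AUTO)`, might avoid the error. That is an untested assumption, not a confirmed fix.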


**Other info / logs**

Full stack trace: (You may want to skip it)

```
ValueError                                Traceback (most recent call last)
<ipython-input-3-cd3ce786e484> in <module>
     11 model.compile(optimizer='adam', loss=tfa.losses.SigmoidFocalCrossEntropy(), metrics=['accuracy'])
     12 model.summary()
---> 13 model.fit(np.random.randn(64, 12), np.random.randint(0,5,64))

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
     64   def _method_wrapper(self, *args, **kwargs):
     65     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
---> 66       return method(self, *args, **kwargs)
     67 
     68     # Running inside `run_distribute_coordinator` already.

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
    846                 batch_size=batch_size):
    847               callbacks.on_train_batch_begin(step)
--> 848               tmp_logs = train_function(iterator)
    849               # Catch OutOfRangeError for Datasets of unknown size.
    850               # This blocks until the batch has finished executing.

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
    578         xla_context.Exit()
    579     else:
--> 580       result = self._call(*args, **kwds)
    581 
    582     if tracing_count == self._get_tracing_count():

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
    625       # This is the first call of __call__, so we have to initialize.
    626       initializers = []
--> 627       self._initialize(args, kwds, add_initializers_to=initializers)
    628     finally:
    629       # At this point we know that the initialization is complete (or less

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py in _initialize(self, args, kwds, add_initializers_to)
    503     self._graph_deleter = FunctionDeleter(self._lifted_initializer_graph)
    504     self._concrete_stateful_fn = (
--> 505         self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
    506             *args, **kwds))
    507 

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
   2444       args, kwargs = None, None
   2445     with self._lock:
-> 2446       graph_function, _, _ = self._maybe_define_function(args, kwargs)
   2447     return graph_function
   2448 

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/function.py in _maybe_define_function(self, args, kwargs)
   2775 
   2776       self._function_cache.missed.add(call_context_key)
-> 2777       graph_function = self._create_graph_function(args, kwargs)
   2778       self._function_cache.primary[cache_key] = graph_function
   2779       return graph_function, args, kwargs

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
   2655     arg_names = base_arg_names + missing_arg_names
   2656     graph_function = ConcreteFunction(
-> 2657         func_graph_module.func_graph_from_py_func(
   2658             self._name,
   2659             self._python_function,

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
    979         _, original_func = tf_decorator.unwrap(python_func)
    980 
--> 981       func_outputs = python_func(*func_args, **func_kwargs)
    982 
    983       # invariant: `func_outputs` contains only Tensors, CompositeTensors,

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py in wrapped_fn(*args, **kwds)
    439         # __wrapped__ allows AutoGraph to swap in a converted function. We give
    440         # the function a weak reference to itself to avoid a reference cycle.
--> 441         return weak_wrapped_fn().__wrapped__(*args, **kwds)
    442     weak_wrapped_fn = weakref.ref(wrapped_fn)
    443 

~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    966           except Exception as e:  # pylint:disable=broad-except
    967             if hasattr(e, "ag_error_metadata"):
--> 968               raise e.ag_error_metadata.to_exception(e)
    969             else:
    970               raise

ValueError: in user code:

    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:571 train_function  *
        outputs = self.distribute_strategy.run(
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:951 run  **
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
        return fn(*args, **kwargs)
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:532 train_step  **
        loss = self.compiled_loss(
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/compile_utils.py:238 __call__
        total_loss_metric_value = math_ops.add_n(loss_metric_values)
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:180 wrapper
        return target(*args, **kwargs)
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py:3239 add_n
        return gen_math_ops.add_n(inputs, name=name)
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py:419 add_n
        _, _, _op, _outputs = _op_def_library._apply_op_helper(
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py:742 _apply_op_helper
        op = g._create_op_internal(op_type_name, inputs, dtypes=None,
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py:593 _create_op_internal
        return super(FuncGraph, self)._create_op_internal(  # pylint: disable=protected-access
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:3319 _create_op_internal
        ret = Operation(
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:1816 __init__
        self._c_op = _create_c_op(self._graph, node_def, inputs,
    /home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:1657 _create_c_op
        raise ValueError(str(e))

    ValueError: Shapes must be equal rank, but are 1 and 0
    	From merging shape 0 with other shapes. for '{{node AddN}} = AddN[N=2, T=DT_FLOAT](sigmoid_focal_crossentropy/weighted_loss/Mul, d1/kernel/Regularizer/add)' with input shapes: [32], [].
```

A full stack trace with the model compiled using `run_eagerly=True` can be provided on request.

Thanks~
