how to pmap sync batch_stats? maybe a bug? #3201

zhenlan0426 · 2023-07-14T18:56:48Z

@cgarciae kindly pointed me to this example of pmap with BatchNorm, in which batch_stats are synced across devices with pmean:

  # sync batch stats
  batch_stats = jax.lax.pmean(variables['batch_stats'], "device")
  variables = variables.copy({'batch_stats': batch_stats})

And this is how batch_stats is updated inside the BatchNorm module:

ra_mean.value = self.momentum * ra_mean.value + (1 - self.momentum) * mean
ra_var.value = self.momentum * ra_var.value + (1 - self.momentum) * var

The pmean of the per-device variances is not the same as the true variance computed over the entire data (device * batch), i.e. the value we would get if the data were not split across devices. Because of this, the variance stats in BatchNorm seem off.
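To make the size of the discrepancy concrete, here is a minimal NumPy sketch, not Flax code. It assumes equal-sized shards and population variance (ddof=0); the 8-by-16 split and all variable names are hypothetical. Under those assumptions the law of total variance says the full-batch variance equals the mean of the per-device variances plus the variance of the per-device means, so the averaged value is too small by exactly that last term.

```python
import numpy as np

rng = np.random.default_rng(0)
n_devices, per_device = 8, 16  # hypothetical split, chosen only for illustration
x = rng.standard_normal((n_devices, per_device))

total_var = x.var()                  # variance over all 128 samples
mean_of_vars = x.var(axis=1).mean()  # per-device variance, then averaged across devices
var_of_means = x.mean(axis=1).var()  # spread of the per-device means

# The gap between the two estimates is the variance of the per-device means.
gap = total_var - mean_of_vars
print(total_var, mean_of_vars, gap, var_of_means)
```

With randomly shuffled data the gap is typically small and shrinks as the per-device batch grows, but it is not zero.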

zhenlan0426 (Author) · 2023-07-14T19:03:56Z

This simple example shows this empirically.

import numpy as np

x = np.random.randn(128)
# Left: variance over the entire data.
# Right: per-device variance (8 devices, 16 samples each), then averaged across devices.
print(np.var(x), x.reshape(-1, 8).var(0).mean())
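One possible way to recover the full-batch variance is to average the first and second moments across devices, rather than the variances, and then form the variance from the averaged moments. The sketch below is plain NumPy under assumed conditions (equal-sized shards, hypothetical variable names). A mean over the device axis stands in for jax.lax.pmean. It is not the Flax implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))  # hypothetical: 8 devices, 16 samples each

# Moments each device can compute from its own shard.
local_mean = x.mean(axis=1)
local_mean_sq = (x ** 2).mean(axis=1)

# Averaging over the device axis stands in for jax.lax.pmean here.
global_mean = local_mean.mean()
global_mean_sq = local_mean_sq.mean()

# Var[x] = E[x^2] - E[x]^2, now using moments of the whole batch.
synced_var = global_mean_sq - global_mean ** 2
print(synced_var, x.var())
```

As far as I know, flax.linen.BatchNorm takes an axis_name argument intended for combining batch statistics across devices during the forward pass, which may be worth trying before syncing batch_stats after the fact.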
