This repository was archived by the owner on Dec 26, 2018. It is now read-only.

GPU OOM bug when running on more than one Spark worker node #12


Description

@younfor

For example: I changed the files to load my own images with shape [None, 32, 32, 3]. Everything is OK with one partition, but the problem appears when I set partition = 2, 4, 8, and so on. My machine: GTX 1070, Ubuntu 14.04, 8 GB. I also changed the model init code to:
```python
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.allocator_type = 'BFC'
# config.gpu_options.per_process_gpu_memory_fraction = 0.2
session = tf.Session(config=config)
```
The above allows several processes to run on one GPU.
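(As a general TF 1.x note, not anything specific to this repo: `allow_growth` makes each process start with a small allocation and grow on demand, while the commented-out `per_process_gpu_memory_fraction` line would instead pin each process to a fixed slice of GPU memory.)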
The bug: after the program runs for some epochs, nvidia-smi shows the GPU memory growing without stopping, from 800 MB to 2 GB, 4 GB, 8 GB... until it finally fails with a CUDA OOM error.
My way to solve it: after checking and trying fixes, I found the function that leads to the GPU memory leak:
```python
def reset_gradients(self):
    # with self.session.as_default():
    #     self.gradients = [tf.zeros(g[1].get_shape()).eval() for g in self.compute_gradients]
    self.gradients = [0.0] * len(self.compute_gradients)  # my modification
    self.num_gradients = 0
```
Though I don't know the details of why this change works, it did.
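My guess at why (an assumption on my part, not something I verified in this codebase): the original line builds a new `tf.zeros` op in the default graph on every call, so the graph, and the memory behind it, grows with every reset, while the patched line creates no TF ops at all. A minimal TF 1.x sketch of the difference, with a made-up `shapes` list standing in for `self.compute_gradients`:

```python
import numpy as np
import tensorflow as tf

shapes = [(32, 32, 3), (10,)]  # made-up stand-ins for the gradient shapes

sess = tf.Session()

def reset_gradients_leaky():
    # Every call adds fresh tf.zeros nodes to the default graph,
    # so the graph (and the memory backing it) grows without bound.
    return [tf.zeros(s).eval(session=sess) for s in shapes]

def reset_gradients_fixed():
    # Plain numpy zeros: no new graph nodes are created.
    return [np.zeros(s, dtype=np.float32) for s in shapes]

for _ in range(3):
    reset_gradients_leaky()
    print(len(tf.get_default_graph().get_operations()))  # keeps rising

for _ in range(3):
    reset_gradients_fixed()
    print(len(tf.get_default_graph().get_operations()))  # stays flat
```

Whether you reset with 0.0 scalars as in my patch or with numpy arrays, the point is the same: don't create new TensorFlow ops inside a function that runs every epoch.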
email: younfor@yeah.net
