Description
The second stage of training, at 768x768 resolution, fails with the following error:
F0903 14:31:26.106397 92421 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7fd08aa5c5cd google::LogMessage::Fail()
@ 0x7fd08aa5e433 google::LogMessage::SendToLog()
@ 0x7fd08aa5c15b google::LogMessage::Flush()
@ 0x7fd08aa5ee1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7fd08b2290e0 caffe::SyncedMemory::to_gpu()
@ 0x7fd08b2280a9 caffe::SyncedMemory::mutable_gpu_data()
@ 0x7fd08b390282 caffe::Blob<>::mutable_gpu_data()
@ 0x7fd08b363928 caffe::BaseConvolutionLayer<>::forward_gpu_gemm()
@ 0x7fd08b3eb296 caffe::ConvolutionLayer<>::Forward_gpu()
@ 0x7fd08b1f15f2 caffe::Net<>::ForwardFromTo()
@ 0x7fd08b1f1717 caffe::Net<>::Forward()
@ 0x7fd08b3a6eca caffe::Solver<>::Solve()
@ 0x7fd08b226604 caffe::P2PSync<>::Run()
@ 0x40ada0 train()
@ 0x407590 main
@ 0x7fd0899cc830 __libc_start_main
@ 0x407db9 _start
@ (nil) (unknown)
Aborted (core dumped)
Has anyone come across this error and found a fix for it?