Description
The data and command are the same as in the tutorial, but running anisotropy correction on the half-maps fails with the following error:
spisonet.py reconstruct emd_8731_half_map_1.mrc emd_8731_half_map_2.mrc --aniso_file FSC3D.mrc --mask emd_8731_msk_1.mrc --limit_res 3.5 --epochs 30 --alpha 1 --beta 0.5 --output_dir isonet_maps --gpuID 0,1,2,3 --acc_batches 2
11-20 00:52:42, INFO voxel_size 1.309999942779541
11-20 00:52:43, INFO spIsoNet correction until resolution 3.5A!
Information beyond 3.5A remains unchanged
11-20 00:52:57, INFO Start preparing subvolumes!
11-20 00:53:06, INFO Done preparing subvolumes!
11-20 00:53:06, INFO Start training!
11-20 00:53:09, INFO Port number: 51405
learning rate 0.0003
['isonet_maps/emd_8731_half_map_1_data', 'isonet_maps/emd_8731_half_map_2_data']
0%| | 0/125 [00:00<?, ?batch/s][rank1]:W1120 00:53:26.205000 139648681686848 torch/_logging/_internal.py:1034] [0/0] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank3]:W1120 00:53:26.225000 140587963545408 torch/_logging/_internal.py:1034] [0/0] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank0]:W1120 00:53:26.263000 140600396216128 torch/_logging/_internal.py:1034] [0/0] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
[rank2]:W1120 00:53:26.357000 139692869912384 torch/_logging/_internal.py:1034] [0/0] Profiler function <class 'torch.autograd.profiler.record_function'> will be ignored
/tmp/tmpb9ffwjno/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmpl5yntb4i/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmph2p9sytq/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmpbvg8egds/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmph4ckum7q/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmpmj2mj6b6/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmpowqgpc9/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmp_vj7apqe/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmptthkhvx/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmpjo5sie2e/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmpqvdup2d8/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
/tmp/tmp8xzss5bc/main.c:6:23: fatal error: stdatomic.h: No such file or directory
#include <stdatomic.h>
^
compilation terminated.
0%| | 0/125 [00:08<?, ?batch/s]
W1120 00:53:33.367000 139899853596480 torch/multiprocessing/spawn.py:146] Terminating process 40535 via signal SIGTERM
W1120 00:53:33.367000 139899853596480 torch/multiprocessing/spawn.py:146] Terminating process 47366 via signal SIGTERM
W1120 00:53:33.368000 139899853596480 torch/multiprocessing/spawn.py:146] Terminating process 47503 via signal SIGTERM
Traceback (most recent call last):
File "/spshared/apps/miniconda3/envs/spisonet/bin/spisonet.py", line 8, in <module>
sys.exit(main())
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/spIsoNet/bin/spisonet.py", line 549, in main
fire.Fire(ISONET)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/fire/core.py", line 143, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/fire/core.py", line 477, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/fire/core.py", line 693, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/spIsoNet/bin/spisonet.py", line 182, in reconstruct
map_refine_n2n(halfmap1,halfmap2, mask_vol, fsc3d, alpha = alpha,beta=beta, voxel_size=voxel_size, output_dir=output_dir,
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/spIsoNet/bin/map_refine.py", line 145, in map_refine_n2n
network.train([data_dir_1,data_dir_2], output_dir, alpha=alpha,beta=beta, output_base=output_base0, batch_size=batch_size, epochs = epochs, steps_per_epoch = 1000,
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/spIsoNet/models/network_n2n.py", line 265, in train
mp.spawn(ddp_train, args=(self.world_size, self.port_number, self.model,alpha,beta,
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 282, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method="spawn")
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 238, in start_processes
while not context.join():
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 189, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/multiprocessing/spawn.py", line 76, in _wrap
fn(i, *args)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/spIsoNet/models/network_n2n.py", line 116, in ddp_train
preds = model(x1)# + noise.cuda())
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 433, in _fn
return fn(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 38, in inner
return fn(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1636, in forward
else self._run_ddp_forward(*inputs, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1454, in _run_ddp_forward
return self.module(*inputs, **kwargs) # type: ignore[index]
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/spIsoNet/models/unet.py", line 97, in forward
x, down_sampling_features = self.encoder(x)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/spIsoNet/models/unet.py", line 98, in torch_dynamo_resume_in_forward_at_97
x = self.decoder(x, down_sampling_features)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1110, in __call__
return hijacked_callback(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 948, in __call__
result = self._inner_convert(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 472, in __call__
return _compile(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_utils_internal.py", line 84, in wrapper_function
return StrobelightCompileTimeProfiler.profile_compile_time(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_strobelight/compile_time_profiler.py", line 129, in profile_compile_time
return func(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 817, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 231, in time_wrapper
r = func(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 636, in compile_inner
out_code = transform_code_object(code, transform)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1185, in transform_code_object
transformations(instructions, code_options)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 178, in _fn
return fn(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 582, in transform
tracer.run()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2451, in run
super().run()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 893, in run
while self.step():
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 805, in step
self.dispatch_table[inst.opcode](self, inst)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2642, in RETURN_VALUE
self._return(inst)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2627, in _return
self.output.compile_subgraph(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 1098, in compile_subgraph
self.compile_and_call_fx_graph(tx, list(reversed(stack_values)), root)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 1318, in compile_and_call_fx_graph
compiled_fn = self.call_user_compiler(gm)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 231, in time_wrapper
r = func(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 1409, in call_user_compiler
raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 1390, in call_user_compiler
compiled_fn = compiler_fn(gm, self.example_inputs())
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/backends/distributed.py", line 565, in compile_fn
return self.backend_compile_fn(gm, example_inputs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/repro/after_dynamo.py", line 129, in __call__
compiled_gm = compiler_fn(gm, example_inputs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/__init__.py", line 1951, in __call__
return compile_fx(model, inputs, config_patches=self.config)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1505, in compile_fx
return aot_autograd(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/backends/common.py", line 69, in __call__
cg = aot_module_simplified(gm, example_inputs, **self.kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 954, in aot_module_simplified
compiled_fn, _ = create_aot_dispatcher_function(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 231, in time_wrapper
r = func(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 687, in create_aot_dispatcher_function
compiled_fn, fw_metadata = compiler_fn(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 461, in aot_dispatch_autograd
compiled_fw_func = aot_config.fw_compiler(fw_module, adjusted_flat_args)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 231, in time_wrapper
r = func(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1410, in fw_compiler_base
return inner_compile(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/repro/after_aot.py", line 84, in debug_wrapper
inner_compiled_fn = compiler_fn(gm, example_inputs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/debug.py", line 304, in inner
return fn(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 231, in time_wrapper
r = func(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 527, in compile_fx_inner
compiled_graph = fx_codegen_and_compile(
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 831, in fx_codegen_and_compile
compiled_fn = graph.compile_to_fn()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1751, in compile_to_fn
return self.compile_to_module().call
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 231, in time_wrapper
r = func(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1680, in compile_to_module
self.codegen_with_cpp_wrapper() if self.cpp_wrapper else self.codegen()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1640, in codegen
self.scheduler.codegen()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 231, in time_wrapper
r = func(*args, **kwargs)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/scheduler.py", line 2741, in codegen
self.get_backend(device).codegen_node(node)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/codegen/cuda_combined_scheduling.py", line 69, in codegen_node
return self._triton_scheduling.codegen_node(node)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/codegen/simd.py", line 1148, in codegen_node
return self.codegen_node_schedule(node_schedule, buf_accesses, numel, rnumel)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/codegen/simd.py", line 1317, in codegen_node_schedule
src_code = kernel.codegen_kernel()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/codegen/triton.py", line 2159, in codegen_kernel
**self.inductor_meta_common(),
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/_inductor/codegen/triton.py", line 2047, in inductor_meta_common
"backend_hash": torch.utils._triton.triton_hash_with_backend(),
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/utils/_triton.py", line 63, in triton_hash_with_backend
backend = triton_backend()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/torch/utils/_triton.py", line 49, in triton_backend
target = driver.active.get_current_target()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/runtime/driver.py", line 23, in __getattr__
self._initialize_obj()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/runtime/driver.py", line 9, in _create_driver
return actives[0]()
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 371, in __init__
self.utils = CudaUtils() # TODO: make static
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 80, in __init__
mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 57, in compile_module_from_src
so = build(name, src_path, tmpdir, library_dirs(), include_dir, libraries)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/runtime/build.py", line 48, in build
ret = subprocess.check_call(cc_cmd)
File "/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/subprocess.py", line 369, in check_call
raise CalledProcessError(retcode, cmd)
torch._dynamo.exc.BackendCompilerFailed: backend='compile_fn' raised:
CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmptthkhvx/main.c', '-O3', '-shared', '-fPIC', '-o', '/tmp/tmptthkhvx/cuda_utils.cpython-310-x86_64-linux-gnu.so', '-lcuda', '-L/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/backends/nvidia/lib', '-L/lib64', '-I/spshared/apps/miniconda3/envs/spisonet/lib/python3.10/site-packages/triton/backends/nvidia/include', '-I/tmp/tmptthkhvx', '-I/spshared/apps/miniconda3/envs/spisonet/include/python3.10']' returned non-zero exit status 1.
Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
You can suppress this exception and fall back to eager by setting:
import torch._dynamo
torch._dynamo.config.suppress_errors = True
Thanks for your help!
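
The repeated `stdatomic.h: No such file or directory` lines come from `/usr/bin/gcc`, which Triton calls to build its `cuda_utils` helper. `stdatomic.h` is a C11 header that GCC has shipped only since version 4.9, so an older system compiler is a likely cause, though the log alone does not confirm it. The following is a minimal diagnostic sketch, not part of spIsoNet or Triton. The names `PROBE_SRC`, `find_compiler`, and `supports_stdatomic` are hypothetical, and the lookup order (`CC`, then `gcc`, then `clang`) is an assumption about how the compiler is chosen.

```python
import os
import shutil
import subprocess
import tempfile

# Tiny C program that needs only the C11 header named in the log.
PROBE_SRC = (
    "#include <stdatomic.h>\n"
    "int main(void) { atomic_int x = 0; (void)x; return 0; }\n"
)


def find_compiler():
    """Return the compiler to probe: $CC if set, else gcc, else clang.

    The lookup order is an assumption made for this sketch.
    """
    return os.environ.get("CC") or shutil.which("gcc") or shutil.which("clang")


def supports_stdatomic(cc):
    """Try to parse PROBE_SRC with `cc`; return (ok, compiler message)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "probe.c")
        with open(src, "w") as fh:
            fh.write(PROBE_SRC)
        try:
            # -fsyntax-only stops after parsing, so nothing is linked.
            proc = subprocess.run(
                [cc, "-fsyntax-only", src],
                capture_output=True,
                text=True,
            )
        except OSError as exc:
            # The compiler could not be started at all.
            return False, str(exc)
        return proc.returncode == 0, proc.stderr.strip()


if __name__ == "__main__":
    cc = find_compiler()
    if cc is None:
        print("no C compiler found")
    else:
        ok, message = supports_stdatomic(cc)
        print(f"{cc}: stdatomic.h {'found' if ok else 'not found'}")
        if not ok:
            print(message)
```

If the probe fails for `/usr/bin/gcc` but succeeds for a newer compiler, one option to try is running with `CC` set to that newer compiler. The other is the eager fallback that the error message itself suggests.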