Hello! After the backbone is unfrozen, the following happens from the start of epoch 11:
[2025-04-02 20:17:47,978-rk0-train.py#247] epoch: 11
[2025-04-02 20:17:47,978-rk0-train.py#251] epoch 11 lr 4.826744322208125e-06
[2025-04-02 20:17:47,978-rk0-train.py#251] epoch 11 lr 0.0004826744322208124
[2025-04-02 20:17:47,979-rk0-train.py#251] epoch 11 lr 0.0004826744322208124
[2025-04-02 20:17:47,979-rk0-train.py#251] epoch 11 lr 0.0004826744322208124
[2025-04-02 20:18:11,803-rk0-train.py#300] Epoch: [11][20/642] lr: 0.000483
batch_time: 0.543413 (1.253038) data_time: 0.082757 (0.569095)
cls_loss: 1581062.750000 (89252849.141115) loc_loss: 0.999936 (0.962544)
total_loss: 1581063.750000 (89252850.067277)
[2025-04-02 20:18:11,803-rk0-log_helper.py#105] Progress: 6440 / 12840 [50%], Speed: 1.253 s/iter, ETA 0:02:13 (D:H:M)
[2025-04-02 20:18:34,711-rk0-train.py#300] Epoch: [11][40/642] lr: 0.000483
batch_time: 0.464469 (1.305685) data_time: 0.000000 (0.521625)
cls_loss: 4303667200.000000 (219855600.983749) loc_loss: 0.998354 (0.957559)
total_loss: 4303667200.000000 (219855601.863008)
[2025-04-02 20:18:34,711-rk0-log_helper.py#105] Progress: 6460 / 12840 [50%], Speed: 1.306 s/iter, ETA 0:02:18 (D:H:M)
[2025-04-02 20:18:59,179-rk0-train.py#300] Epoch: [11][60/642] lr: 0.000483
batch_time: 0.462952 (1.358881) data_time: 0.000000 (0.453902)
cls_loss: 1192478.000000 (530384016.806605) loc_loss: 0.999114 (0.956494)
total_loss: 1192479.000000 (530384017.654972)
[2025-04-02 20:18:59,180-rk0-log_helper.py#105] Progress: 6480 / 12840 [50%], Speed: 1.359 s/iter, ETA 0:02:24 (D:H:M)
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
How should I resolve this?