[BUG] Watchdog reset with DWIN interface - SD card issue #28064

@TheAndr0id

Did you test the latest bugfix-2.1.x code?

Yes, and the problem still exists.

Bug Description

This is connected with issue #28003 which of course I closed yesterday.
(I found the trigger by accident about 4 hours after I closed that ticket...)

I'll include the details of what I did and how to replicate the issue below, but basically this is due to how SD cards are read and indexed: the process takes too long and lets the watchdog timer expire.

Bug Timeline

Old

Expected behavior

Not to reboot the printer continuously.

Actual behavior

From the serial port:

Marlin 2.1.2.5
echo: Last Updated: 2024-11-18 | Author: (none, default config)
echo: Compiled: Sep 20 2025
echo: Free Memory: 51391  PlannerBufferBytes: 1792
echo:SD card ok
Unified Bed Leveling System v1.01 inactive
BL24CXX Check succeeded!
start
 Watchdog Reset
Marlin 2.1.2.5
echo: Last Updated: 2024-11-18 | Author: (none, default config)
echo: Compiled: Sep 20 2025
echo: Free Memory: 51391  PlannerBufferBytes: 1792
echo:SD card ok
Unified Bed Leveling System v1.01 inactive
BL24CXX Check succeeded!
start
 Watchdog Reset
.
.
.

Steps to Reproduce

Reproduction is going to be difficult unless you have a 20+ year old SD card or a LOT of patience.

See below for details.

Version of Marlin Firmware

2.1.2.5

Printer model

Ender 3v2

Electronics

Stock

LCD/Controller

DWIN

Other add-ons

none

Bed Leveling

UBL Bilinear mesh

Your Slicer

Prusa Slicer

Host Software

None

Don't forget to include

  • A ZIP file containing your Configuration.h and Configuration_adv.h.

Additional information & file uploads

First, my config files:

Configurations.zip

Second, the TL;DR:

A slow SD card causes a long delay in the sorting process, which lets the watchdog timer expire. watchdog_refresh() calls need to be added to the CardReader::presort() function in cardreader.cpp, and watchdog_refresh() use in general should be reviewed.

The long story:

Yesterday I decided it was time to add input shaping. Everything was working well, so let's break sh!t. Enabled the defines, built the firmware, flashed it to the printer. Reset my EEPROM settings, saved, rebooted the printer and confirmed the settings were good.

Went and had supper, came back two hours later, hooked up my SD card reader and rebooted, ready to start running input shaping calibration tests.

Endless watchdog reset.

Bad language occurred.

When I relaxed, I started to backtrack.

Unplugged my SD card to microSD card adapter and rebooted. No reset. Maybe I damaged the adapter? Plugged it back in and removed the SD card. No reset. Inserted the SD card, reads fine. Rebooted, watchdog reset.

Removed the Input Shaping and re-flashed that firmware. No watchdog reset, SD card in or out. Ran the test through bugfix-2.1.x and it doesn't reset... Ok diff, what's changed? (Turns out a lot. Should bugfix have new features?)

I tested almost all of the settings around the SD card, and I've come to some conclusions.

  1. The SD card I'm using is a 1G Sandisk UII, circa early 2000s. Yes, it's over 20 years old and working fine. (and I have a hot spare for when it dies! 😄) This is not a fast SD card by any current standard, but it's more than fast enough for my Ender 3v2.
  2. The Input Shaping in bugfix has significant enhancements and optimizations from 2.1.2.5. I didn't dig into this, but I'm guessing it improves some initialization routines that happen during startup.
  3. The watchdog reset doesn't happen every time (around 90%). I believe it is because the initialization of input shaping in 2.1.2.5 is slower and thus the SD card file read and update is taking just a little more time than the watchdog allows. I don't believe this is an Input Shaping issue.
  4. By default POWER_LOSS_RECOVERY is enabled for the Ender 3v2 configuration. First off, this feature is horribly broken and it can't work due to its design (that's a rant for another day). This feature forces the SD card to be mounted at boot. The watchdog is started after that mount, so it shouldn't be an issue, but it turns out it is.
  5. One of the last startup actions is to call DWIN_InitScreen() which through three or four different function calls eventually calls the presort() function.
  6. The CardReader::presort() function uses a bubble sort (really?) to load and, well, sort the file names on the SD card. If it has already done that (which POWER_LOSS_RECOVERY causes), it first clears the file list from RAM, then goes through the SD card's FAT to figure out what's there and recreates the sorted list. The more files on the SD card, the longer this process will take (and the longer it will take to clear the list from the first pass). In my case, it looks like the flush_presort() call is the straw that breaks my printer. If the SD card is not mounted at boot, the watchdog reset doesn't trigger, but I'm certain that if I add more files to my SD card, it will.
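
To put a rough shape on "more files, longer sort": the sketch below counts the comparisons made by a plain bubble sort with no early exit. It is not Marlin's presort() code, which may differ (for example by stopping early on an already sorted list). The names bubble_sort_compares and make_names are made up for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Plain bubble sort without early exit; returns how many comparisons it made.
// Assumption: on the printer, each comparison may involve slow card access.
unsigned long bubble_sort_compares(std::vector<std::string>& names) {
    unsigned long compares = 0;
    const std::size_t n = names.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        for (std::size_t j = 0; j + 1 < n - i; ++j) {
            ++compares;
            if (names[j] > names[j + 1]) std::swap(names[j], names[j + 1]);
        }
    }
    assert(std::is_sorted(names.begin(), names.end()));  // postcondition check
    return compares;
}

// Hypothetical helper: builds `count` file names in descending numeric order.
std::vector<std::string> make_names(std::size_t count) {
    std::vector<std::string> names;
    for (std::size_t i = count; i > 0; --i)
        names.push_back("FILE" + std::to_string(i) + ".GCO");
    return names;
}
```

For n names this loop always makes n(n-1)/2 comparisons: 4950 for 100 files and 19900 for 200. Doubling the file count roughly quadruples the work, so a card that is fine with a few dozen files can tip over the watchdog limit with not many more.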

Here are my thoughts on possible fixes:

  • Remove the sort from the POWER_LOSS_RECOVERY process. It's only looking for a single file to know it needs to restart and it doesn't need to sort the file list to find one file. Unfortunately this is just putting lipstick on a pig. Power loss recovery doesn't work, causes print issues and is just a bad implementation that shouldn't have any more time invested in it. I have now disabled this feature in my firmware.
  • Speed up the sort. I do not know whether it is available in the PlatformIO environment, but quicksort (qsort()) has been a standard C function for decades. It works, and is usually faster than bubble sort. This is just a band-aid solution, though, as any sort's run time is a function of the amount of data: the more files you add, the longer the sort will take, no matter which sort is used.
  • The watchdog is an essential service, but it's not being refreshed during this work. Currently there are a couple of calls in various UIs (ProUI, extui) and in the temperature routines. This is good, but it also needs to be refreshed in long-running loops. There's no way to determine how long a file listing will take to sort, as the number of files is always unknown. The watchdog needs a refresh here (and in any other long-running processes).
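
A minimal host-side sketch of the third option, refreshing the watchdog inside the long-running loop. FakeWatchdog, tick(), refresh() and sort_names() are invented for this illustration and are not Marlin's API. The real change would call the firmware's own watchdog refresh from inside CardReader::presort().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a hardware watchdog: it "expires" once more than
// `timeout` ticks of work pass without a refresh. A real watchdog would
// reboot the board at that point; here we only record that it happened.
struct FakeWatchdog {
    unsigned timeout;
    unsigned elapsed = 0;
    bool expired = false;
    explicit FakeWatchdog(unsigned t) : timeout(t) { assert(t > 0); }
    void tick() { if (++elapsed > timeout) expired = true; }
    void refresh() { elapsed = 0; }
};

// Bubble sort in which every comparison costs one tick, standing in for slow
// SD card access. With `refresh_in_loop` set, the watchdog is refreshed once
// per outer pass. Returns true if the watchdog never expired.
bool sort_names(std::vector<std::string>& names, FakeWatchdog& wd,
                bool refresh_in_loop) {
    const std::size_t n = names.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        if (refresh_in_loop) wd.refresh();
        for (std::size_t j = 0; j + 1 < n - i; ++j) {
            wd.tick();
            if (names[j] > names[j + 1]) std::swap(names[j], names[j + 1]);
        }
    }
    return !wd.expired;
}
```

With 50 names and a timeout of 100 ticks, the unrefreshed sort runs past the limit, while the refreshed one never accumulates more than 49 ticks between refreshes. Refreshing once per outer pass only helps while a single pass fits inside the timeout. For very large listings the refresh would have to move into the inner loop.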

Yes, my SD card is old and slow (but not yet busted). But in this case it has identified a critical failure. In the previous issue (#28003) it caused that reset because I had a large number of files on the SD card. My work process is/was: when a print job ends, I can reach the SD card slot from my chair, but not the control knob. I'd pull out the SD card, add another file and put it back in. When ready to start the print, I'd get up, go over and select "confirm" on the print complete screen. This would then reload the file list and overflow the watchdog. A few weeks back I cleaned up the old files off my SD card. The problem wasn't the print complete process, it was the number of files on the SD card. (We are talking under 100 files on the SD card - it is only 1G...)

This problem will surface with a newer SD card. It might take hundreds or thousands of files on that card, but it will happen. Since newer SD cards are at least 16G, this is a real possibility.
(I know I'd be tempted to have a "GCode Archive" available on my printer...)
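
On the qsort() suggestion from the list of fixes: the sketch below shows one way to call the standard std::qsort on an array of file name pointers and count how often the comparator runs. The counter and the function names are made up for this example, and whether this fits Marlin's memory constraints and SD sorting options is an open question. It only shows that the call itself is simple and that the comparison count can be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Global counter so the C-style comparator can record how often it is called.
static unsigned long g_compares = 0;

// Comparator in the form std::qsort expects, for arrays of `const char*`.
static int compare_names(const void* a, const void* b) {
    ++g_compares;
    const char* lhs = *static_cast<const char* const*>(a);
    const char* rhs = *static_cast<const char* const*>(b);
    return std::strcmp(lhs, rhs);
}

// Sorts `names` in place with std::qsort and returns the comparison count.
unsigned long sort_and_count(std::vector<const char*>& names) {
    g_compares = 0;
    if (names.empty()) return 0;  // nothing to sort
    std::qsort(names.data(), names.size(), sizeof(const char*), compare_names);
    for (std::size_t i = 1; i < names.size(); ++i)
        assert(std::strcmp(names[i - 1], names[i]) <= 0);  // postcondition check
    return g_compares;
}
```

Even with a faster algorithm the count still grows with the number of files, so the comparator (or the loop that feeds it) would remain a reasonable place for a watchdog refresh.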
