[BUG] dashboard occasionally unresponsive  #2586

@dirkpetersen

Within 5-6 days of operation, my dashboard (2.4, installed with nvflare dashboard --cloud aws, Docker install) became unresponsive (HTTPS connection timeout). Unfortunately, I could not find anything in the logs, so I went ahead and created a monitoring script that restarts the container. Not ideal, but it works:

https://github.com/dirkpetersen/nvflare-cancer#about-2-restart-on-error

#!/bin/bash

url="https://myproject.mydomain.edu"   # Set the website URL
search_string='name="viewport"'        # Set the search string in the HTML source
timeout_duration=15                    # Set the timeout duration in seconds
date=$(date)                           # Get the current date for logging

# Check if the search string exists in the HTML source code
if curl -k -s -m "$timeout_duration" "$url" | grep -q "$search_string"; then
    echo "${date}: OK ! The search string '$search_string' was found in the HTML source code of $url"
else
    echo "${date}: Error ! The search string '$search_string' was not found in the HTML source code of $url or the connection timed out after $timeout_duration seconds"
    # Run the commands if the search string is not found or the connection times out
    echo "${date}: Restarting NVFlare dashboard"
    "$HOME/.local/bin/nvflare" dashboard --stop
    sleep 3
    "$HOME/.local/bin/nvflare" dashboard --start -f ~
fi
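Since a single failed request may just be a transient network hiccup, one possible refinement is to probe a few times before restarting. The following is only a sketch: `retry_check` and `probe_dashboard` are hypothetical helper names (not part of NVFlare), and the attempt count and delay in the usage comment are arbitrary assumptions.

```shell
#!/bin/bash

# Hypothetical helper: run a check command up to <attempts> times,
# sleeping <delay> seconds between tries. Returns 0 on the first success,
# 1 if every attempt fails.
retry_check() {
    local attempts=$1
    local delay=$2
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # No need to wait after the final attempt
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Same probe as in the script above, wrapped in a function.
# Relies on $url, $search_string and $timeout_duration being set.
probe_dashboard() {
    curl -k -s -m "$timeout_duration" "$url" | grep -q "$search_string"
}

# Usage inside the monitoring script (values are assumptions):
# if ! retry_check 3 10 probe_dashboard; then
#     "$HOME/.local/bin/nvflare" dashboard --stop
#     sleep 3
#     "$HOME/.local/bin/nvflare" dashboard --start -f ~
# fi
```

With this variant the dashboard would only be restarted after three consecutive failed probes, which should reduce unnecessary restarts at the cost of a slightly longer detection time.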

I then added an hourly cron job (it runs at minute 59 of every hour):

(crontab -l 2>/dev/null; echo "59 * * * * \$HOME/monitor.sh >> /var/tmp/nvflare-monitor.log 2>&1") | crontab -
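Note that the one-liner above appends the entry every time it is run, so running it twice leaves two identical jobs. A minimal sketch of an idempotent variant follows; `add_cron_line` is a hypothetical helper name, and it assumes a cron implementation where `crontab -` installs a crontab from standard input.

```shell
#!/bin/bash

# Hypothetical helper: read an existing crontab on stdin, drop any line that
# exactly matches the new entry, then print the entry once at the end.
add_cron_line() {
    # -F: fixed string, -x: whole-line match, -v: keep non-matching lines
    grep -v -F -x -- "$1"
    printf '%s\n' "$1"
}

# Usage (commented out so the snippet does not touch the real crontab):
# entry='59 * * * * $HOME/monitor.sh >> /var/tmp/nvflare-monitor.log 2>&1'
# crontab -l 2>/dev/null | add_cron_line "$entry" | crontab -
```

Because the entry is single-quoted in the usage example, `$HOME` is written literally into the crontab and expanded by cron at run time, as in the original command.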
