
Enhanced Initialization, Logging, and Diagnostics for cBioPortal Docker Compose Setup #47

Open. @Vaibhav701161 wants to merge 2 commits into base: master

@Vaibhav701161 Vaibhav701161 commented Mar 9, 2025

fixes #46
Summary of Changes:

  1. Improved Error Handling:

    • Added set -eo pipefail to all scripts to ensure termination on errors.
    • Included checks for missing files, directories, or environment variables (e.g., .env, application.properties, cgds.sql, etc.).
    • Clear error messages are displayed for failures (e.g., missing init.sh scripts, failed downloads).
  2. Enhanced Logging:

    • Introduced concise and structured log messages using [INFO], [SUCCESS], and [ERROR].
    • Added timestamps to logs for better traceability.
    • Created a debug log (debug.log) to capture detailed execution traces when needed.
  3. Optimized File Operations:

    • Implemented checks to skip redundant operations (e.g., downloading or extracting files that already exist).
    • Used temporary .env files to safely handle special characters.
  4. Cross-Platform Compatibility:

    • Added CRLF-to-LF conversion for scripts using dos2unix in the diagnostic script (debug_env.sh).
    • Ensured proper file permissions for all .sh scripts.
  5. New Diagnostic Script (debug_env.sh):

    • Checks file permissions and line endings.
    • Executes the main initialization script (./init.sh) and captures its output in a debug log.
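
The error-handling and logging changes summarized above could look roughly like the following. This is a minimal sketch, not the code from this PR: the helper names (log, info, success, error, require_file) and the timestamp format are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; helper names are hypothetical, not taken from the PR.
# -e: exit on the first failing command; pipefail: a pipeline fails if any stage fails.
set -eo pipefail

# Print a timestamped, tagged message, e.g. "2025-03-09 11:25:53 [INFO] text".
log() {
  local level="$1"; shift
  printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*"
}
info()    { log INFO "$@"; }
success() { log SUCCESS "$@"; }
error()   { log ERROR "$@" >&2; }   # errors go to stderr

# Return non-zero with a clear message when a required file is missing.
require_file() {
  if [ ! -f "$1" ]; then
    error "Required file not found: $1"
    return 1
  fi
}
```

A script built on these helpers might call `require_file .env` before reading any variables, so that a missing file stops the run with an [ERROR] line instead of a later, less obvious failure.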

Testing Instructions for Reviewers:

  1. Setup Environment:

    • Ensure the .env file exists in the root directory with valid entries for required variables:
      DOCKER_IMAGE_CBIOPORTAL=cbioportal/cbioportal:6.0.24
      DB_MYSQL_USERNAME=root
      DB_MYSQL_PASSWORD=password
      DB_MYSQL_URL=jdbc:mysql://localhost:3306/cbioportal
      
    • Install required tools:
      sudo apt update && sudo apt install wget dos2unix docker.io
      
  2. Run Diagnostic Script:

    chmod +x debug_env.sh
    ./debug_env.sh
    • Verify that all checks pass successfully.
    • Check the generated debug.log for any warnings or errors.
  3. Test Main Initialization Script (./init.sh):

    chmod +x init.sh
    ./init.sh
    • Ensure all subdirectory scripts (config/init.sh, study/init.sh, data/init.sh) execute without errors.
    • Verify that the following files are created:
      • config/application.properties
      • data/cgds.sql
      • data/seed.sql.gz
      • Extracted directories in study/ (e.g., lgg_ucsf_2014, msk_impact_2017).
  4. Simulate Failure Scenarios:

    • Missing .env file:
      mv .env .env.bak
      ./init.sh || echo "Expected failure due to missing .env"
      mv .env.bak .env
    • Network issues:
      sudo iptables -A OUTPUT -p tcp --dport 443 -j DROP
      ./init.sh || echo "Expected failure due to network issues"
      sudo iptables -D OUTPUT -p tcp --dport 443 -j DROP
  5. Verify Logs:

    • Check logs generated by each script (e.g., debug.log, timestamped logs).
    • Confirm that every expected success message is present and that no unexpected errors occurred.
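
The missing-.env scenario in step 4 can be wrapped in a small helper so the file is restored as soon as the command returns and the outcome is checked automatically. This is a sketch under assumptions: the function name expect_failure_without_env is hypothetical, and it does not restore the file if the run is interrupted partway through.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the helper name is hypothetical.
set -eo pipefail

# Hide the given env file, run a command that is expected to fail,
# restore the file, and report whether the command really failed.
expect_failure_without_env() {
  local env_file="$1"; shift
  local backup="${env_file}.bak"
  local status=0
  mv "$env_file" "$backup"
  # Capture the exit status without letting set -e abort this function.
  "$@" || status=$?
  # Restore the env file once the command has returned.
  mv "$backup" "$env_file"
  if [ "$status" -eq 0 ]; then
    echo "[ERROR] Command succeeded even though $env_file was missing" >&2
    return 1
  fi
  echo "[SUCCESS] Command failed as expected (exit $status)"
}
```

For the scenario above, the call would be `expect_failure_without_env .env ./init.sh`.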

Screenshots/Logs Showing Successful Execution

1. Folder Structure After Execution

[Screenshot: folder structure]

2. Diagnostic Script Output (debug_env.sh)

=== Starting Environment Diagnostics ===
## Platform Info ##
Linux LAPTOP-BE3HEI69 5.15.167.4-microsoft-standard-WSL2 #1 SMP Tue Nov 5 00:21:55 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

🔍 Checking file permissions...
✅ All scripts have execute permissions.

🔍 Checking line endings...
✅ All scripts have LF line endings.

⚙️ Running initialization script...
=== Initialization completed successfully at Sun Mar 9 11:28:28 UTC 2025 ===

✅ All diagnostics passed successfully.
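
The permission and line-ending checks reported above could be implemented along these lines. This is a sketch, not the contents of debug_env.sh: the function names has_crlf and check_scripts are assumptions, and it only reports CRLF files rather than converting them.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; function names are hypothetical.
set -eo pipefail

# Succeeds when the file contains at least one carriage return.
has_crlf() {
  grep -q $'\r' "$1"
}

# Check every *.sh file under a directory; return 1 if any problem is found.
check_scripts() {
  local dir="$1" problems=0 script
  while IFS= read -r -d '' script; do
    if [ ! -x "$script" ]; then
      echo "[ERROR] Not executable: $script" >&2
      problems=1
    fi
    if has_crlf "$script"; then
      echo "[ERROR] CRLF line endings: $script (dos2unix can convert it)" >&2
      problems=1
    fi
  done < <(find "$dir" -name '*.sh' -type f -print0)
  return "$problems"
}
```

Running `check_scripts .` from the repository root would scan init.sh and the scripts in the subdirectories in one pass.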

3. Main Initialization Script Output (./init.sh)

=== Starting initialization at Sun Mar 9 11:25:53 UTC 2025 ===

▶ Entering directory: config
⚙️ Running init.sh in config
✅ application.properties generated successfully.

▶ Entering directory: study
⚙️ Running init.sh in study
✅ Archive already exists: lgg_ucsf_2014.tar.gz
✅ Study directory already exists: lgg_ucsf_2014
✅ Archive already exists: msk_impact_2017.tar.gz
✅ Study directory already exists: msk_impact_2017
=== All studies processed successfully ===

▶ Entering directory: data
⚙️ Running init.sh in data
✅ Schema file (cgds.sql) fetched successfully.
✅ Seed database (seed.sql.gz) downloaded successfully.
=== Data initialization completed successfully ===

=== Initialization completed successfully at Sun Mar 9 11:28:28 UTC 2025 ===
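
The control flow visible in this output (enter each directory, run its init.sh, skip work that is already done) could be sketched as follows. The function names run_inits and extract_if_missing are hypothetical, and the sketch assumes each archive unpacks into a top-level directory named after the archive, as the log lines suggest.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; function names are hypothetical.
set -eo pipefail

# Extract an archive into dest_dir unless the study directory already exists.
# Assumes NAME.tar.gz unpacks into a directory called NAME.
extract_if_missing() {
  local archive="$1" dest_dir="$2" name
  name="$(basename "$archive" .tar.gz)"
  if [ -d "$dest_dir/$name" ]; then
    echo "Study directory already exists: $name"
    return 0
  fi
  tar -xzf "$archive" -C "$dest_dir"
  echo "Extracted: $name"
}

# Run init.sh inside each given directory, stopping at the first problem.
run_inits() {
  local dir
  for dir in "$@"; do
    if [ ! -f "$dir/init.sh" ]; then
      echo "[ERROR] Missing init.sh in $dir" >&2
      return 1
    fi
    echo "Entering directory: $dir"
    # The subshell keeps the cd from leaking into later iterations.
    (cd "$dir" && bash ./init.sh) || return 1
  done
}
```

With this layout the top-level script would reduce to something like `run_inits config study data`.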

4. Generated Files

  • config/application.properties
  • data/cgds.sql
  • data/seed.sql.gz
  • Extracted directories in study/:
    study/lgg_ucsf_2014/
    study/msk_impact_2017/
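
Reviewers who prefer not to check these paths by hand could use a small verification helper. This is a sketch under assumptions: the function name verify_outputs is hypothetical, and it only tests that each path exists, not that its contents are valid.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the helper name is hypothetical.
set -eo pipefail

# Check that each path exists below root; return 1 if any is missing.
verify_outputs() {
  local root="$1" missing=0 path
  shift
  for path in "$@"; do
    if [ -e "$root/$path" ]; then
      echo "[SUCCESS] Found $path"
    else
      echo "[ERROR] Missing $path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For the files listed above, the call would be `verify_outputs . config/application.properties data/cgds.sql data/seed.sql.gz study/lgg_ucsf_2014 study/msk_impact_2017`.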
    

@Vaibhav701161

Please review, @zainasir @inodb.
