Skip to content

Fix: Unsafe path handling and directory pollution in Time Series examples#2236

Closed
jaytiwarihub wants to merge 7 commits intokeras-team:masterfrom
jaytiwarihub:fix-timeseries-paths
Closed

Fix: Unsafe path handling and directory pollution in Time Series examples#2236
jaytiwarihub wants to merge 7 commits intokeras-team:masterfrom
jaytiwarihub:fix-timeseries-paths

Conversation

@jaytiwarihub
Copy link

Description

This PR fixes two path handling issues in the examples/timeseries/ folder that cause fragility on Windows and cache directory pollution.

Changes

  1. timeseries_weather_forecasting.py

    • Issue: The script was using ZipFile.extractall() without a path, dumping dataset files into the user's current working directory instead of the Keras cache. It also used a relative path ("jena_climate...") which fails if the script is run from a different directory.
    • Fix: Now extracts files directly to the ~/.keras/datasets/ cache directory using os.path.dirname(zip_path) and accesses them using an absolute path.
  2. timeseries_traffic_forecasting.py

    • Issue: The script used data_dir = data_dir.rstrip("PeMSD7_Full.zip"). This is unsafe because rstrip removes any character from the set, potentially mangling the directory path if it ends with characters like p, z, i, etc.
    • Fix: Replaced with os.path.dirname(zip_path) for robust, cross-platform parent directory resolution.

Test Plan

  • Verified that datasets download and extract correctly to the Keras cache.
  • Verified that file paths resolve correctly on Windows.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @jaytiwarihub, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness and reliability of several Keras examples by addressing critical path handling issues and updating outdated resource links. The primary focus is on preventing unintended directory pollution and ensuring cross-platform compatibility for dataset extraction in time series examples, alongside refreshing image sources in generative and vision examples.

Highlights

  • Robust Path Handling in Time Series Examples: Corrected timeseries_weather_forecasting.py to extract dataset files directly into the Keras cache using absolute paths, preventing directory pollution and ensuring cross-platform compatibility.
  • Safe Directory Resolution: Replaced the unsafe rstrip() method with os.path.dirname() in timeseries_traffic_forecasting.py for reliable and cross-platform directory path extraction.
  • Updated Example Image URLs: Refreshed image URLs in fine_tune_via_textual_inversion.py, neural_style_transfer.py, and grad_cam.py to ensure examples use active and relevant content.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses the unsafe path handling and directory pollution issues in the timeseries_weather_forecasting.py and timeseries_traffic_forecasting.py examples by using os.path.dirname for robust path resolution. The fixes are correct and improve the reliability of these examples.

The PR also includes changes in other examples (fine_tune_via_textual_inversion.py, neural_style_transfer.py, grad_cam.py) to update image URLs, presumably to fix broken links. While these changes are beneficial, they are outside the scope described in the PR title and description. It would be good practice to either mention them in the description or submit them as a separate PR.

I've found a few issues in fine_tune_via_textual_inversion.py related to duplicated and misplaced image URLs, and a minor cleanup suggestion for neural_style_transfer.py. Please see my detailed comments below.

@sachinprasadhs
Copy link
Collaborator

Thanks, the changes looks good, but let's not delete the files.
Like I have mentioned in the other PRs, you only make the changes to the .py files and for each file generate .ipynb and .md files by following the example here https://github.com/keras-team/keras-io/blob/master/README.md#previewing-a-new-example
Also, it's good practice to make the changes to the modified date field to let users know when the example was updated.

@sachinprasadhs
Copy link
Collaborator

Resolve the conflicts and apply the suggestion for your others PRs as well.

@jaytiwarihub
Copy link
Author

Closing this one out! .... The git history got a bit tangled here with some accidental deletions, so I'm moving to a fresh branch to keep things clean
Moving this work to PR #2272

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants