Skip to content

Add suggested paths to FileNotFound errors from C code #2558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 28 commits into from

Conversation

MyreMylar
Copy link
Member

Fixes #2485

Test code:

import pygame

cat_surf = pygame.image.load("imajes/kool_staff/gat.png")

Directory structure:
image

New Output:

Traceback (most recent call last):
  File "C:\Users\dan\Programming\scrap\load_cat_png.py", line 2, in <module>
    cat_surf = pygame.image.load("imajes/kool_staff/gat.png")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: File not found at path: 'imajes/kool_staff/gat.png', did you mean: 'images/cool_stuff/cat.png'?

@MyreMylar MyreMylar marked this pull request as ready for review November 11, 2023 15:23
@MyreMylar MyreMylar requested a review from a team as a code owner November 11, 2023 15:23
@Starbuck5
Copy link
Member

I haven't looked this over in detail yet but I notice this PR removes the "not found in working directory xyz" message, which is something I'm fond of (especially since I added it).

@MyreMylar
Copy link
Member Author

I haven't looked this over in detail yet but I notice this PR removes the "not found in working directory xyz" message, which is something I'm fond of (especially since I added it).

I think adding that back in to the relative path will be no problem :)

@MyreMylar MyreMylar added the rwops SDL's IO loading/streaming code label Nov 16, 2023
Copy link
Member

@zoldalma999 zoldalma999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to look at this pr for a bit now, finally had the time.

Since there were a lot of PyObject_CallMethod (and similar) calls, decided to rewrite this code in python to see if the logic is alright, and even that python code is fairly long. Do you think we could (/should) just do this in python instead of c?

Also, is there a reason the suggest_valid_path function raises errors? We seemingly just ignore that everywhere it is used and set something generic.

Code
import difflib
import os


def add_to_path(existing_path, new_part):
    # handles joining paths chunks which may include windows drive letters,
    # windows drive letters require special handling to work correctly.
    temp_path = None

    if len(existing_path) > 0:
        if existing_path[-1] == ":":
            # handle windows drive letters
            temp_path = os.path.join(existing_path, "\\", new_part)
    else:
        # adding first item to a path, need to force adding a slash if
        # it is not windows drive letter
        if existing_path[-1] != ":":
            # not a windows drive letter, so add a slash at the start
            temp_path = os.path.join(existing_path, os.path.sep, new_part)

    if temp_path is None:
        temp_path = os.path.join(existing_path, new_part)

    return temp_path


def get_contents_at_path(path_to_check):
    # handles obtaining the contents of a directory.
    # windows drive letters require special handling to work correctly.
    directories_found = None

    if len(path_to_check) > 0:
        if path_to_check[-1] == ":":
            # handle Windows drive letters
            path_to_check = os.path.join(path_to_check, "\\")

        directories_found = os.listdir(path_to_check)

    return directories_found


def suggest_valid_path(original_path, starting_path):
    # To suggest a valid path from an invalid one we first normalise & split
    # the provided path into components (directory/drive letter/file chunks).
    path_components = os.path.normpath(original_path).split(os.path.sep)
    if len(path_components) > 0:
        # We have a path with at least some content, the file will be the last
        # chunk so grab that seperately
        file_comp = path_components[-1]
        longest_valid_path = starting_path
        temp_path = None

        # loop through all the non-file path components rebuilding the path
        # component-by-component and checking at each addition if the formed
        # path is valid.
        # If the path is not valid at any step, we search for the closest
        # match and add that instead.
        for path_comp in path_components[:-1]:
            # Skip empty path components
            if len(path_comp) > 0:
                temp_path = add_to_path(longest_valid_path, path_comp)
                if os.path.isdir(temp_path):
                    # path step is valid, continue the loop
                    longest_valid_path = os.path.normpath(longest_valid_path)
                else:
                    # path step is invalid, look for petential alternatives
                    potential_dirs = get_contents_at_path(longest_valid_path)
                    if potential_dirs == None:
                        raise FileNotFoundError(
                            f"No Potential dirs at path: '{longest_valid_path}'?"
                        )

                    actual_dirs = []
                    for potential_dir in potential_dirs:
                        temp_potential_dir = add_to_path(
                            longest_valid_path, potential_dir
                        )

                        if os.path.isdir(temp_potential_dir):
                            actual_dirs.append(temp_potential_dir)

                    # we've found some actual directories that could be our
                    # path step pick the one closest to what we were given
                    # using difflib
                    if len(actual_dirs) > 0:
                        closest_matches = difflib.get_close_matches(
                            path_comp, actual_dirs
                        )
                        if len(closest_matches) > 0:
                            longest_valid_path = add_to_path(
                                longest_valid_path, closest_matches[0]
                            )
                    else:
                        # no directories to match against,
                        # default to simple error
                        raise FileNotFoundError(
                            f"No actual dirs at path '{longest_valid_path}'?"
                        )

        # We have the longest valid directory path that is the most similar
        # to the given path. Now we look for the most similar file on that
        # path.
        potential_files = get_contents_at_path(longest_valid_path)
        actual_files = []
        for potential_file in potential_files:
            if os.path.isfile(potential_file):
                actual_files.append(potential_file)

        if len(actual_files) > 0:
            closest_matches = difflib.get_close_matches(file_comp, actual_files)
            if len(closest_matches) > 0:
                longest_valid_path = add_to_path(longest_valid_path, closest_matches[0])

        else:
            # no files to match against,
            # default to simple error
            raise FileNotFoundError(f"No Actual files at path '{longest_valid_path}'?")

        longest_valid_path = os.path.normpath(longest_valid_path)
        return longest_valid_path

    else:
        raise FileNotFoundError(
            f"unable to split path: '{os.path.normpath(original_path)}'?"
        )

@MyreMylar MyreMylar requested a review from a team as a code owner May 29, 2024 19:02
@MyreMylar
Copy link
Member Author

Since there were a lot of PyObject_CallMethod (and similar) calls, decided to rewrite this code in python to see if the logic is alright, and even that python code is fairly long.

Thanks for doing this!

Do you think we could (/should) just do this in python instead of c?

Maybe! When I originally started this PR I was going to approach it with a C library for Ratcliff-Obershelp similarity, but then realised how annoying manipulating paths is in C and discovered that I could use the python os module from C code from other code in rwobject.c and then figured I could also use the difflib module in the same way.

This isn't particularly performance sensitive code either way as it only activates when you are in an 'oops there was a fuckup' situation anyway, it is more intended to help out newbies to python & pygame deal with file path mistakes.

Also, is there a reason the suggest_valid_path function raises errors? We seemingly just ignore that everywhere it is used and set something generic.

I think I added these while I was debugging the function; they are mostly relevant to someone using the function not to the end user - so probably mainly for that, but it has been a long time since I wrote this.

@MyreMylar
Copy link
Member Author

If anybody wants to re-implement this in python based on @zoldalma999's code above - and thinks they make it work with rwobject.c then feel free to do so.

I will leave this open as it is for now as I don't immediately know how to go about that.

@MyreMylar
Copy link
Member Author

This was a fun idea, but I don't think this is going to get merged as I'm not going to rewrite it in python, so I'll close it.

@MyreMylar MyreMylar closed this Sep 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rwops SDL's IO loading/streaming code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use Ratcliff-Obershelp string matcher to check file names and paths
3 participants