@turetske
Collaborator

Closes #128
Closes #47

This fix implements an _ls_from_https call, which ensures that any ls call made over https is redirected to the appropriate collections URL.

This code also reduces the number of director calls needed to get the collections URL by caching that information as part of the namespace-info structure, so it can be reused whenever a path matches an already-known prefix.
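To illustrate the caching idea, here is a minimal sketch of a prefix-keyed cache that only calls the director on a miss. It is not the code from this PR: `NamespaceInfo`, `NamespaceCache`, and the `query_director` callable are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class NamespaceInfo:
    """Hypothetical record: a namespace prefix plus its collections URL."""
    prefix: str
    collections_url: Optional[str]


class NamespaceCache:
    """Caches director answers by namespace prefix (illustrative only)."""

    def __init__(self, query_director: Callable[[str], NamespaceInfo]):
        self._query_director = query_director
        self._by_prefix: Dict[str, NamespaceInfo] = {}
        self.director_calls = 0  # counted so the saving is visible

    def lookup(self, path: str) -> NamespaceInfo:
        # Prefer the longest cached prefix that contains the path.
        best: Optional[NamespaceInfo] = None
        for prefix, info in self._by_prefix.items():
            inside = path == prefix or path.startswith(prefix.rstrip("/") + "/")
            if inside and (best is None or len(prefix) > len(best.prefix)):
                best = info
        if best is not None:
            return best
        # Cache miss: ask the director once and remember the answer.
        self.director_calls += 1
        info = self._query_director(path)
        self._by_prefix[info.prefix] = info
        return info
```

With a cache like this, listing several paths under the same namespace would trigger a single director query; later lookups are served from the stored prefix entry.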

… url

-- Created an _ls_from_https which monkeypatches the HTTP file system's _ls
-- Now ALL list calls go to a collections URL
-- Also added caching of the collections URL in the director response to prevent repeated director calls
-- Added tests which ensure the readme examples work
-- Added tests of 'get' functionality
-- Updated namespace_info tests to clear the namespace info cache between tests
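The monkeypatching approach described in the commit message can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's implementation: `FakeHTTPFileSystem` stands in for the real HTTP file system, and `to_collections_url` and `patch_ls` are hypothetical helpers.

```python
import asyncio
import types


class FakeHTTPFileSystem:
    """Stand-in for an HTTP file system with an async _ls (not a real library class)."""

    async def _ls(self, url, detail=True, **kwargs):
        return [f"listed:{url}"]


def to_collections_url(url: str) -> str:
    # Hypothetical rewrite rule; the real mapping comes from the director.
    return url.replace("https://data.example", "https://collections.example", 1)


def patch_ls(fs, rewrite):
    """Replace fs._ls with a wrapper that rewrites the URL before listing."""
    original_ls = fs._ls  # keep the bound original so the wrapper can delegate

    async def _ls_from_https(self, url, detail=True, **kwargs):
        return await original_ls(rewrite(url), detail=detail, **kwargs)

    # Bind the wrapper to this instance only; the class is left untouched.
    fs._ls = types.MethodType(_ls_from_https, fs)
    return fs


if __name__ == "__main__":
    fs = patch_ls(FakeHTTPFileSystem(), to_collections_url)
    print(asyncio.run(fs._ls("https://data.example/ns/dir")))
```

Patching the instance rather than the class keeps the change local to the file system object that needs redirected listings.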
Comment on lines +644 to +645
if namespace not in self._namespace_cache:
    self._namespace_cache[namespace] = _CacheManager([], director_response)


If a namespace already exists in _namespace_cache but has no director_response yet, you silently skip updating it. That can leave stale or incomplete namespace metadata in place.
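One possible way to handle that case is sketched below. `CacheEntry` is a hypothetical stand-in for `_CacheManager` (whose real attributes are not shown in this thread), and `remember` is an invented helper; the sketch assumes the entry exposes a `director_response` attribute that may be `None`.

```python
from typing import Any, Dict, List, Optional


class CacheEntry:
    """Hypothetical stand-in for _CacheManager; attribute names are assumptions."""

    def __init__(self, cache_list: List[Any], director_response: Optional[Any] = None):
        self.cache_list = cache_list
        self.director_response = director_response


def remember(cache: Dict[str, CacheEntry], namespace: str, director_response: Any) -> CacheEntry:
    """Create the entry if missing, or fill in a missing director response."""
    entry = cache.get(namespace)
    if entry is None:
        cache[namespace] = CacheEntry([], director_response)
    elif entry.director_response is None:
        # Entry exists but is incomplete: update it instead of skipping silently.
        entry.director_response = director_response
    return cache[namespace]
```

Whether an existing, non-empty response should also be overwritten is a separate policy question that this sketch leaves alone.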


My issue with these integration tests is that:

  • They depend on live OSDF infrastructure
  • Failures may be due to network, not code

I am not suggesting we get rid of them, but maybe we could make them optional.
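As a rough sketch of what "optional" could look like, the tests that need live infrastructure could be skipped unless an environment variable opts in. The variable name `RUN_OSDF_INTEGRATION`, the helper `integration_enabled`, and the test class are all hypothetical; the example uses the standard-library `unittest` skip decorator, and an equivalent marker would be needed if the project uses a different test runner.

```python
import os
import unittest


def integration_enabled(env=None) -> bool:
    """Return True when the (hypothetical) opt-in variable is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("RUN_OSDF_INTEGRATION", "").lower() in {"1", "true", "yes"}


# Decorator that skips a test unless live-infrastructure tests were requested.
requires_network = unittest.skipUnless(
    integration_enabled(),
    "set RUN_OSDF_INTEGRATION=1 to run tests against live infrastructure",
)


class ReadmeExamples(unittest.TestCase):
    @requires_network
    def test_listing_example(self):
        # Placeholder: a real test would exercise the README listing example here.
        pass
```

A default run would then report these tests as skipped, so a network outage cannot fail the suite, while CI jobs that do have network access can still opt in.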


Development

Successfully merging this pull request may close these issues.

Listings not always using the collections url
Reduce director calls for get_dirlist_url

2 participants