-
Notifications
You must be signed in to change notification settings - Fork 105
fix(bundler): support git gems #1629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 3 commits
be563f4
4e07504
19e10d6
89de0a7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -271,17 +271,17 @@ nokogiri-1.16.6.gem | |
| tzinfo-2.0.6.gem | ||
| ``` | ||
|
|
||
| Notice that all the `.gem` dependencies are kept in their original format, and Git dependencies are just plain clones | ||
| Notice that all the `.gem` dependencies are kept in their original format, and Git dependencies are bare git clones | ||
|
eskultety marked this conversation as resolved.
|
||
| of the repository placed in a folder. For Git dependencies, the folder name must match this specific | ||
| [format](https://github.com/rubygems/rubygems/blob/3da9b1dda0824d1d770780352bb1d3f287cb2df5/bundler/lib/bundler/source/git.rb#L130): | ||
|
|
||
| ```ruby | ||
| "#{base_name}-#{shortref_for_path(revision)}" | ||
| ``` | ||
|
|
||
| The name of the directory **must come from the Git URL**, not the actual name of the gem, and the cloned folder must | ||
|
eskultety marked this conversation as resolved.
|
||
| contain unpacked source code. Any other format will cause bundler to try to re-download the repository, causing the | ||
| build to fail. | ||
| The name of the directory **must come from the Git URL**, not the actual name of the gem. The clones are bare | ||
| repositories used as local remotes via git's `url.insteadOf` mechanism (see | ||
| [Offline installs involving git dependencies](#offline-installs-involving-git-dependencies)). | ||
|
eskultety marked this conversation as resolved.
eskultety marked this conversation as resolved.
|
||
|
|
||
| ##### Multiple Gems in a single repository | ||
|
|
||
|
|
@@ -389,8 +389,6 @@ BUNDLE_DEPLOYMENT=true | |
| BUNDLE_NO_PRUNE=true | ||
| BUNDLE_ALLOW_OFFLINE_INSTALL=true | ||
| BUNDLE_DISABLE_VERSION_CHECK=true | ||
| BUNDLE_DISABLE_LOCAL_BRANCH_CHECK=true | ||
|
eskultety marked this conversation as resolved.
|
||
| BUNDLE_DISABLE_LOCAL_REVISION_CHECK=true | ||
| ``` | ||
|
|
||
| - **BUNDLE_CACHE_PATH**: The directory that Bundler will place cached gems in when running bundle package, and that | ||
|
|
@@ -415,16 +413,6 @@ package cache. | |
| - **BUNDLE_DISABLE_VERSION_CHECK**: Stop Bundler from checking if a newer Bundler version is | ||
| available on rubygems.org. | ||
|
|
||
| - **BUNDLE_DISABLE_LOCAL_REVISION_CHECK**: Allow Bundler to use a local git override without a | ||
| branch specified in the Gemfile | ||
|
|
||
| - **BUNDLE_DISABLE_LOCAL_BRANCH_CHECK**: Allow Bundler to use a local git override without checking | ||
| if the revision present in the lockfile is present in the repository. | ||
|
|
||
| - **BUNDLE_LOCAL__<GEM_NAME>**: Instead of checking out the remote git repository for GEM_NAME, | ||
| the local git directory override will be used. See below for more information on Bundler's git | ||
| dependency handling. | ||
|
|
||
| For more information, see Bundler's [documentation](https://bundler.io/v2.5/man/bundle-config.1.html). | ||
|
|
||
| #### Offline installation pitfalls | ||
|
|
@@ -465,30 +453,29 @@ flag](https://github.com/rubygems/rubygems/issues/8265) via configuration option | |
| making use of the deployment mode.** | ||
|
|
||
| ##### Offline installs involving git dependencies | ||
| Bundler seems to follow a different approach when it comes to git dependencies since in its default | ||
| configuration it always tries to fetch the source from the remote to ensure the application is built | ||
| against the correct branch/revision. This argument is indirectly supported by the | ||
| [docs](https://bundler.io/guides/deploying.html#deploying-your-application), more specifically: | ||
|
|
||
| > If you have run bundle pack, checked in the vendor/cache directory, and do not have any git gems, | ||
| Bundler will not contact the internet while installing your bundle. | ||
|
|
||
| This is a problem for hermetic builds and as such setting `BUNDLE_DEPLOYMENT` alone doesn't help | ||
| and we need more settings. In order to overcome this behavioral trait, we need to follow the | ||
| recommendation in the [config](https://bundler.io/v2.5/man/bundle-config.1.html#LOCAL-GIT-REPOS) | ||
| docs and override each git dependency with the location on the disk we fetched the git dependency | ||
| to and tell bundler about it with the `BUNDLE_LOCAL__<GEM_NAME>` configuration key. | ||
| However, this still isn't enough for Bundler to honour offline installs with git dependencies, | ||
| because then it's trying to enforce further checks as outlined in the | ||
| [docs](https://bundler.io/v2.5/man/bundle-config.1.html#LOCAL-GIT-REPOS): | ||
|
|
||
| >Bundler does many checks to ensure a developer won't work with invalid references. Particularly, | ||
| >we force a developer to specify a branch in the Gemfile in order to use this feature. If the | ||
| >branch specified in the Gemfile and the current branch in the local git repository do not | ||
| >match, Bundler will abort. | ||
|
|
||
| Therefore, we additionally need to enforce both `BUNDLE_DISABLE_LOCAL_BRANCH_CHECK` and | ||
| `BUNDLE_DISABLE_LOCAL_REVISION_CHECK`. | ||
| Bundler always tries to fetch git dependencies from the remote, so `BUNDLE_DEPLOYMENT` alone | ||
| doesn't prevent network access. | ||
|
Comment on lines
-456
to
+457
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With this hunk here I'd like to be more thorough though - it should go to the earlier commit switching to bare git clones. It's fine to keep the section in that commit brief, we'll add the rest (below) in this commit. |
||
|
|
||
| To solve this without coupling to bundler's internal cache format | ||
| (which has [changed across versions](https://github.com/ruby/rubygems/commit/7d6b6316)), we use git's | ||
| [`url.insteadOf`](https://git-scm.com/docs/git-config#Documentation/git-config.txt-urlltbasegtinsteadOf) | ||
| via `GIT_CONFIG_COUNT`/`KEY`/`VALUE` environment variables to redirect each remote URL to the | ||
|
Comment on lines
+459
to
+462
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It'd be best to have a brief example on this so it's easier for imagination. |
||
| pre-fetched local bare clone. This operates at the git transport layer — bundler runs its standard | ||
| flow but git silently fetches from disk instead of the network. | ||
|
|
||
| For example, given a git dependency on `https://github.com/3scale/json-schema`, hermeto sets: | ||
|
|
||
| ``` | ||
| GIT_CONFIG_COUNT=2 | ||
| GIT_CONFIG_KEY_0=url.file:///output/deps/bundler/json-schema-26487618a684/.insteadOf | ||
| GIT_CONFIG_VALUE_0=https://github.com/3scale/json-schema | ||
| GIT_CONFIG_KEY_1=protocol.file.allow | ||
| GIT_CONFIG_VALUE_1=always | ||
| ``` | ||
|
|
||
| The `protocol.file.allow=always` entry ensures git permits the `file://` protocol after the URL | ||
| rewrite. See the [git documentation](https://git-scm.com/docs/git-config#Documentation/git-config.txt-protocolallow) | ||
| for details. | ||
|
Comment on lines
+459
to
+478
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this hunk discussing the actual solution we've implemented should remain here as is. :) |
||
|
|
||
| ##### Offline installation using deployment mode | ||
| Deployment mode is a way of vendoring one's code along with the dependencies. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -35,9 +35,6 @@ | |
| def fetch_bundler_source(request: Request) -> RequestOutput: | ||
| """Resolve and process all bundler packages.""" | ||
| components: list[Component] = [] | ||
| environment_variables: list[EnvironmentVariable] = ( | ||
| _prepare_environment_variables_for_hermetic_build() | ||
| ) | ||
| project_files: list[ProjectFile] = [] | ||
| git_paths = [] | ||
|
|
||
|
|
@@ -50,9 +47,11 @@ def fetch_bundler_source(request: Request) -> RequestOutput: | |
| ) | ||
| components.extend(_comps) | ||
| git_paths.extend(_git_paths) | ||
| project_files.append( | ||
| _prepare_for_hermetic_build(request.source_dir, request.output_dir, git_paths) | ||
|
|
||
| environment_variables: list[EnvironmentVariable] = ( | ||
| _prepare_environment_variables_for_hermetic_build(git_paths) | ||
| ) | ||
| project_files.append(_prepare_for_hermetic_build(request.source_dir, request.output_dir)) | ||
|
|
||
| annotations = [] | ||
| if backend_annotation := create_backend_annotation(components, "bundler"): | ||
|
|
@@ -65,17 +64,17 @@ def fetch_bundler_source(request: Request) -> RequestOutput: | |
| ) | ||
|
|
||
|
|
||
| # Aliases for git dependency name and git dependency name as | ||
| # it is written to file system: | ||
| # Aliases for git dependency name, file system name, and remote URL: | ||
| DepName = str | ||
| FSDepName = str | ||
| DepURL = str | ||
|
|
||
|
|
||
| def _resolve_bundler_package( | ||
| package_dir: RootedPath, | ||
| output_dir: RootedPath, | ||
| binary_filters: BundlerBinaryFilters | None = None, | ||
| ) -> tuple[list[Component], list[tuple[DepName, FSDepName]]]: | ||
| ) -> tuple[list[Component], list[tuple[DepName, FSDepName, DepURL]]]: | ||
| """Process a request for a single bundler package.""" | ||
| deps_dir = output_dir.join_within_root("deps", "bundler") | ||
| deps_dir.path.mkdir(parents=True, exist_ok=True) | ||
|
|
@@ -110,7 +109,7 @@ def _resolve_bundler_package( | |
| files_to_download[dep.remote_location] = dep.download_location(deps_dir) | ||
| case GitDependency(): | ||
| dep.download_to(deps_dir) | ||
| git_paths.append((dep.name, dep.repo_name + "-" + dep.ref[:12])) | ||
| git_paths.append((dep.name, dep.repo_name + "-" + dep.ref[:12], str(dep.url))) | ||
|
|
||
| c = Component(name=dep.name, version=dep.version, purl=dep.purl, properties=properties) | ||
| components.append(c) | ||
|
|
@@ -196,16 +195,51 @@ def _get_repo_name_from_origin_remote(package_dir: RootedPath) -> str: | |
| return str(resolved_path) | ||
|
|
||
|
|
||
| def _prepare_environment_variables_for_hermetic_build() -> list[EnvironmentVariable]: | ||
| return [ | ||
| def _prepare_environment_variables_for_hermetic_build( | ||
| git_paths: list[tuple[DepName, FSDepName, DepURL]] | None = None, | ||
| ) -> list[EnvironmentVariable]: | ||
| env_vars = [ | ||
| # Contains path to a directory where a new config could be found. | ||
| EnvironmentVariable(name="BUNDLE_APP_CONFIG", value="${output_dir}/" + CONFIG_OVERRIDE), | ||
| ] | ||
| if git_paths: | ||
| # Redirect git remote URLs to pre-fetched local clones via git's | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A non-blocking request: could you please move lines 206 to 221 into a separate helper, maybe env_vars = [
# Contains path to a directory where a new config could be found.
EnvironmentVariable(name="BUNDLE_APP_CONFIG", value="${output_dir}/" + CONFIG_OVERRIDE),
]
env_vars.extend(_produce_git_redirection_variables(git_paths))
return env_vars |
||
| # GIT_CONFIG_COUNT/KEY/VALUE mechanism. This injects url.insteadOf | ||
| # entries without replacing the global git config. | ||
| # See: https://git-scm.com/docs/git-config#ENVIRONMENT | ||
| _check_for_duplicate_git_urls(git_paths) | ||
| # (key, value) pairs for GIT_CONFIG_KEY_N / GIT_CONFIG_VALUE_N | ||
| git_config: list[tuple[str, str]] = [] | ||
| for _packname, dirname, url in git_paths: | ||
| clone_file_url = "file://${output_dir}/deps/bundler/" + dirname + "/" | ||
| git_config.append((f"url.{clone_file_url}.insteadOf", url)) | ||
| git_config.append(("protocol.file.allow", "always")) | ||
|
|
||
| env_vars.append(EnvironmentVariable(name="GIT_CONFIG_COUNT", value=str(len(git_config)))) | ||
| for idx, (key, value) in enumerate(git_config): | ||
| env_vars.append(EnvironmentVariable(name=f"GIT_CONFIG_KEY_{idx}", value=key)) | ||
| env_vars.append(EnvironmentVariable(name=f"GIT_CONFIG_VALUE_{idx}", value=value)) | ||
| return env_vars | ||
|
|
||
|
|
||
| def _check_for_duplicate_git_urls( | ||
| git_paths: list[tuple[DepName, FSDepName, DepURL]], | ||
| ) -> None: | ||
| """Raise if multiple git deps share the same URL with different revisions.""" | ||
| url_to_dirs: dict[str, set[str]] = {} | ||
| for _packname, dirname, url in git_paths: | ||
| url_to_dirs.setdefault(url, set()).add(dirname) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Non-blocking nitpicks: from collections improt defaultdict
...
...
...
urls_to_dirs = defaultfict(set)
for _, dirname, url in git_paths:
urls_to_dirs[url].add(dirname)It would also be nice to find all offending url:dirs pairs in one go (there could be more than one and users would get really upset if they were to fish for them one at a time): offenders = [(url, dirs) for url, dirs in urls_to_dirs.items() if len(dirs) > 1]
if offenders:
raise ... |
||
| for url, dirs in url_to_dirs.items(): | ||
| if len(dirs) > 1: | ||
| raise UnsupportedFeature( | ||
| f"Multiple git dependencies point to the same repository ({url}) " | ||
| f"but use different revisions: {', '.join(sorted(dirs))}. " | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Use populate |
||
| "This is not supported because git's url.insteadOf redirect " | ||
| "can only map a repository URL to a single local clone." | ||
| ) | ||
|
|
||
|
|
||
| def _prepare_for_hermetic_build( | ||
| source_dir: RootedPath, output_dir: RootedPath, git_paths: list | None = None | ||
| ) -> ProjectFile: | ||
| def _prepare_for_hermetic_build(source_dir: RootedPath, output_dir: RootedPath) -> ProjectFile: | ||
| """Prepare a package for hermetic build by injecting necessary config.""" | ||
| potential_bundle_config = source_dir.join_within_root(".bundle/config").path | ||
| hermetic_config = dedent( | ||
|
|
@@ -218,30 +252,6 @@ def _prepare_for_hermetic_build( | |
| BUNDLE_VERSION: "system" | ||
| """ | ||
| ) | ||
| # Note: if a package depends on a git revision then the following variables | ||
| # are necessary for a hermetic build: | ||
| # BUNDLE_DISABLE_LOCAL_BRANCH_CHECK | ||
| # BUNDLE_DISABLE_LOCAL_REVISION_CHECK | ||
| # because otherwise some (potentially all, depending on exact set of | ||
| # ecosystem components versions, environment variables and celestial | ||
| # alignment) Bundler versions will try to fetch the latest changes of the | ||
| # remotes which may be present even when instructed not to with --local | ||
| # flag. | ||
| # See https://bundler.io/guides/git.html#local-git-repos for details. | ||
| # (or https://github.com/rubygems/bundler-site/blob/ | ||
| # 9ff3b76e9866524ecefe165633ffb547f0004a99/source/guides/git.html.md | ||
| # if the link above ceases to exist). | ||
| if git_paths is not None: | ||
| hermetic_config += 'BUNDLE_DISABLE_LOCAL_BRANCH_CHECK: "true"\n' | ||
| hermetic_config += 'BUNDLE_DISABLE_LOCAL_REVISION_CHECK: "true"\n' | ||
| for packname, dirname in git_paths: | ||
| # "-" in variable names is deprecated in Bundler and now generates | ||
| # a warning and a suggestion to replace all dashes with triple | ||
| # underscores. Package names sometimes contain dashes: | ||
| varname = "BUNDLE_LOCAL." + packname.upper().replace("-", "___") | ||
| location = "${output_dir}/deps/bundler/" + dirname | ||
| config_entry = varname + f': "{location}"' | ||
| hermetic_config += f"{config_entry}\n" | ||
| if potential_bundle_config.is_file(): | ||
| config_data = potential_bundle_config.read_text() | ||
| config_data += hermetic_config | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit message:
This is all true, undisputed, but I'd like to see more context in the message for bookkeeping reasons, i.e. this commit deletes the BUNDLE_LOCAL_* variables and converts to bare git clones but doesn't discuss why that is okay to do, especially since newer bundler would likely complain. Therefore, I'd like to see context expansion saying something along the lines of the following: