feat: download browsers as TAR #34033
packages/playwright-core/src/server/registry/oopDownloadBrowserMain.ts
@@ -10,6 +10,7 @@
},
"dependencies": {
"extract-zip": "2.0.1",
"tar-fs": "^3.0.6",
I understand the library is popular, but its dependency list seems a little excessive for what it does. Did we consider alternatives?
Oh wow, "tar" is even more...
Yes, we considered `tar-fs`, `tar`, and writing our own. Writing our own turned out more complex than imagined, because WebKit has very long path names and the format becomes tricky when that's involved. Of the three, `tar-fs` seemed the most focused.
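For context on why long paths complicate a hand-written parser: in the classic ustar header, the `name` field is 100 bytes and the `prefix` field is 155 bytes, so longer paths need an extension such as PAX extended headers or GNU long-name entries. The check below is a minimal sketch of that limit. `fitsInUstarHeader` is a hypothetical helper for illustration only; it is not part of `tar-fs`, `tar`, or Playwright.

```javascript
// Hypothetical helper: does a path fit the plain ustar header fields?
// ustar stores the path as prefix (<= 155 bytes) + '/' + name (<= 100 bytes).
const NAME_MAX = 100;
const PREFIX_MAX = 155;

function fitsInUstarHeader(path) {
  // Short paths go entirely into the name field.
  if (Buffer.byteLength(path) <= NAME_MAX)
    return true;
  // Otherwise try every '/' as the split point between prefix and name.
  for (let i = path.indexOf('/'); i !== -1; i = path.indexOf('/', i + 1)) {
    const prefix = path.slice(0, i);
    const name = path.slice(i + 1);
    if (prefix.length > 0 && name.length > 0 &&
        Buffer.byteLength(prefix) <= PREFIX_MAX &&
        Buffer.byteLength(name) <= NAME_MAX)
      return true;
  }
  // No valid split: the archive needs a long-name extension for this entry.
  return false;
}
```

Any path for which this returns `false` forces the reader to understand at least one of the long-name extensions, which is where most of the parsing complexity comes from.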
@@ -48,8 +48,8 @@
"revision": "1011",
"installByDefault": true,
"revisionOverrides": {
"mac12": "1010",
"mac12-arm64": "1010"
"mac12": "1011",
What's the motivation for changing this?
`1010` doesn't have `.tar.br`, and `1010` is identical to `1011` in functionality.
@@ -1229,6 +2813,6 @@ END OF [email protected] AND INFORMATION

SUMMARY BEGIN HERE
=========================================
Total Packages: 48
Total Packages: 60
We are paying a 25% bump in # of deps (48 to 60) for a feature that does not link to a user report. Usually not a very good sign.
True. It also increases the zip bundle size from 112kb to 202kb. We had this attempt of writing our own tar parser, maybe we should give it another try.
Just as an idea, can we utilize extract-zip's non-compression mode for tar? That way we use zip for tar and don't need all this extra code, i.e. the files will be `.zip.br`, not `.tar.br`.
That'd save some dependencies, but would result in slightly larger bundles and it'd prevent streaming extraction. I'd prefer to stick with TAR, gonna take a stab at reducing the bundle size for that.
Alright, I've vendored a tar parser. That way we're using a tried & tested implementation, but don't pay the download price. Let me know if you like that approach, and where's the best place for a vendored module to live / any license specifics we need to follow.
if (!downloadPathTemplate)
return [];
// old webkit versions don't have brotli
curious why only old webkit revisions don't have brotli
WebKit is the only browser where we have `revisionOverrides` that point to old versions, and the CI script that created those builds didn't yet create brotli archives.
}
log(`SUCCESS downloading and extracting ${options.title}`);
} else {
await downloadFile(options);
I'm seeing different error handling code in this branch, including explicit checks for ECONNRESET. Is walking away from them intended? Should we do both changes at a time? I'd be more comfortable with leaving the download code as is and swapping piping into a file with piping into brotli.
The new branch is intended to be as similar as possible, while also making the code a little more linear. The `ECONNRESET` check only changed the error message, so I didn't include that.
Let me see if I can refactor it to make the change less spooky.
I've refactored it so we can reuse the existing download function. Good pointer, thanks!
@@ -0,0 +1 @@
This directory contains a modified copy of the `tar-stream` library that's used exclusively to extract TAR files.
Make sure all the third party files are under the third_party folder and corresponding license files are provided beside the files. Make sure they end up in third party list or in a distributed bundle
Done! See `diff.patch` for all my changes.
}

shiftFirst (size) {
return this._buffered === 0 ? null : this._next(size)
Is this a bug on the 21st line of this library? (I don't see `this._buffered` defined.)
looks like it! It's also in the original: https://github.com/mafintosh/tar-stream/blob/126968fd3c4a39eba5f8318c255e04cedbbad176/extract.js#L23C17-L23C26
@@ -0,0 +1,311 @@
const { Writable, Readable, getStreamError } = require('stream')
`getStreamError` is not a thing. How is it supposed to work?
Good find! Removed the usage of it by moving `_predestroy` into `_destroy`. Once I add the diff this will make more sense.
const len = parseInt(buf.toString('ascii', 0, i), 10)
if (!len) return result

const b = buf.subarray('ascii', i + 1, len - 1)
ChatGPT thinks it is a bug since the value called `len` is used in the subarray(start, end) signature. Given that the start is `i + 1`, which points to right after the parsed len, `len - 1` can't be a valid end. Did they want to say `i + len` here?
this is what the original library does 🤷
https://github.com/mafintosh/tar-stream/blob/126968fd3c4a39eba5f8318c255e04cedbbad176/headers.js#L40
This seems to be some sort of special case checking. Maybe in some implementations of TAR/PAX, `len` doesn't contain a length, but an index?
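On the `len - 1` question: as far as I know, a PAX extended header record has the form `"<len> <key>=<value>\n"`, where `<len>` is the decimal length of the whole record, including the length digits, the space, and the trailing newline. Under that reading, `len` is also the end offset of the record relative to its start, so `len - 1` is the index of the newline and a valid end for the `key=value` slice. Below is a minimal sketch of that logic, written independently of the vendored code; `decodePaxRecords` is a hypothetical name.

```javascript
// Sketch of PAX extended-header parsing, assuming records of the form
// "<len> <key>=<value>\n" where <len> counts the entire record.
function decodePaxRecords(buf) {
  const result = {};
  while (buf.length) {
    // Find the space that terminates the decimal length prefix.
    let i = 0;
    while (i < buf.length && buf[i] !== 32) i++;
    const len = parseInt(buf.toString('ascii', 0, i), 10);
    if (!len) return result;
    // The record spans [0, len); its last byte (index len - 1) is '\n',
    // so "key=value" occupies [i + 1, len - 1).
    const record = buf.toString('utf8', i + 1, len - 1);
    const eq = record.indexOf('=');
    if (eq === -1) return result;
    result[record.slice(0, eq)] = record.slice(eq + 1);
    // Advance to the next record.
    buf = buf.subarray(len);
  }
  return result;
}

// Each length prefix includes itself: "12 path=a/b\n" is 2 + 1 + 8 + 1 = 12 bytes.
const sample = Buffer.from('12 path=a/b\n11 size=42\n');
console.log(decodePaxRecords(sample)); // { path: 'a/b', size: '42' }
```

With `i + len` as the end, the slice for the first record would run past its newline into the next record, so the original `len - 1` looks intentional rather than a typo.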
Test results for "tests 1": 6 failed, 7 flaky, 37574 passed, 648 skipped. Merge workflow run.
Some of our browsers are already available as `.tar.br`. Compared to the current `.zip` archives, the brotli tarballs are ~10-30% smaller. This PR makes us download brotli files for chromium and webkit.
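For illustration, the streaming property that makes `.tar.br` attractive can be sketched with Node's built-in `zlib` and `stream` modules: compressed bytes are decompressed as they arrive and handed straight to a consumer, without buffering the whole archive first. This is a minimal sketch, not the PR's implementation. `decompressBrotliStream`, `collectingSink`, and `demo` are hypothetical names; the in-memory source stands in for an HTTP response and the collecting sink for a tar extractor.

```javascript
const zlib = require('zlib');
const { Readable, Writable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical helper: pipe a brotli-compressed source through a
// decompressor into a sink, propagating errors from any stage.
async function decompressBrotliStream(source, sink) {
  await pipeline(source, zlib.createBrotliDecompress(), sink);
}

// Stand-in for a tar extractor: collects the decompressed chunks.
function collectingSink(chunks) {
  return new Writable({
    write(chunk, _encoding, callback) {
      chunks.push(chunk);
      callback();
    },
  });
}

// Usage sketch: round-trip a small payload through brotli.
async function demo() {
  const original = Buffer.from('fake tar bytes '.repeat(100));
  const compressed = zlib.brotliCompressSync(original);
  const chunks = [];
  await decompressBrotliStream(Readable.from([compressed]), collectingSink(chunks));
  return Buffer.concat(chunks).equals(original);
}
```

Using `pipeline` rather than chained `.pipe()` calls means a failure in the download or the decompressor tears down every stage and rejects the returned promise.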