
Standards and Web Performance

Dan: What specs are you working on that have performance costs or considerations?

Rego: CSS Containment, the content-visibility property. Martin and Oriol worked on the Gecko implementation.

Oriol: Content-visibility allows you to defer laying out elements until they are needed. It can make a noticeable difference.

Dan: Does that element have to have a fixed size?

Oriol: It doesn't have to have a fixed size. With content-visibility set to auto, once an element is laid out its size is remembered, which fixes the problem with size containment. Initially you might see the browser moving things around erratically, but eventually it settles.
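
(For reference, a minimal CSS sketch of the pattern Oriol describes; the class name and placeholder size are illustrative, not from the discussion.)

```css
/* Defer rendering work for off-screen sections; once a section has been
   laid out, its last remembered size is used as the placeholder size. */
.long-page-section {
  content-visibility: auto;
  /* Illustrative placeholder size used before a remembered size exists,
     to keep the scrollbar from jumping around. */
  contain-intrinsic-size: auto 500px;
}
```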

Dan: A common question for new specs related to performance: can you do this another way?

Oriol: You can use JavaScript and implement your own logic.

Dan: Why couldn't auto become the default behavior?

Oriol: This affects the scroller. It wouldn't be web-compatible to do it by default.

Dan: What metrics do you use, how do you decide to prioritize it?

Oriol: Brian Kardell did tests with the single-page HTML specification, it loaded much faster.

Dan: How was the group consensus? How did the CSS Working Group assess the situation and decide it made sense?

Oriol: I didn't follow that, it was fleshed out when I started working on it. But IIRC it came from Google.

Dan: Any best practices coming out of this project, to recommend to everybody?

Oriol: Specifically about content-visibility: it's something you can try; if it improves things, consider using it, otherwise it might not be helping. More generally in CSS, the language tries to avoid patterns that would be very problematic from a performance point of view. JavaScript creates many more performance problems than CSS, so there is maybe not much to do in terms of optimizing CSS.

Anne: In the design of Fetch, we decided that you can only consume a response body as a stream once, to save memory; otherwise you have to clone the response. This has been contentious with JS developers, but if devs need it they can clone the Response, and that saves memory. Fetch was designed as a low-level primitive that could handle GB or TB of data, which wouldn't be possible otherwise.

Dan: This was a design-level decision, not something informed by benchmarking. How was the decision made?

Anne: There was discussion, could you call .json() and then call .text()? My opinion was, that should not be possible. There was some discussion, but there was definitely more when devs discovered the limitation. Some Chromium people also didn't appreciate the benefits. In the design phase there wasn't a lot of controversy, but the decision required explaining why.
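
(For reference, a minimal sketch of the clone pattern Anne describes; the URL is illustrative.)

```js
// A Response body is a stream that can be consumed only once.
const response = await fetch("/data.json"); // illustrative URL

// clone() lets you read the body twice, at the cost of buffering a copy.
const copy = response.clone();
const asJson = await response.json();
const asText = await copy.text();

// Without the clone, a second read such as response.text() would reject,
// because the body was already consumed by response.json().
```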

Dan: Sometimes when working on things that would have performance benefits, there's discussion on how to assess them. JavaScript Temporal makes it so you don't need to include a library, but then it also ships more by default in the browser.

Anne: For Intl, if you want a privacy-preserving implementation, you need to ship all of it.

Dan: The existence of Intl slows down browser updates by adding a bunch of size; Intl adds a lot of internationalization data. The privacy issue is that we want to decrease the amount of fingerprinting information that is available, for example whether you have Intl support for a given language.

Anne: Because it is a global cache: if you know a particular language is used by a particular website, and the language data is downloaded at time of use, then you can tell the user has gone to that website because they have that language data downloaded.

Dan: What is the risk?

Anne: They can detect whether it is cached; it's a side channel. The proposed design for Intl is to not ship all of ICU, only what's needed, but that introduces a side channel.

Dan: So maybe performance can be in tension with privacy. How should we weigh those trade-offs?

Dietrich: With a user base of a few billion people, it's hard to have a clear winning answer to such a question. How much in conflict is that download-size example? At Mozilla we found that as the download size of the binary grew, we saw a decrease in first runs. But that is in conflict with privacy. You have those trade-offs, and the privacy implications of those decisions aren't regulated.

Anne: There are regulations, but they affect websites, not browser vendors. In standards, we try to have technical solutions that also cover websites that don't comply.

Dietrich: But the size of the download affects whether users will get the browser!

Ivan: Can you shed some light on bundles? Isn't that discontinued? There were privacy concerns with not being able to distinguish resources by URL, making it so browsers can't choose not to load part of a bundle.

Dan: Not discontinued. I had a proposal for that, but it ended up not happening (?)

Dan: There's subresource loading with bundles (https://wicg.github.io/webpackage/subresource-loading.html), where a bundle at one URL "spoofs" some URLs elsewhere. Ad blockers block specific URLs, and bundles mess with that. Then there's a UUID that becomes a "fake" origin. ...

Anne: There is a new proposal with opaque origins.

Dan: There is something with iframes with opaque origins.

Anne: Fenced frames?

Dan: No. It lets you have a bundle with multiple responses for different URLs. A bundle is an atomic unit for fetching. Igalia developed, with Brave and Chrome, a variant where instead of loading the whole thing atomically, you can load only the resources that are not in the cache, or that were not blocked by the ad blocker. It is saved to the same persistent HTTP cache. The hope is that this would allow compressing the set of resources together. That is "bundle-preloading" (https://github.com/WICG/bundle-preloading/blob/main/overview.md). Where you have 200 or a thousand resources and you need to reduce that to 50, this could allow fine-grained caching. Overall, there is a lack of cross-browser consensus; no one is super interested in implementing it. Part of it is that we don't have proper benchmarks for this. How should this effort be prioritized?

Anne: There is a new compression thing, a standalone thing... Compression Dictionary Transport (https://github.com/WICG/compression-dictionary-transport).

Dan: Did that effort restart?

Anne: Yeah, it's ongoing.

Dan: Cool.

Ivan: What is the status of this?

Anne: Proposal phase still.

Dan: WICG means multiple parties in the community expressed interest, but no cross-browser interest yet.

Dietrich: What's the state of public performance measurement on the web? Which tool do you use to convince other stakeholders that something is better? For functionality there's WPT, but for performance?

Dan: JetStream can be used for JavaScript to show improvements. Oriol mentioned rendering the HTML spec. For the Linux kernel, people use compiling the kernel as a benchmark. I think many decisions are not made based on benchmarks.

Anne: Some is based on theory, some is based on trying to implement and seeing if it's awful. Sometimes numbers are produced or asked for. For the text encoding APIs, pushback to a proposal was that a buffer plus an offset was not right for JS; a view should be used instead. There were performance concerns, but no one gave numbers, and we ended up going with the view.

Dan: This happens often, and that is the default, absent numbers.

Anne: We ended up with the better API. In the absence of numbers, that's the better choice: an API consistent with the platform, unless the platform API is objectively bad.

Dan: Nicolò, do you want to talk about deferred ES module evaluation? (https://github.com/tc39/proposal-defer-import-eval)

Nicolò: In other module systems for JS, such as CommonJS, you can move the require call into functions, but ES modules don't support this because loading is async on the web. We can't do exactly that, but we can defer evaluation of modules: we load everything ahead of time but evaluate lazily, and synchronously, at first use. Not as good as CommonJS, but Mozilla's benchmarks show that even just deferring evaluation has performance advantages. The proposal adds syntax to the existing import statement; you get a namespace object with * as, and when you read from it, that triggers evaluation of the module.
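
(A sketch of the proposed syntax as described above; the proposal is still in progress, so details may change, and the module name is illustrative.)

```js
// The module graph behind "./heavy.js" is fetched up front, but its
// top-level statements do not run at load time.
import defer * as heavy from "./heavy.js";

export function onFirstUse(input) {
  // Reading a property of the deferred namespace triggers synchronous
  // evaluation of the module (and its not-yet-evaluated dependencies).
  return heavy.process(input);
}
```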

Dan: We worked out the cases so it works seamlessly with module importing. This seems to be a case where there's a gap between the performance intuitions of web devs and browser engineers; somehow they have different implicit definitions of performance. How should we evaluate and prioritize these things?

Anne: What ends up being deferred?

Dan: In the basic case, the whole module graph is fetched, but the statements in the modules are not evaluated.

Anne: You parse as well when fetching, you just don't evaluate?

Dan: It turns out that, in various large programs, half of the time is parsing and half is running. The semantics are defined so it's fine to parse lazily if you have already checked... It's consistent with how Node and bundlers work...

Anne: Do you need to do a full parse first?

Nicolò: You need a full parse to find further imports

Dan: And top-level await will evaluate eagerly. But no need to do a full parse into an AST

Nicolò: If a subtree of the module graph has top-level await, it will be evaluated eagerly.

Dan: Maybe those aspects are somewhat counterintuitive, but they were independently reinvented a few times.

(I missed Anne's questions about performance and the responses.)

Dan: Bloomberg has some internal benchmarks. Overall we want to be aligned with the web platform semantics. Not great to have internal semantics.

Anne: What about syntax errors at the end of the file? I guess you need a preparse to get the early errors. Could the final parse happen at the same time as the execution?

Dan: That's what happens. JS is never executed in streaming mode. Also, the way JS engines analyze scopes, it would inhibit some scope optimizations.

Anne: This seems reasonable, especially since you spread out the cost; it would result in faster first loads.

Dan: That's the hope.

Dan: There's AsyncContext (https://github.com/tc39/proposal-async-context/), which is all about helping you pass values along a flow of control through async functions. We'll have AsyncContext.Variable, where you can make a new variable and set it to a value while executing a callback. If you have async/await or setTimeout, the continuation will restore the same value. And AsyncContext.Snapshot will store the current values of all variables, which you can then restore at a later time.

Dan: For assessing how long an operation takes, you want to store the time it starts and retrieve it when it ends. That comes up in OpenTelemetry. It can also be used for task scheduling (https://github.com/wicg/scheduling-apis/), where you have different priorities, and the priority can be inherited through the context. You normally set the priority on the thread, but here you have async context, and you want to set the priority for an async function and everything it calls until it finishes, the same continuation.
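
(A sketch of the proposed API as described above; AsyncContext is still a proposal and does not ship in browsers yet, and the async work function is illustrative.)

```js
const startTime = new AsyncContext.Variable();

async function timedOperation() {
  // run() sets the variable for the callback and for every async
  // continuation it spawns (awaits, timers, etc.).
  await startTime.run(performance.now(), async () => {
    await doSomeWork(); // illustrative async work
    console.log("operation took", performance.now() - startTime.get(), "ms");
  });
}
```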

Dan: Chengzhong at Alibaba was one of the champions of this proposal, and he brought it to the Web Performance Working Group; people received it well. Thoughts?

Sergey: Can we implement something like this by ourselves, like Node.js's AsyncLocalStorage, with async hooks?

Dan: Async hooks are a Node.js thing, not a standard. Also, the Node.js community wanted AsyncLocalStorage to fit into the async hooks framework...
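
(For comparison, a minimal sketch of Node.js's AsyncLocalStorage, which covers a similar use case; the store shape and work function are illustrative.)

```js
import { AsyncLocalStorage } from "node:async_hooks";

const requestContext = new AsyncLocalStorage();

async function handleRequest(id) {
  // Everything started inside run(), including awaited continuations,
  // sees the same store via getStore().
  await requestContext.run({ id }, async () => {
    await doSomeWork(); // illustrative async work
    console.log("finished request", requestContext.getStore().id);
  });
}
```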

Sergey: Our implementation in a runtime on top of a web view is not like Node.js's; it implements it in a different way.

Dan: How does that work with promises?

Sergey: We have some custom APIs in the runtime, not only the ones the webview provides, but others. We implement something like async hooks or AsyncContext. Would be great to have in the language, we use it for metrics and prioritization.

Dan: Bloomberg uses this internally to keep track of applications running asynchronously. Is that based on WebKit?

Sergey: Based on WebKit or (?), which is based on Edge/Chromium...

Dan: I'd be interested to see how you do this in iOS.

Sergey: With a webview. I can show you later.

Andreu: It'd be great to see how AsyncLocalStorage / AsyncContext is being / would be used. Would help for implementing.

Maxim(?): About socializing: what effort should go into deprecating an old feature that has been superseded by an improved one? When platforms end up with more and more features, it's hard for someone new to know what to use for a specific use case. Thoughts on that?

Dan: I'm not sure if there's much performance impact from keeping old things around. Web compatibility doesn't leave much margin for removing things.

Maxim(?): It seems like improvements are advertised, but the old feature is not discouraged enough.

Dan: It's always hard to know how much that is possible. I'm not sure if older things are the reason something is slow. And if no one's actively maintaining an application, it's hard to get them to update.

Anne: If we have a list of good features, we might want to have a list of bad features for performance.

Dan: One feature there was an effort to move people off of was synchronous XHR. It's bad for performance because it makes the tab freeze.

Anne: It's a bad API, and not everyone understands why it's bad. Every once in a while the XHR repo gets an issue from a dev who doesn't understand why it's deprecated. It's just a console warning, because it's very hard to remove. And it's not always implemented according to the spec; there are nested event loops...
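
(For reference, a sketch of why synchronous XHR is harmful; the URL is illustrative.)

```js
// Synchronous XHR: the third argument `false` blocks the main thread until
// the response arrives, freezing rendering and input for the whole tab.
const xhr = new XMLHttpRequest();
xhr.open("GET", "/data.json", false);
xhr.send();
console.log(xhr.responseText);

// The asynchronous equivalent keeps the tab responsive.
const data = await (await fetch("/data.json")).json();
```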

Anne: Also, there's something about people calling a getter that forces a synchronous layout. There's no list of bad things. It's hard to know when to force a synchronous layout and when to return a possibly stale result.

Emilio: You can't give a bad result.

Anne: When you call that API, the whole layout has to happen, and then the result goes to JS.

Emilio: It's the most common performance issue in websites. It's an issue where you would change the styles for every link in the page, and then read the width. That triggers an entire synchronous layout.

Dan: So adding super cool CSS features would improve performance?

Anne: Maybe we should deprecate and remove these APIs? ... It's not hard to do.

Dan: The MDN article for getComputedStyle doesn't mention this. It should.

Emilio: Any API that returns up to date layout values can be slow.

Anne: getBoundingClientRect(). Maybe some of the canvas APIs?

Emilio: Even canvas setters, like setting the font on a canvas.

Anne: If you invoke it in an animation frame, it should be safe.

Emilio: If you invoke it such that you do all your layout reads without interleaving writes, it's fine. This is one of the common perf issues you see in website performance profiles. If you did all layout writes at once and then all reads at once, it would be 4 times as fast.
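
(A sketch of the interleaved vs. batched pattern Emilio describes; the style change is illustrative.)

```js
const links = Array.from(document.querySelectorAll("a"));

// Interleaved: each read after a write forces a synchronous layout.
const slowWidths = links.map(link => {
  link.style.padding = "4px";                // write (invalidates layout)
  return link.getBoundingClientRect().width; // read (forces layout)
});

// Batched: all writes first, then all reads, so layout runs only once.
links.forEach(link => { link.style.padding = "8px"; });
const fastWidths = links.map(link => link.getBoundingClientRect().width);
```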
