Add security and privacy considerations for OCE#317
Conversation
I'm trying to add @mikewest as reviewer, with no success, since the first concern came from him and @reillyeon.
> We determined that the cost of exposing this additional estimate is negligible. In scenarios
> where the site contributes most of the pressure (e.g., under a critical pressure state),
> the key takeaway is that the rest of the system is stable—so there is limited new insight to extract.
> Conversely, if the site's contribution is minimal, the site gains little beyond what the global
> pressure state already reveals.
Could a site intentionally push the system into the "critical" state and then use "own contribution estimate" and knowledge of how much work it is doing to precisely measure the amount of work that other applications are applying?
Reilly's concern seems reasonable to me. Exposing additional information about a site's contribution to the global state seems like it necessarily reveals information about the other things going on on the system. If I rev up some hard work on my site, it seems like I'd be able to tune it more effectively to extract information about the behavior of a window I pop up, for instance.
It's hard for me to say how much easier that makes data leakage, but it's not at all clear to me that it must be negligible.
My own experimentation shows that on most systems it is quite hard to push the system into a critical state in a predictable way, especially since most modern CPUs are designed to handle high pressure in short bursts but not to sustain that pressure over longer periods.
Even if you can measure the amount of work it takes to reach critical pressure, and how long it takes, even when external pressure contributes, you have no way of knowing whether that pressure is caused by another site, other external apps, system apps, or a combination of these, so it is very hard, if not impossible, to attribute the pressure to any particular site or application. Also, at the end of the day we only have four states, so you can only attribute work at that coarse granularity.
When I was working on the mitigations, I tried exactly this: calibrating workloads to control these states. I ended up giving up, because it only worked if absolutely no other app or background process was running on the system; otherwise it would fail miserably. I also ran into the issue that a CPU might be able to handle a workload for a while (it works hard to clear the work with boosts, etc.) but cannot sustain it over longer periods, which made calibration quite hard. In the end I only got this kind of manual calibration working on one older system running basically nothing but the browser, and not on any of my other machines.
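For context, the four coarse states mentioned above are the ones the Compute Pressure API's `PressureObserver` reports. A minimal sketch of what a page can actually see (the helper and callback logic are illustrative, not from the spec):

```javascript
// The four coarse global pressure states, ordered from least to most loaded.
const PRESSURE_STATES = ["nominal", "fair", "serious", "critical"];

// Illustrative helper: compare two states by severity.
function isMorePressured(a, b) {
  return PRESSURE_STATES.indexOf(a) > PRESSURE_STATES.indexOf(b);
}

// In a browser that supports the Compute Pressure API, a page would
// observe state transitions roughly like this:
//
//   const observer = new PressureObserver((records) => {
//     const latest = records[records.length - 1];
//     if (isMorePressured(latest.state, "fair")) {
//       // Back off: reduce work, lower quality, etc.
//     }
//   });
//   await observer.observe("cpu", { sampleInterval: 2000 });
```

The point of the coarseness argument is visible here: everything an attacker can read collapses into one of four buckets, which is what makes fine-grained attribution difficult.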
mikewest
left a comment
Thanks for looping me in, happy to add some thoughts:
> Do A/B testing in live deployments with different code paths to determine what results in the
> lowest pressure on device types.
> </li>
> </ul>
The second and third of these options seem entirely reasonable to expect developers to do in the absence of the additional information that the OCE mechanism provides. That is, if an application's users are under compute pressure, it seems pretty reasonable for the application to either optimize its code or back off. Likewise, A/B testing in realistic environments requires dealing with those environments as they are. This seems true even in the case where other applications on the device are responsible for the bottleneck. There's only so much CPU, and the application has to work with what it has at the end of the day.
The additional information provided by OCE would allow the first option to be narrowly targeted to the cases in which it might help. Personally and subjectively, I'm not super enthusiastic about websites telling me to close other applications (nor, really, would I be happy about other applications telling me to close websites). Are there other use cases this enables?
Talking to Whereby, I heard it was quite common for doctors using their service to tell their clients to make sure to close all other apps. I think I have even seen a toast suggesting something similar in another service (not sure whether it was Jit.si or Meet), so that already happens today.
Whereby also told us that they store the information for later debugging purposes.
> <p>
> The platforms exposing the [=own contribution estimate=] will fire events more frequently
> as the estimate may change independent of the global pressure state, but as these are not
> separate events they are also subject to the rate obfuscation.
I'm not sure I follow. If we're firing more events with more detail more often, how do the properties of the rate obfuscation remain the same?
You do get more events, but the rate can still be obfuscated in the same manner.
> <p>
> <em>If the majority of the pressure comes from external sources</em>, the site may choose to simplify
> its features to maintain a smooth experience. It may also prompt users to close other
> applications to reduce overall system pressure.
This doesn't seem unique to pressure from external sources. Surely the site would simplify its features if it itself was exceeding the capabilities of the device, since even an admonition to close other applications would subject the user to poor performance up until the point they broke down and started looking at what else they might have open?
Sure, that's right; it can probably be reworded. But prompting the user to close other apps and sites for a smoother experience is not relevant when the pressure is caused by the site itself, even after turning off user-facing features.
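The distinction being made here can be sketched as decision logic. Note the assumptions: `ownEstimate` stands in for the proposed own-contribution estimate (the actual attribute name and shape are still under discussion in this PR), values are treated as a 0..1 share, and the 0.5 threshold is arbitrary, purely for illustration:

```javascript
// Illustrative mitigation choice based on the proposed own-contribution
// estimate. "ownEstimate" and the 0.5 threshold are hypothetical.
function chooseMitigation(globalState, ownEstimate) {
  if (globalState !== "serious" && globalState !== "critical") {
    return "none";
  }
  // If the site itself dominates the pressure, prompting the user to close
  // other apps will not help; the site should simplify its own features.
  // Only when pressure is mostly external does the prompt make sense.
  return ownEstimate >= 0.5
    ? "simplify-own-features"
    : "suggest-closing-other-apps";
}
```

This is the narrow case the reviewer is probing: without the estimate, a site cannot distinguish the two branches of the final return.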
> JavaScript) to WebGPU—in an effort to enhance performance.
> </p>
> <p>
> We determined that the cost of exposing this additional estimate is negligible. In scenarios
Was this determination experimental? If so, it'd be helpful to understand how you analyzed the potential for data leakage.
This is just from my own experiments trying to see whether I could use the data reliably.