Implement query to get diff for Web Vitals from using script[type="speculationrules"] #112

Merged: 5 commits, Apr 12, 2024

felixarntz
Collaborator

@felixarntz felixarntz commented Apr 9, 2024

This query compares the Web Vitals passing rate per origin, limited to origins that use the Speculation Rules API (via script[type="speculationrules"]) in the queried month but did not use it in the month before. The idea is that this should give a glimpse of the potential impact.

Note that the dataset is still relatively small. This may improve as more WordPress sites adopt the Speculative Loading plugin.

Worth noting that the query is not limited to WordPress sites, as that in principle gives us a larger dataset, and the goal of this query is not to make a WordPress-specific assessment, but rather a general assessment of the API and its usage.
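
As a rough illustration of the selection logic described above, here is a minimal Python sketch (the actual query is BigQuery SQL and is not reproduced here). The function names, the data shapes, and the example origins are all assumptions made for this illustration.

```python
def newly_adopting_origins(used_old, used_new):
    """Origins that use speculation rules in the new month but not in the old one."""
    return set(used_new) - set(used_old)


def pass_rate_diffs(old_rates, new_rates, cohort):
    """Per-origin change in pass rate (new minus old), for origins with data in both months."""
    return {
        origin: new_rates[origin] - old_rates[origin]
        for origin in cohort
        if origin in old_rates and origin in new_rates
    }


# Hypothetical inputs: origins detected per month and LCP pass rates per origin.
used_feb = {"https://a.example"}
used_mar = {"https://a.example", "https://b.example", "https://c.example"}
lcp_feb = {"https://a.example": 0.80, "https://b.example": 0.50, "https://c.example": 0.70}
lcp_mar = {"https://a.example": 0.82, "https://b.example": 0.56, "https://c.example": 0.65}

cohort = newly_adopting_origins(used_feb, used_mar)
diffs = pass_rate_diffs(lcp_feb, lcp_mar, cohort)
```

Here `https://a.example` is excluded because it already used the feature in the earlier month, so only the two new adopters contribute a difference.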

| device | oldDate | newDate | num_origins | percentile | lcp_diff | inp_diff | cls_diff | fcp_diff | ttfb_diff | fid_diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| desktop | 2024-02-01 | 2024-03-01 | 4312 | 10 | -4.53% | -3.16% | -2.94% | -2.65% | -3.72% | -2.24% |
| desktop | 2024-02-01 | 2024-03-01 | 4312 | 25 | -2.09% | -1.40% | -0.10% | -1.00% | -1.30% | -1.22% |
| desktop | 2024-02-01 | 2024-03-01 | 4312 | 50 | 0.21% | -0.09% | 2.66% | 0.45% | 0.96% | -0.40% |
| desktop | 2024-02-01 | 2024-03-01 | 4312 | 75 | 2.91% | 1.25% | 5.74% | 2.23% | 4.27% | 0.37% |
| desktop | 2024-02-01 | 2024-03-01 | 4312 | 90 | 5.97% | 2.89% | 9.34% | 4.48% | 8.30% | 1.25% |
| desktop | 2024-02-01 | 2024-03-01 | 4312 | 100 | 37.60% | 17.48% | 35.39% | 46.68% | 37.53% | 10.16% |
| phone | 2024-02-01 | 2024-03-01 | 595 | 10 | -4.47% | -5.99% | -3.79% | -2.70% | -4.53% | -4.31% |
| phone | 2024-02-01 | 2024-03-01 | 595 | 25 | -1.54% | -3.79% | -1.75% | -0.42% | -0.94% | -2.83% |
| phone | 2024-02-01 | 2024-03-01 | 595 | 50 | 0.55% | -1.69% | -0.40% | 0.90% | 0.89% | -1.62% |
| phone | 2024-02-01 | 2024-03-01 | 595 | 75 | 3.01% | 1.05% | 1.22% | 2.80% | 4.04% | -0.11% |
| phone | 2024-02-01 | 2024-03-01 | 595 | 90 | 7.64% | 4.37% | 3.27% | 6.47% | 9.51% | 1.24% |
| phone | 2024-02-01 | 2024-03-01 | 595 | 100 | 41.10% | 23.08% | 35.50% | 44.17% | 50.55% | 13.10% |
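
To help read the percentile column: the query takes the distribution of per-origin differences and reports its value at selected percentiles, so the percentile 50 row is the median origin and the percentile 100 row is the largest single change. The Python sketch below mimics that with exact quantiles from the standard library, whereas the query uses BigQuery's approximate `APPROX_QUANTILES`. The function names and the example values are assumptions for illustration only.

```python
import statistics


def percentile_boundaries(values):
    """Return 101 boundaries: the minimum, the 1st to 99th percentiles, and the maximum."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return [min(values)] + cuts + [max(values)]


def diff_at_percentiles(diffs, percentiles=(10, 25, 50, 75, 90, 100)):
    """Pick the boundary at each requested percentile, like indexing with OFFSET(percentile)."""
    boundaries = percentile_boundaries(diffs)
    return {p: boundaries[p] for p in percentiles}


# Hypothetical per-origin pass-rate differences (new month minus old month).
example_diffs = [-0.05, -0.02, 0.0, 0.01, 0.03, 0.08]
summary = diff_at_percentiles(example_diffs)
```

Because index 100 is the maximum, a large value in that row reflects one outlier origin rather than a typical effect.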

Member

@tunetheweb tunetheweb left a comment

I think it would be good to have two variables at the top:

# UPDATE THIS EACH MONTH
DECLARE DATE_TO_QUERY DATE DEFAULT '2024-02-01';

DECLARE DATE_TO_COMPARE DATE DEFAULT DATE_SUB(DATE_TO_QUERY, INTERVAL 1 MONTH);
....
WHERE
  date = DATE_TO_QUERY
...
WHERE
  date = DATE_TO_COMPARE
...
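
For readers less familiar with BigQuery scripting, the relationship between the two variables can be sketched in Python. This is only an illustration: `previous_month` is a hypothetical helper, and it assumes the dates are always the first day of a month, as they are in the results table.

```python
from datetime import date


def previous_month(d: date) -> date:
    """First day of the month before `d`.

    Matches DATE_SUB(d, INTERVAL 1 MONTH) only when `d` is a first-of-month date.
    """
    if d.month == 1:
        return date(d.year - 1, 12, 1)
    return date(d.year, d.month - 1, 1)


DATE_TO_QUERY = date(2024, 2, 1)  # update this each month
DATE_TO_COMPARE = previous_month(DATE_TO_QUERY)
```

Deriving the second date from the first means only one value has to be edited for each monthly run.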

@felixarntz
Collaborator Author

@tunetheweb Great idea, updated in 54d2fe4.

I also updated the date and result in the description now that CrUX March data is available. While the dataset is much larger now than before, the data so far doesn't seem to show a notable win. Potentially it's still too small a dataset to average out the influence of other factors on metrics like LCP and INP.

Collaborator

@joemcgill joemcgill left a comment

This looks good to me.

Member

@tunetheweb tunetheweb left a comment

LGTM with one nit.

Shame we don't see more of a difference yet, but as you say it could be a very small sample set, so any effect is lost in the noise.

APPROX_QUANTILES(newCrux.inp_pass_rate - oldCrux.inp_pass_rate, 100)[OFFSET(percentile)] AS inp_diff,
APPROX_QUANTILES(newCrux.cls_pass_rate - oldCrux.cls_pass_rate, 100)[OFFSET(percentile)] AS cls_diff,
APPROX_QUANTILES(newCrux.fcp_pass_rate - oldCrux.fcp_pass_rate, 100)[OFFSET(percentile)] AS fcp_diff,
APPROX_QUANTILES(newCrux.ttfb_pass_rate - oldCrux.ttfb_pass_rate, 100)[OFFSET(percentile)] AS ttfb_diff,
Member

@tunetheweb tunetheweb Apr 10, 2024

Suggested change
- APPROX_QUANTILES(newCrux.ttfb_pass_rate - oldCrux.ttfb_pass_rate, 100)[OFFSET(percentile)] AS ttfb_diff,
+ -- Commenting out TTFB as does not include prerendered pages so kinda pointless for this analysis
+ -- https://developer.chrome.com/docs/crux/methodology/metrics#ttfb-metric
+ --APPROX_QUANTILES(newCrux.ttfb_pass_rate - oldCrux.ttfb_pass_rate, 100)[OFFSET(percentile)] AS ttfb_diff,

Collaborator Author

Leaving it for completeness, but of course your comment makes sense.

@felixarntz felixarntz merged commit 3363f86 into main Apr 12, 2024
3 checks passed
@adamsilverstein
Collaborator

Thanks for adding this, @felixarntz. Since we don't know when during a month the site started using the feature, I wonder if it makes more sense to compare the month before they adopted it to the month after they adopted it (vs. the month of adoption). Running the query that way might show more of the impact because it would always include a full month of the site using the feature.

@felixarntz
Collaborator Author

@adamsilverstein Sorry, it looks like I merged this just before you posted your comment.

Regarding your feedback, that sounds like an interesting idea. Just to clarify, you mean for example to use sites that:

  • didn't use the feature in 2024-02
  • use the feature in 2024-03 and 2024-04
  • compare the data from 2024-02 with 2024-04 so we know for sure the data is based on not using the feature vs using the feature

I think that would have the great benefit of ensuring the comparison is clean in terms of whether the feature is used. Of course it would be over a broader time span, so the tradeoff is that more other changes could have happened in that time. But worth trying.

One additional caveat: Based on the above example, some sites may already have used the feature towards the end of 2024-02, after the HTTP Archive pipeline ran. In that case, the CrUX data for 2024-02 would also be based on a mix of not using vs using the feature. So to make it completely "clean" from that perspective we would need to do it over 4 months. For example:

  • consider sites that didn't use the feature in 2024-01 and 2024-02 but used it in 2024-03 and 2024-04
  • compare the data from 2024-01 and 2024-04

Of course that's still not a guarantee; we can't get one. But assuming that sites activate the feature at one point and then leave it active is IMO reasonable for the vast majority of sites, so maybe this approach works.
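
Both the three-month and the four-month variants above can be expressed with one selection rule. The Python sketch below is a hedged illustration of that rule, not the follow-up query itself: `clean_cohort`, the month keys, and the example origins are all hypothetical.

```python
def clean_cohort(usage_by_month, months_without, months_with):
    """Select origins absent in every `months_without` month and present in every `months_with` month.

    Returns (cohort, baseline_month, comparison_month), where the baseline is the
    earliest "without" month and the comparison is the latest "with" month.
    """
    if not months_without or not months_with:
        raise ValueError("need at least one month on each side")
    adopters = set.intersection(*(set(usage_by_month[m]) for m in months_with))
    prior_users = set().union(*(usage_by_month[m] for m in months_without))
    return adopters - prior_users, min(months_without), max(months_with)


# Hypothetical detection results per month (origins where the feature was found).
usage = {
    "2024-01": {"https://a.example", "https://e.example"},
    "2024-02": {"https://a.example", "https://b.example"},
    "2024-03": {"https://a.example", "https://b.example", "https://c.example",
                "https://d.example", "https://e.example"},
    "2024-04": {"https://a.example", "https://b.example", "https://c.example",
                "https://e.example"},
}

three_month = clean_cohort(usage, ["2024-02"], ["2024-03", "2024-04"])
four_month = clean_cohort(usage, ["2024-01", "2024-02"], ["2024-03", "2024-04"])
```

In this made-up data, `https://d.example` drops out because it stopped using the feature in 2024-04, and the four-month variant additionally drops `https://e.example` because it already appeared in 2024-01.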

@tunetheweb @westonruter Curious what you think about this idea. We could try it in a separate query in a follow up PR.

@tunetheweb
Member

I think it's a great idea! CrUX data is over the whole month and it takes time for performance improvements to show in that due to the whole p75 thing as well.
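
A toy Python model can make the p75 point concrete. It assumes, purely for illustration, that every page load before adoption takes one fixed value and every load after adoption takes another, with the same number of loads each day. Real field data is far noisier, and the names below are hypothetical.

```python
import statistics


def p75(samples):
    """75th percentile of the samples (inclusive method)."""
    return statistics.quantiles(samples, n=4, method="inclusive")[2]


def monthly_p75(before_ms, after_ms, days_in_month, adoption_day, loads_per_day=10):
    """p75 over a whole month in which the feature goes live on `adoption_day`."""
    days_before = adoption_day - 1
    days_after = days_in_month - days_before
    samples = ([before_ms] * (days_before * loads_per_day)
               + [after_ms] * (days_after * loads_per_day))
    return p75(samples)


# Hypothetical LCP values: 3000 ms before adoption, 2000 ms after.
mid_month = monthly_p75(3000, 2000, days_in_month=30, adoption_day=16)
full_month = monthly_p75(3000, 2000, days_in_month=30, adoption_day=1)
```

In this model, adopting halfway through the month leaves the monthly p75 at the old value, because more than a quarter of the samples are still slow. Only a month with the feature active throughout shows the full improvement.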

@westonruter
Collaborator

My only concern is that the longer the time span between comparisons, the more likely it is that things other than the newly active feature will impact performance. So it may be cleaner, but it may also be fuzzier at the same time.

5 participants