Introduce emulation of CPU throttling to benchmark-web-vitals #59

westonruter · 2023-07-25T18:26:07Z

Introduces a new -t/--throttle-cpu option to the benchmark-web-vitals command that allows you to supply a factor for emulating CPU throttling. This factor argument is passed to Puppeteer's Page.emulateCPUThrottling() API.
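As a rough sketch of how such a factor might be forwarded, the helper below guards a call to Puppeteer's `page.emulateCPUThrottling()`. The helper name `applyCpuThrottling` and the stub page are assumptions for illustration, not code from this PR. With real Puppeteer, `page` would come from `browser.newPage()`.

```javascript
// Hypothetical helper (not the PR's implementation): forward an optional
// throttling factor to Puppeteer's page.emulateCPUThrottling().
// A factor of 1 means no throttling; 4 means a 4x slowdown.
async function applyCpuThrottling(page, factor) {
  if (factor === undefined || factor === null) {
    return false; // Option not supplied: leave the CPU unthrottled.
  }
  await page.emulateCPUThrottling(factor);
  return true;
}

// Stand-in for a Puppeteer Page so the sketch runs without a browser.
const stubPage = {
  calls: [],
  async emulateCPUThrottling(factor) {
    this.calls.push(factor);
  },
};

applyCpuThrottling(stubPage, 4).then((applied) => {
  // Logs whether throttling was applied and which factors the stub received.
  console.log(applied, stubPage.calls);
});
```

Skipping the call when no factor is given keeps the default run identical to the behavior before the option existed.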

Cheers to @swissspidy for alerting me to the Puppeteer API for this.

This PR also improves parsing of command line options into parameters, ensuring strict typing.
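To illustrate what strictly typed option parsing can look like, here is a minimal sketch. The function name `getParamsFromOptions`, the parameter names, the default of one request, and the rule that a factor below 1 is rejected are all assumptions. Only the `--url`, `-n`, and `-t` flags come from this PR.

```javascript
// Hypothetical sketch of turning raw CLI option strings into typed params.
// Option keys mirror the flags used in this PR (--url, -n, -t); everything
// else (names, defaults, validation rules) is assumed for illustration.
function getParamsFromOptions(opt) {
  if (typeof opt.url !== 'string' || opt.url === '') {
    throw new Error('A --url must be supplied.');
  }

  const params = { url: opt.url, amount: 1, cpuThrottleFactor: null };

  if (opt.number !== undefined) {
    const amount = Number.parseInt(opt.number, 10);
    if (!Number.isInteger(amount) || amount < 1) {
      throw new Error(`Invalid -n value: ${opt.number}`);
    }
    params.amount = amount;
  }

  if (opt.throttleCpu !== undefined) {
    const factor = Number.parseFloat(opt.throttleCpu);
    if (!Number.isFinite(factor) || factor < 1) {
      throw new Error(`Invalid -t value: ${opt.throttleCpu}`);
    }
    params.cpuThrottleFactor = factor;
  }

  return params;
}

// Strings in, numbers out.
console.log(
  getParamsFromOptions({ url: 'http://localhost:10018/', number: '10', throttleCpu: '4' })
);
```

Validating in one place means the rest of the command can rely on numbers being numbers instead of re-checking string input at each use.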

I was surprised to find that increasing the throttling factor reduces the relative LCP-TTFB savings of v6.3-RC2 over v6.2 (48% unthrottled, 42% at 4x, 27% at 8x).

WordPress 6.2 vs 6.3-RC2 without CPU Throttling

WordPress 6.2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10018/ -n 10
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10018/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 108.75                  ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 108.75                  ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 36.25                   ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 72.7                    ║
╚═══════════════════╧═════════════════════════╝

WordPress 6.3-RC2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10023/ -n 10
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10023/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 71.9                    ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 71.9                    ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 33.9                    ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 37.45                   ║
╚═══════════════════╧═════════════════════════╝

👉 48% reduction in LCP-TTFB

WordPress 6.2 vs 6.3-RC2 with 4x CPU Throttling

WordPress 6.2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10018/ -n 10 -t 4
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10018/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 258.3                   ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 258.3                   ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 48.25                   ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 213.65                  ║
╚═══════════════════╧═════════════════════════╝

WordPress 6.3-RC2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10023/ -n 10 -t 4
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10023/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 190.25                  ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 190.25                  ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 39.85                   ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 123.6                   ║
╚═══════════════════╧═════════════════════════╝

👉 42% reduction in LCP-TTFB

WordPress 6.2 vs 6.3-RC2 with 8x CPU Throttling

WordPress 6.2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10018/ -n 10 -t 8
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10018/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 413.7                   ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 425.7                   ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 61.6                    ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 371.15                  ║
╚═══════════════════╧═════════════════════════╝

WordPress 6.3-RC2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10023/ -n 10 -t 8
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10023/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 326.4                   ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 326.4                   ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 65.35                   ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 272.65                  ║
╚═══════════════════╧═════════════════════════╝

👉 27% reduction in LCP-TTFB
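For reference, the percentages above can be reproduced from the LCP-TTFB medians in the tables. This is only a sketch of the arithmetic. The function and variable names are made up here, and the absolute difference is included alongside the relative one for comparison.

```javascript
// Median LCP-TTFB values copied from the tables above.
const lcpTtfbMedians = [
  { throttle: 1, wp62: 72.7, wp63rc2: 37.45 },
  { throttle: 4, wp62: 213.65, wp63rc2: 123.6 },
  { throttle: 8, wp62: 371.15, wp63rc2: 272.65 },
];

// Absolute difference and relative reduction (in percent) between two medians.
function compareMedians(before, after) {
  const absolute = before - after;
  return { absolute, relativePct: (absolute / before) * 100 };
}

for (const { throttle, wp62, wp63rc2 } of lcpTtfbMedians) {
  const { absolute, relativePct } = compareMedians(wp62, wp63rc2);
  console.log(
    `${throttle}x: ${absolute.toFixed(2)} absolute, ${relativePct.toFixed(1)}% relative reduction`
  );
}
```

Rounded to whole percentages this gives 48%, 42%, and 27%. The absolute savings are roughly 35.25, 90.05, and 98.5, so they grow with the throttling factor even while the relative reduction shrinks.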

cli/commands/benchmark-web-vitals.mjs

felixarntz

Thanks @westonruter, code-wise this looks good to me. I also like the idea of a dedicated function to handle the arguments.

The numbers you gathered above are really interesting, it suggests the relative improvement gets lower the "slower" the CPU is? I think it would be great to expand on that in a post or doc or even dedicated GitHub issue, e.g. also summarizing how CPU throttling affects the other metrics, like LCP and TTFB individually. And then also what do the absolute differences look like? Is that more consistent between different throttling "levels"?

Interestingly, our recent 6.3 RC1 benchmarking results from different contributors show something different, where e.g. my benchmarks, which overall were slower than those of @swissspidy, still showed a greater relative improvement. Of course that may be influenced by lots of other factors, so it can't necessarily be taken as "source of truth".

Code-wise the PR is good to go, but it would be useful to write up a broader summary of this analysis so that we can share it as a reference point on how a slower or throttled CPU can affect the benchmarks that we are running.

westonruter · 2023-07-28T22:03:49Z

The numbers you gathered above are really interesting, it suggests the relative improvement gets lower the "slower" the CPU is? I think it would be great to expand on that in a post or doc or even dedicated GitHub issue, e.g. also summarizing how CPU throttling affects the other metrics, like LCP and TTFB individually. And then also what do the absolute differences look like? Is that more consistent between different throttling "levels"?

I've re-checked the results after updating to the latest Puppeteer and enabling the new headless mode from #61, and I got results that are closer to what I expected.

I made a wrapper script that facilitates gathering the results, including a 5-second sleep between calls to benchmark-web-vitals to give the CPU a chance to catch its breath:

grab-throttle-metrics.php

<?php

$number = 10;
$additional_factors = [ 4, 6 ];

$urls = [
	'WP 6.2' => 'http://localhost:10018/',
	'WP 6.3' => 'http://localhost:10023/',
];

$results = [];
foreach ( $urls as $label => $url ) {
	$results[ $label ] = [];

	foreach ( [1, ...$additional_factors] as $factor ) {
		$cmd = sprintf(
			'npm run research -- benchmark-web-vitals --url %s -n %s -o csv -t %s',
			escapeshellarg( $url ),
			escapeshellarg( $number ),
			escapeshellarg( $factor )
		);
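		$output = shell_exec( $cmd );

		// shell_exec() returns null or false on failure, so check for a string before matching.
		if ( ! is_string( $output ) || ! preg_match( '/LCP-TTFB \(median\),(.+)/', $output, $matches ) ) {
			echo "Missing LCP-TTFB in output: $output\nCommand: $cmd\n";
			exit( 1 );
		}

		// Despite the "LCP" column labels below, the captured value is the LCP-TTFB median.
		$lcp_ttfb = floatval( $matches[1] );

		$results[ $label ][ $factor ] = $lcp_ttfb;

		// Report progress on STDERR so STDOUT carries only the final CSV.
		fwrite( STDERR, "$cmd: $lcp_ttfb\n" );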

		sleep( 5 );
	}
}

$columns = [
	'Site',
	'1x LCP',
];

foreach ( $additional_factors as $additional_factor ) {
	$columns[] = "{$additional_factor}x LCP";
	$columns[] = "{$additional_factor}x Diff";
}

print join( ',', $columns ) . PHP_EOL;

foreach ( $results as $label => $metrics ) {
	echo "$label";

	$baseline_metric = $metrics[1];
	unset( $metrics[1] );
	printf( ',%.01f', round( $baseline_metric, 1 ) );

	foreach ( $metrics as $metric ) {
		printf( ',%.01f', round( $metric, 1 ) );
		printf( ',%.1f%%', round( ( $metric / $baseline_metric ) * 100, 1 ) );
	}

	echo PHP_EOL;
}

Here are the results for 1x, 4x, and 6x throttling. The "LCP" columns hold the median LCP-TTFB of 10 requests (the metric the script extracts), and each Diff column expresses the throttled value as a percentage of the 1x baseline for that site:

Site	1x LCP	4x LCP	4x Diff	6x LCP	6x Diff
WP 6.2	54.0	202.5	375.0%	301.3	557.9%
WP 6.3	37.2	116.8	314.3%	199.4	536.7%
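The Diff columns follow the `( $metric / $baseline_metric ) * 100` expression in the PHP script. A small sketch of the same arithmetic, with a function name invented for this example:

```javascript
// Throttled value expressed as a percentage of the unthrottled (1x) baseline,
// mirroring ( $metric / $baseline_metric ) * 100 in the PHP script.
function percentOfBaseline(metric, baseline) {
  return (metric / baseline) * 100;
}

// 4x value and 1x baseline taken from the "WP 6.2" row of the table.
console.log(percentOfBaseline(202.5, 54.0).toFixed(1) + '%'); // 375.0%
```

Recomputing the other cells from the table can differ in the last digit, because the script divides the unrounded medians and only rounds for display.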

Introduce emulation of CPU throttling

27f2b2a

westonruter requested a review from felixarntz July 25, 2023 18:26

swissspidy reviewed Jul 26, 2023

felixarntz approved these changes Jul 26, 2023

Remove soon-to-be invalid comment regarding ClickOptions

a58609b

swissspidy approved these changes Jul 28, 2023

swissspidy merged commit c46b377 into main Jul 28, 2023

westonruter deleted the add/throttle-parameter branch July 28, 2023 22:17

westonruter mentioned this pull request Sep 11, 2023

Performance analysis using an old low-powered Android phone WordPress/performance#785

Closed
