Introduce emulation of CPU throttling to benchmark-web-vitals #59

Merged
merged 2 commits into from
Jul 28, 2023

westonruter

Introduces a new -t/--throttle-cpu option to the benchmark-web-vitals command that allows you to supply a factor for emulating CPU throttling. This factor argument is passed to Puppeteer's Page.emulateCPUThrottling() API.
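
For context, here is a minimal sketch of how such a factor might be handed to Puppeteer. `Page.emulateCPUThrottling()` is the real Puppeteer method, where a factor of 1 means no throttling and 2 means a 2x slowdown. The `ThrottleablePage` interface and the `applyCpuThrottle` helper are hypothetical names, introduced so the example runs without a browser; this is not the code from the PR.

```typescript
// Structural stand-in for the one Puppeteer Page method used here,
// so the sketch does not need the puppeteer package to type-check.
interface ThrottleablePage {
  emulateCPUThrottling(factor: number | null): Promise<void>;
}

// Hypothetical helper: apply throttling only when a factor was supplied.
// Returns true if throttling was requested, false if the option was omitted.
async function applyCpuThrottle(
  page: ThrottleablePage,
  factor: number | undefined
): Promise<boolean> {
  if (factor === undefined) {
    return false; // option omitted: leave the CPU unthrottled
  }
  await page.emulateCPUThrottling(factor);
  return true;
}
```

A real Puppeteer `Page` satisfies this interface, so the helper could be called once per page before navigating to the URL under test.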

Cheers to @swissspidy for alerting me to the Puppeteer API for this.

This PR also improves how command-line options are parsed into parameters, ensuring strict typing.
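
As an illustration of what strictly typed option parsing can look like, the sketch below converts raw string options into a typed parameter object. Everything in it is an assumption made for the example: the `RawOptions` and `Params` shapes, the `getParamsFromOptions` name, the default of 1 request, and the choice to reject factors below 1. The PR's actual function may differ.

```typescript
// Hypothetical shape of the raw CLI options (all values arrive as strings).
interface RawOptions {
  url?: string;
  number?: string;
  throttleCpu?: string;
}

// Hypothetical strictly typed parameters used by the rest of the command.
interface Params {
  url: string;
  amount: number;
  cpuThrottleFactor: number | null;
}

function getParamsFromOptions(opt: RawOptions): Params {
  if (!opt.url) {
    throw new Error("A --url value is required.");
  }

  // Assumed default of 1 request when -n is omitted.
  const amount = opt.number === undefined ? 1 : parseInt(opt.number, 10);
  if (!isFinite(amount) || amount < 1) {
    throw new Error("The -n value must be a positive integer.");
  }

  // null means "do not throttle".
  let cpuThrottleFactor: number | null = null;
  if (opt.throttleCpu !== undefined) {
    cpuThrottleFactor = parseFloat(opt.throttleCpu);
    // Rejecting factors below 1 is a choice of this sketch.
    if (!isFinite(cpuThrottleFactor) || cpuThrottleFactor < 1) {
      throw new Error("The -t value must be a number of at least 1.");
    }
  }

  return { url: opt.url, amount, cpuThrottleFactor };
}
```

Validating once at the boundary means the benchmarking code can rely on `amount` and `cpuThrottleFactor` being real numbers rather than re-checking strings.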

I was surprised to find that increasing the throttling factor reduces the relative LCP-TTFB savings of v6.3-RC2 over v6.2.

WordPress 6.2 vs 6.3-RC2 without CPU Throttling

WordPress 6.2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10018/ -n 10
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10018/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 108.75                  ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 108.75                  ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 36.25                   ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 72.7                    ║
╚═══════════════════╧═════════════════════════╝

WordPress 6.3-RC2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10023/ -n 10
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10023/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 71.9                    ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 71.9                    ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 33.9                    ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 37.45                   ║
╚═══════════════════╧═════════════════════════╝

👉 48% reduction in LCP-TTFB

WordPress 6.2 vs 6.3-RC2 with 4x CPU Throttling

WordPress 6.2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10018/ -n 10 -t 4
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10018/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 258.3                   ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 258.3                   ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 48.25                   ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 213.65                  ║
╚═══════════════════╧═════════════════════════╝

WordPress 6.3-RC2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10023/ -n 10 -t 4
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10023/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 190.25                  ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 190.25                  ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 39.85                   ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 123.6                   ║
╚═══════════════════╧═════════════════════════╝

👉 42% reduction in LCP-TTFB

WordPress 6.2 vs 6.3-RC2 with 8x CPU Throttling

WordPress 6.2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10018/ -n 10 -t 8
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10018/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 413.7                   ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 425.7                   ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 61.6                    ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 371.15                  ║
╚═══════════════════╧═════════════════════════╝

WordPress 6.3-RC2:

$ npm run research -- benchmark-web-vitals --url http://localhost:10023/ -n 10 -t 8
╔═══════════════════╤═════════════════════════╗
║ URL               │ http://localhost:10023/ ║
╟───────────────────┼─────────────────────────╢
║ Success Rate      │ 100%                    ║
╟───────────────────┼─────────────────────────╢
║ FCP (median)      │ 326.4                   ║
╟───────────────────┼─────────────────────────╢
║ LCP (median)      │ 326.4                   ║
╟───────────────────┼─────────────────────────╢
║ TTFB (median)     │ 65.35                   ║
╟───────────────────┼─────────────────────────╢
║ LCP-TTFB (median) │ 272.65                  ║
╚═══════════════════╧═════════════════════════╝

👉 27% reduction in LCP-TTFB
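
The percentages above can be reproduced from the LCP-TTFB medians in the tables. The sketch below computes both the absolute and the relative saving at each throttling factor; the `Comparison` type and the `lcpTtfbSavings` function are names made up for this example, and the numbers are copied from the tables.

```typescript
// Median LCP-TTFB values copied from the tables above.
interface Comparison {
  factor: number; // CPU throttling factor (1 = unthrottled)
  wp62: number;
  wp63: number;
}

const comparisons: Comparison[] = [
  { factor: 1, wp62: 72.7, wp63: 37.45 },
  { factor: 4, wp62: 213.65, wp63: 123.6 },
  { factor: 8, wp62: 371.15, wp63: 272.65 },
];

// Absolute saving, and the saving relative to the older release in percent.
function lcpTtfbSavings(c: Comparison): { absolute: number; relativePercent: number } {
  const absolute = c.wp62 - c.wp63;
  return { absolute, relativePercent: (absolute / c.wp62) * 100 };
}

const savings = comparisons.map(lcpTtfbSavings);
```

With these inputs the relative savings come out at roughly 48%, 42%, and 27%, while the absolute savings are about 35.25, 90.05, and 98.5 in the same units as the tables.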

@felixarntz felixarntz left a comment

Thanks @westonruter, code-wise this looks good to me. I also like the idea of a dedicated function to handle the arguments.

The numbers you gathered above are really interesting, it suggests the relative improvement gets lower the "slower" the CPU is? I think it would be great to expand on that in a post or doc or even dedicated GitHub issue, e.g. also summarizing how CPU throttling affects the other metrics, like LCP and TTFB individually. And then also what do the absolute differences look like? Is that more consistent between different throttling "levels"?

Interestingly, our recent 6.3 RC1 benchmarking results from different contributors show something different, where e.g. my benchmarks, which overall were slower than those of @swissspidy, still showed a greater relative improvement. Of course that may be influenced by lots of other factors, so it can't necessarily be taken as "source of truth".

Code-wise the PR is good to go, but it would be useful to write up a broader summary of this analysis so that we can share it as a reference point on how a slower or throttled CPU can affect the benchmarks that we are running.

@swissspidy swissspidy merged commit c46b377 into main Jul 28, 2023
2 checks passed
@westonruter

> The numbers you gathered above are really interesting, it suggests the relative improvement gets lower the "slower" the CPU is? I think it would be great to expand on that in a post or doc or even dedicated GitHub issue, e.g. also summarizing how CPU throttling affects the other metrics, like LCP and TTFB individually. And then also what do the absolute differences look like? Is that more consistent between different throttling "levels"?

I've re-checked the results after updating to the latest Puppeteer and enabling the new headless mode from #61, and the results are now closer to what I expected.

I made a wrapper script that facilitates gathering the results, including a 5-second sleep between calls to benchmark-web-vitals to give the CPU a chance to catch its breath:

grab-throttle-metrics.php

<?php

$number = 10;
$additional_factors = [ 4, 6 ];

$urls = [
	'WP 6.2' => 'http://localhost:10018/',
	'WP 6.3' => 'http://localhost:10023/',
];

$results = [];
foreach ( $urls as $label => $url ) {
	$results[ $label ] = [];

	foreach ( [1, ...$additional_factors] as $factor ) {
		$cmd = sprintf(
			'npm run research -- benchmark-web-vitals --url %s -n %s -o csv -t %s',
			escapeshellarg( $url ),
			escapeshellarg( $number ),
			escapeshellarg( $factor )
		);
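		// shell_exec() returns null or false on failure, so cast to string before matching.
		$output = (string) shell_exec( $cmd );

		// Pull the median LCP-TTFB value out of the CSV output.
		if ( ! preg_match( '/LCP-TTFB \(median\),(.+)/', $output, $matches ) ) {
			echo "Missing LCP-TTFB in output: $output\nCommand: $cmd\n";
			exit( 1 );
		}

		$lcp_ttfb = floatval( $matches[1] );

		$results[ $label ][ $factor ] = $lcp_ttfb;

		fwrite( STDERR, "$cmd: $lcp_ttfb\n" );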

		sleep( 5 );
	}
}

$columns = [
	'Site',
	'1x LCP',
];

foreach ( $additional_factors as $additional_factor ) {
	$columns[] = "{$additional_factor}x LCP";
	$columns[] = "{$additional_factor}x Diff";
}

print join( ',', $columns ) . PHP_EOL;

foreach ( $results as $label => $metrics ) {
	echo "$label";

	$baseline_metric = $metrics[1];
	unset( $metrics[1] );
	printf( ',%.01f', round( $baseline_metric, 1 ) );

	foreach ( $metrics as $metric ) {
		printf( ',%.01f', round( $metric, 1 ) );
		printf( ',%.1f%%', round( ( $metric / $baseline_metric ) * 100, 1 ) );
	}

	echo PHP_EOL;
}
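
For anyone who would rather do the extraction step in the project's own language, here is a rough TypeScript equivalent of the `preg_match` call above. It assumes, as the regular expression in the script implies, that the CSV output contains a line of the form `LCP-TTFB (median),<value>`. The `extractMetric` name is made up for this sketch, and unlike the unanchored regex it requires the label to match the first CSV field exactly.

```typescript
// Return the numeric value of the first CSV row whose first field equals
// `label`, or null when no such row exists or the value is not a number.
function extractMetric(csv: string, label: string): number | null {
  for (const line of csv.split(/\r?\n/)) {
    const comma = line.indexOf(",");
    if (comma === -1) {
      continue; // not a "label,value" row
    }
    if (line.slice(0, comma).trim() === label) {
      const value = parseFloat(line.slice(comma + 1));
      return isNaN(value) ? null : value;
    }
  }
  return null;
}
```

Returning `null` instead of exiting leaves it to the caller to decide whether a missing metric should abort the whole run, as the PHP script does.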

Here are the results for 1x, 4x, and 6x throttling. Each "LCP" column holds the median LCP-TTFB over 10 requests (the script extracts the `LCP-TTFB (median)` row), and each "Diff" column expresses the throttled value as a percentage of that site's 1x baseline:

| Site | 1x LCP | 4x LCP | 4x Diff | 6x LCP | 6x Diff |
|---|---|---|---|---|---|
| WP 6.2 | 54.0 | 202.5 | 375.0% | 301.3 | 557.9% |
| WP 6.3 | 37.2 | 116.8 | 314.3% | 199.4 | 536.7% |
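
The table compares each site against its own baseline, not the two releases against each other. As a small follow-up sketch, the snippet below derives the 6.3-over-6.2 reduction at each throttling factor from the rounded values in the table. The variable names and the `percentReduction` helper are invented for the example.

```typescript
// Medians per throttling factor, copied from the table above (rounded values).
const factors = [1, 4, 6];
const wp62Medians = [54.0, 202.5, 301.3];
const wp63Medians = [37.2, 116.8, 199.4];

// Percentage by which `after` is lower than `before`.
function percentReduction(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

const reductionByFactor = factors.map((factor, i) => ({
  factor,
  percent: percentReduction(wp62Medians[i], wp63Medians[i]),
}));
```

From the rounded table values this works out to roughly 31.1%, 42.3%, and 33.8% at 1x, 4x, and 6x respectively.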
