Revert "vo_gpu: remove --scaler-lut-size" #13291

hegdenischay · 2024-01-16T00:14:30Z

This reverts commit 44cf628.

Fixes issue: #13273
scaler-lut-size=8 causes a performance regression on my machine, therefore it'd be a good idea to keep the default at 6 and add it to profile=high-quality

This reverts commit 44cf628.

github-actions · 2024-01-16T16:48:49Z

Download the artifacts for this pull request:

Windows

llyyr · 2024-01-16T17:17:48Z

The default should probably be 2^8 for the baseline and set to 2^6 for profile=fast instead. Based on some benchmarks done on an 11th gen Intel NUC iirc and my own 13th gen iGPU, the difference was insignificant which is why it was hardcoded to 8 in the first place. It's possible though that this matters for ancient iGPUs.

sfan5 · 2024-01-17T18:38:52Z

It'd help if you described what "your machine" looks like.

llyyr · 2024-01-17T19:01:58Z

It'd help if you described what "your machine" looks like.

From the linked issue:

GPU: Intel HD Graphics 4400 (Intel i5-4300U CPU)

sfan5 · 2024-01-18T10:57:25Z

That's not totally ancient so I think bringing back the option makes sense.

haasn · 2024-01-18T15:23:03Z

The default should probably be 2^8 for the baseline and set to 2^6 for profile=fast instead. Based on some benchmarks done on an 11th gen Intel NUC iirc and my own 13th gen iGPU, the difference was insignificant which is why it was hardcoded to 8 in the first place. It's possible though that this matters for ancient iGPUs.

+1

kasper93 · 2024-01-18T20:56:36Z

and set to 2^6 for profile=fast instead

This doesn't make sense. profile=fast does not use any luts in part to avoid any ancient iGPU slowness. It is meant to be "dumb" by design. It uses direct sampling.

llyyr · 2024-01-18T21:07:09Z

This doesn't make sense. profile=fast does not use any luts in part to avoid any ancient iGPU slowness. It is meant to be "dumb" by design. It uses direct sampling.

Yeah, sorry I forgot about that. Just bringing it back and defaulting to 8 is good enough then.

kasper93 · 2024-01-18T21:18:01Z

keep the default at 6 and add it to profile=high-quality
but it only happens with scale=ewa_lanczossharp. cscale=ewa_lanczossharp

Since we are talking about polar scalers here, I wonder if it is related to https://code.videolan.org/videolan/libplacebo/-/issues/310 on older Intel iGPU noncompute version can be over 3 times as fast, and on your gpu the difference may be even higher.

@hegdenischay: would you mind doing some benchmarks, just so we have whole picture?

sfan5 · 2024-01-18T22:27:29Z

his doesn't make sense. profile=fast does not use any luts in part to avoid any ancient iGPU slowness

It's not far-fetched that someone would start with profile=fast and enable more demanding options as needed. I think it still makes sense.

kasper93 · 2024-01-18T23:02:49Z

It's not far-fetched that someone would start with profile=fast and enable more demanding options as needed. I think it still makes sense.

It is not, but in the same time you can say the same about many other options and adding dummy options to profile only bloat the profile itself. Also this is related to some over decade old gpu.

hegdenischay · 2024-01-22T15:14:16Z

Hi, sorry for not commenting earlier. Was kinda swamped at work. I'm mainly interested in bringing it back as an option, since I currently can't use this scaler without messing up the video. The part about adding it to profile=high-quality is just me being lazy with the commit.

I don't mind running benchmarks.

hegdenischay · 2024-01-22T15:40:30Z

The relevant parts (vulkaninfo says my GPU is a HSW GT2):

'polar':	 43 frames in 1.383055 seconds => 32.164070 ms/frame (31.09 FPS), gpu time: 23.259074 ms
'polar_nocompute':	 33 frames in 1.520992 seconds => 46.090654 ms/frame (21.70 FPS), gpu time: 30.763520 ms

I have also attached a full log:
testlog.txt

I'm not sure why polar_nocompute is slower, essentially the opposite of what was seen in that issue.

kasper93 · 2024-01-22T15:52:16Z

Interesting, makes it little bit more complex to handle if we were to "smartly" enable compute shader.

Just out of curiosity, what is the performance on this benchmark when you reduce the size. (can also try 128 for completeness)

diff --git a/src/shaders/sampling.c b/src/shaders/sampling.c
index fc10f80b..e754ea00 100644
--- a/src/shaders/sampling.c
+++ b/src/shaders/sampling.c
@@ -559,7 +559,7 @@ struct sh_sampler_obj {
     pl_shader_obj pass2; // for pl_shader_sample_ortho
 };
 
-#define SCALER_LUT_SIZE     256
+#define SCALER_LUT_SIZE     64
 #define SCALER_LUT_CUTOFF   1e-3f
 
 static void sh_sampler_uninit(pl_gpu gpu, void *ptr)

hegdenischay · 2024-01-22T16:39:57Z

with SCALER_LUT_SIZE = 64:

'polar':	 42 frames in 1.383859 seconds => 32.949020 ms/frame (30.35 FPS), gpu time: 23.302960 ms
'polar_nocompute':	 34 frames in 1.513501 seconds => 44.514732 ms/frame (22.46 FPS), gpu time: 30.203909 ms

with SCALER_LUT_SIZE = 128:

'polar':	 43 frames in 1.400678 seconds => 32.573914 ms/frame (30.70 FPS), gpu time: 23.413177 ms
'polar_nocompute':	 34 frames in 1.477449 seconds => 43.454386 ms/frame (23.01 FPS), gpu time: 29.327269 ms

Logs:
testlog-lut-size-128.txt
testlog-lut-size-64.txt

Revert "vo_gpu: remove --scaler-lut-size"

b90ad67

This reverts commit 44cf628.

hegdenischay marked this pull request as ready for review January 16, 2024 00:14

sfan5 requested a review from haasn January 18, 2024 11:14

sfan5 added the priority:stalled label Apr 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revert "vo_gpu: remove --scaler-lut-size" #13291

Revert "vo_gpu: remove --scaler-lut-size" #13291

hegdenischay commented Jan 16, 2024

github-actions bot commented Jan 16, 2024

llyyr commented Jan 16, 2024

sfan5 commented Jan 17, 2024

llyyr commented Jan 17, 2024

sfan5 commented Jan 18, 2024

haasn commented Jan 18, 2024

kasper93 commented Jan 18, 2024 •

edited

Loading

llyyr commented Jan 18, 2024

kasper93 commented Jan 18, 2024 •

edited

Loading

sfan5 commented Jan 18, 2024

kasper93 commented Jan 18, 2024

hegdenischay commented Jan 22, 2024

hegdenischay commented Jan 22, 2024

kasper93 commented Jan 22, 2024

hegdenischay commented Jan 22, 2024

Revert "vo_gpu: remove --scaler-lut-size" #13291

Are you sure you want to change the base?

Revert "vo_gpu: remove --scaler-lut-size" #13291

Conversation

hegdenischay commented Jan 16, 2024

github-actions bot commented Jan 16, 2024

llyyr commented Jan 16, 2024

sfan5 commented Jan 17, 2024

llyyr commented Jan 17, 2024

sfan5 commented Jan 18, 2024

haasn commented Jan 18, 2024

kasper93 commented Jan 18, 2024 • edited Loading

llyyr commented Jan 18, 2024

kasper93 commented Jan 18, 2024 • edited Loading

sfan5 commented Jan 18, 2024

kasper93 commented Jan 18, 2024

hegdenischay commented Jan 22, 2024

hegdenischay commented Jan 22, 2024

kasper93 commented Jan 22, 2024

hegdenischay commented Jan 22, 2024

kasper93 commented Jan 18, 2024 •

edited

Loading

kasper93 commented Jan 18, 2024 •

edited

Loading