vello_hybrid: add native WebGL backend #1011

Merged
merged 7 commits into from
May 20, 2025

@ajakubowicz-canva ajakubowicz-canva commented May 15, 2025

Context

This PR follows the conversation about #947. I made it a separate PR because it also incorporates the clipping changes from #957.

In short, this PR adds a native WebGL backend to vello_hybrid, used when targeting wasm32 with the "webgl" feature enabled.
The primary motivation for a custom WebGL renderer is binary size: targeting WebGL2 natively removes 3 MB from the binary. This is achieved by omitting wgpu from the binary when the architecture is wasm32 and the "webgl" feature flag is set on vello_hybrid.

Changes

vello_hybrid examples

  • The webgl example has been renamed to wgpu_webgl, which makes it clearer that it uses wgpu's WebGL backend.
  • A native_webgl example has been added which uses the new WebGL renderer backend.
  • ci.yml tests both the wgpu_webgl example and the native_webgl example - smoke testing both webgl techniques.
  • A new ClipScene has been added for manually viewing and testing deeply nested clipping. (file)

The PR can be manually tested by locally pulling the branch and running the two examples:

  • cargo run_wasm -p wgpu_webgl --release: Test original example
  • cargo run_wasm -p native_webgl --release: Test new backend

New vello_sparse_shaders package added

This new package contains the WGSL shaders as the source of truth. vello_hybrid optionally depends on this library, whose build step generates a compiled module. The module contains the GLSL shader source code, as well as mappings from the WGSL identifiers to the naga-mangled identifiers in the GLSL.
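
As a rough illustration of the code-generation half of that build step, the sketch below assembles a Rust module from GLSL that has already been translated. It does not call naga, and `CompiledShader`, `emit_module`, and `write_idents` are hypothetical names chosen for this example, not the actual `vello_sparse_shaders` build script.

```rust
/// Hypothetical input: one shader's translated GLSL plus its identifier
/// mappings. The real build step obtains these from naga; here they are
/// passed in directly.
pub struct CompiledShader<'a> {
    pub name: &'a str,
    pub vertex_source: &'a str,
    pub fragment_source: &'a str,
    /// Pairs of (Rust constant name, mangled GLSL identifier).
    pub vertex_idents: &'a [(&'a str, &'a str)],
    pub fragment_idents: &'a [(&'a str, &'a str)],
}

/// Appends a `pub mod <module>` of string constants, or nothing if there
/// are no identifiers for that stage.
fn write_idents(out: &mut String, module: &str, idents: &[(&str, &str)]) {
    if idents.is_empty() {
        return;
    }
    out.push_str(&format!("    pub mod {module} {{\n"));
    for (name, mangled) in idents {
        out.push_str(&format!(
            "        pub const {name}: &str = \"{mangled}\";\n"
        ));
    }
    out.push_str("    }\n");
}

/// Emits Rust source text for one shader module.
pub fn emit_module(shader: &CompiledShader<'_>) -> String {
    let mut out = String::new();
    out.push_str(&format!("/// Compiled glsl for `{}.wgsl`\n", shader.name));
    out.push_str(&format!("pub mod {} {{\n", shader.name));
    out.push_str(&format!(
        "    pub const VERTEX_SOURCE: &str = r###\"{}\"###;\n",
        shader.vertex_source
    ));
    write_idents(&mut out, "vertex", shader.vertex_idents);
    out.push_str(&format!(
        "    pub const FRAGMENT_SOURCE: &str = r###\"{}\"###;\n",
        shader.fragment_source
    ));
    write_idents(&mut out, "fragment", shader.fragment_idents);
    out.push_str("}\n");
    out
}
```

A build script could write the returned string to a file under `OUT_DIR` and pull it in with `include!`; that wiring is an assumption and is not shown in this PR description.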

The generated code:
// Generated code by `vello_sparse_shaders` - DO NOT EDIT
/// Build time GLSL shaders derived from wgsl shaders.
/// Compiled glsl for `clear_slots.wgsl`
pub mod clear_slots {
    #![allow(missing_docs, reason="No metadata to generate precise documentation forgenerated code.")]

    pub const VERTEX_SOURCE: &str = r###"#version 300 es

precision highp float;
precision highp int;

struct Config {
    uint slot_width;
    uint slot_height;
    uint texture_height;
    uint _padding;
};
uniform Config_block_0Vertex { Config _group_0_binding_0_vs; };

layout(location = 0) in uint _p2vs_location0;

void main() {
    uint vertex_index = uint(gl_VertexID);
    uint index = _p2vs_location0;
    float x = float((vertex_index & 1u));
    float y = float((vertex_index >> 1u));
    uint _e10 = _group_0_binding_0_vs.slot_height;
    float slot_y_offset = float((index * _e10));
    uint _e15 = _group_0_binding_0_vs.slot_width;
    float pix_x = (x * float(_e15));
    uint _e20 = _group_0_binding_0_vs.slot_height;
    float pix_y = (slot_y_offset + (y * float(_e20)));
    uint _e28 = _group_0_binding_0_vs.slot_width;
    float ndc_x = (((pix_x * 2.0) / float(_e28)) - 1.0);
    uint _e37 = _group_0_binding_0_vs.texture_height;
    float ndc_y = (1.0 - ((pix_y * 2.0) / float(_e37)));
    gl_Position = vec4(ndc_x, ndc_y, 0.0, 1.0);
    gl_Position.yz = vec2(-gl_Position.y, gl_Position.z * 2.0 - gl_Position.w);
    return;
}

"###;

    pub mod vertex {
        pub const CONFIG: &str = "Config_block_0Vertex";
    }
    pub const FRAGMENT_SOURCE: &str = r###"#version 300 es

precision highp float;
precision highp int;

struct Config {
    uint slot_width;
    uint slot_height;
    uint texture_height;
    uint _padding;
};
layout(location = 0) out vec4 _fs2p_location0;

void main() {
    vec4 position = gl_FragCoord;
    _fs2p_location0 = vec4(0.0, 0.0, 0.0, 0.0);
    return;
}

"###;
}
/// Compiled glsl for `render_strips.wgsl`
pub mod render_strips {
    #![allow(missing_docs, reason="No metadata to generate precise documentation forgenerated code.")]

    pub const VERTEX_SOURCE: &str = r###"#version 300 es

precision highp float;
precision highp int;

struct Config {
    uint width;
    uint height;
    uint strip_height;
    uint alphas_tex_width_bits;
};
struct StripInstance {
    uint xy;
    uint widths;
    uint col;
    uint rgba_or_slot;
};
struct VertexOutput {
    vec2 tex_coord;
    uint dense_end;
    uint rgba_or_slot;
    vec4 position;
};
uniform Config_block_0Vertex { Config _group_0_binding_1_vs; };

layout(location = 0) in uint _p2vs_location0;
layout(location = 1) in uint _p2vs_location1;
layout(location = 2) in uint _p2vs_location2;
layout(location = 3) in uint _p2vs_location3;
smooth out vec2 _vs2fs_location0;
flat out uint _vs2fs_location1;
flat out uint _vs2fs_location2;

uint unpack_alphas_from_channel(uvec4 rgba, uint channel_index) {
    switch(channel_index) {
        case 0u: {
            return rgba.x;
        }
        case 1u: {
            return rgba.y;
        }
        case 2u: {
            return rgba.z;
        }
        case 3u: {
            return rgba.w;
        }
        default: {
            return rgba.x;
        }
    }
}

vec4 unpack4x8unorm(uint rgba_packed) {
    return vec4((float(((rgba_packed >> 0u) & 255u)) / 255.0), (float(((rgba_packed >> 8u) & 255u)) / 255.0), (float(((rgba_packed >> 16u) & 255u)) / 255.0), (float(((rgba_packed >> 24u) & 255u)) / 255.0));
}

void main() {
    uint in_vertex_index = uint(gl_VertexID);
    StripInstance instance = StripInstance(_p2vs_location0, _p2vs_location1, _p2vs_location2, _p2vs_location3);
    VertexOutput out_ = VertexOutput(vec2(0.0), 0u, 0u, vec4(0.0));
    float x = float((in_vertex_index & 1u));
    float y = float((in_vertex_index >> 1u));
    uint x0_ = (instance.xy & 65535u);
    uint y0_ = (instance.xy >> 16u);
    uint width = (instance.widths & 65535u);
    uint dense_width = (instance.widths >> 16u);
    out_.dense_end = (instance.col + dense_width);
    float pix_x = (float(x0_) + (float(width) * x));
    uint _e31 = _group_0_binding_1_vs.strip_height;
    float pix_y = (float(y0_) + (y * float(_e31)));
    uint _e39 = _group_0_binding_1_vs.width;
    float ndc_x = (((pix_x * 2.0) / float(_e39)) - 1.0);
    uint _e48 = _group_0_binding_1_vs.height;
    float ndc_y = (1.0 - ((pix_y * 2.0) / float(_e48)));
    out_.position = vec4(ndc_x, ndc_y, 0.0, 1.0);
    uint _e65 = _group_0_binding_1_vs.strip_height;
    out_.tex_coord = vec2((float(instance.col) + (x * float(width))), (y * float(_e65)));
    out_.rgba_or_slot = instance.rgba_or_slot;
    VertexOutput _e71 = out_;
    _vs2fs_location0 = _e71.tex_coord;
    _vs2fs_location1 = _e71.dense_end;
    _vs2fs_location2 = _e71.rgba_or_slot;
    gl_Position = _e71.position;
    gl_Position.yz = vec2(-gl_Position.y, gl_Position.z * 2.0 - gl_Position.w);
    return;
}

"###;

    pub mod vertex {
        pub const CONFIG: &str = "Config_block_0Vertex";
    }
    pub const FRAGMENT_SOURCE: &str = r###"#version 300 es

precision highp float;
precision highp int;

struct Config {
    uint width;
    uint height;
    uint strip_height;
    uint alphas_tex_width_bits;
};
struct StripInstance {
    uint xy;
    uint widths;
    uint col;
    uint rgba_or_slot;
};
struct VertexOutput {
    vec2 tex_coord;
    uint dense_end;
    uint rgba_or_slot;
    vec4 position;
};
uniform Config_block_0Fragment { Config _group_0_binding_1_fs; };

uniform highp usampler2D _group_0_binding_0_fs;

uniform highp sampler2D _group_0_binding_2_fs;

smooth in vec2 _vs2fs_location0;
flat in uint _vs2fs_location1;
flat in uint _vs2fs_location2;
layout(location = 0) out vec4 _fs2p_location0;

uint unpack_alphas_from_channel(uvec4 rgba, uint channel_index) {
    switch(channel_index) {
        case 0u: {
            return rgba.x;
        }
        case 1u: {
            return rgba.y;
        }
        case 2u: {
            return rgba.z;
        }
        case 3u: {
            return rgba.w;
        }
        default: {
            return rgba.x;
        }
    }
}

vec4 unpack4x8unorm(uint rgba_packed) {
    return vec4((float(((rgba_packed >> 0u) & 255u)) / 255.0), (float(((rgba_packed >> 8u) & 255u)) / 255.0), (float(((rgba_packed >> 16u) & 255u)) / 255.0), (float(((rgba_packed >> 24u) & 255u)) / 255.0));
}

void main() {
    VertexOutput in_ = VertexOutput(_vs2fs_location0, _vs2fs_location1, _vs2fs_location2, gl_FragCoord);
    float alpha = 1.0;
    uint alphas_index = uint(floor(in_.tex_coord.x));
    if ((alphas_index < in_.dense_end)) {
        uint y = uint(floor(in_.tex_coord.y));
        uvec2 tex_dimensions = uvec2(textureSize(_group_0_binding_0_fs, 0).xy);
        uint alphas_tex_width = tex_dimensions.x;
        uint texel_index = (alphas_index / 4u);
        uint channel_index_1 = (alphas_index % 4u);
        uint tex_x = (texel_index & (alphas_tex_width - 1u));
        uint _e25 = _group_0_binding_1_fs.alphas_tex_width_bits;
        uint tex_y = (texel_index >> _e25);
        uvec4 rgba_values = texelFetch(_group_0_binding_0_fs, ivec2(uvec2(tex_x, tex_y)), 0);
        uint _e31 = unpack_alphas_from_channel(rgba_values, channel_index_1);
        alpha = (float(((_e31 >> (y * 8u)) & 255u)) * 0.003921569);
    }
    uint alpha_byte = (in_.rgba_or_slot >> 24u);
    if ((alpha_byte != 0u)) {
        float _e45 = alpha;
        vec4 _e47 = unpack4x8unorm(in_.rgba_or_slot);
        _fs2p_location0 = (_e45 * _e47);
        return;
    } else {
        uint clip_x = (uint(in_.position.x) & 255u);
        uint _e62 = _group_0_binding_1_fs.strip_height;
        uint clip_y = ((uint(in_.position.y) & 3u) + (in_.rgba_or_slot * _e62));
        vec4 clip_in_color = texelFetch(_group_0_binding_2_fs, ivec2(uvec2(clip_x, clip_y)), 0);
        float _e69 = alpha;
        _fs2p_location0 = (_e69 * clip_in_color);
        return;
    }
}

"###;
    pub mod fragment {
        pub const CONFIG: &str = "Config_block_0Fragment";
        pub const ALPHAS_TEXTURE: &str = "_group_0_binding_0_fs";
        pub const CLIP_INPUT_TEXTURE: &str = "_group_0_binding_2_fs";
    }
}
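
For readers who want to check the shader arithmetic on the CPU, the following sketch mirrors three pieces of the generated `render_strips` code: unpacking the 16-bit halves of `xy` and `widths`, `unpack4x8unorm`, and the pixel-to-NDC mapping. The function names `unpack_lo_hi` and `pixel_to_ndc` are made up for this example, and the final `gl_Position.yz` adjustment in the generated GLSL is deliberately not reproduced.

```rust
/// Splits a packed u32 into its low and high 16-bit halves, as the vertex
/// shader does for `instance.xy` (x0, y0) and `instance.widths`
/// (width, dense_width).
pub fn unpack_lo_hi(packed: u32) -> (u32, u32) {
    (packed & 0xFFFF, packed >> 16)
}

/// CPU equivalent of the shader's `unpack4x8unorm`: four 8-bit channels,
/// lowest byte first, each scaled to the range 0.0..=1.0.
pub fn unpack4x8unorm(rgba_packed: u32) -> [f32; 4] {
    [
        (rgba_packed & 0xFF) as f32 / 255.0,
        ((rgba_packed >> 8) & 0xFF) as f32 / 255.0,
        ((rgba_packed >> 16) & 0xFF) as f32 / 255.0,
        ((rgba_packed >> 24) & 0xFF) as f32 / 255.0,
    ]
}

/// Maps pixel coordinates to normalized device coordinates the same way
/// the vertex shader computes `ndc_x` and `ndc_y`: x grows to the right,
/// y is flipped so that pixel row 0 maps to +1.0.
pub fn pixel_to_ndc(pix_x: f32, pix_y: f32, width: u32, height: u32) -> (f32, f32) {
    let ndc_x = pix_x * 2.0 / width as f32 - 1.0;
    let ndc_y = 1.0 - pix_y * 2.0 / height as f32;
    (ndc_x, ndc_y)
}
```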

The generated code can then be imported with:
use vello_sparse_shaders::{clear_slots, render_strips};

vello_hybrid changes

  • A new render subdirectory has been added that contains:

    • common.rs: All the shared render logic.
    • wgpu.rs: The original renderer leveraging wgpu.
    • webgl.rs: The new WebGL native backend renderer.
  • The Scheduler has been made backend-agnostic by operating on a new RendererBackend trait. Both the wgpu and webgl renderer backends implement RendererBackend.
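
A minimal sketch of what a backend-agnostic scheduler can look like is given below. Only the names `Scheduler` and `RendererBackend` come from this PR; the `Strip` fields, the `render_strips` method, and `RecordingBackend` are assumptions made for illustration, and the real trait in vello_hybrid will differ.

```rust
/// Hypothetical draw command; the real vello_hybrid types differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Strip {
    pub x: u16,
    pub y: u16,
    pub width: u16,
}

/// Stand-in for the trait named in the PR. The single method here is an
/// assumption, not the actual trait surface.
pub trait RendererBackend {
    fn render_strips(&mut self, strips: &[Strip]);
}

/// Collects strips and hands them to whichever backend it is given.
#[derive(Default)]
pub struct Scheduler {
    pending: Vec<Strip>,
}

impl Scheduler {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn push(&mut self, strip: Strip) {
        self.pending.push(strip);
    }

    /// Submits all pending strips in one call, then clears the queue.
    /// Nothing is submitted when the queue is empty.
    pub fn flush<B: RendererBackend>(&mut self, backend: &mut B) {
        if self.pending.is_empty() {
            return;
        }
        backend.render_strips(&self.pending);
        self.pending.clear();
    }
}

/// Test double that records how many strips each call received.
#[derive(Default)]
pub struct RecordingBackend {
    pub calls: Vec<usize>,
}

impl RendererBackend for RecordingBackend {
    fn render_strips(&mut self, strips: &[Strip]) {
        self.calls.push(strips.len());
    }
}
```

Because `flush` is generic over the trait, the same scheduling code can drive a wgpu-based or a WebGL-based implementation without knowing which one it has.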

Feature flag changes

Feature flags in vello_hybrid are additive. By default the wgpu feature is enabled. If the compile target is wasm32 and the webgl feature is enabled on vello_hybrid, then the native WebGL renderer will be enabled.
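
The selection rule described above can be written down as a small truth table. The sketch below models it with plain booleans so that it runs on any target; the `Backend` enum and `available_backends` function are hypothetical, and the comment shows the kind of `cfg` attribute such gating would typically use rather than the exact attributes in vello_hybrid.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Wgpu,
    NativeWebGl,
}

/// Models the additive feature rule:
/// - the wgpu renderer is available whenever the `wgpu` feature is on;
/// - the native WebGL renderer is available only on wasm32 with the
///   `webgl` feature on.
///
/// In real code the same conditions would usually be expressed as
/// attributes, for example:
///   #[cfg(feature = "wgpu")]
///   #[cfg(all(target_arch = "wasm32", feature = "webgl"))]
pub fn available_backends(
    is_wasm32: bool,
    wgpu_feature: bool,
    webgl_feature: bool,
) -> Vec<Backend> {
    let mut backends = Vec::new();
    if wgpu_feature {
        backends.push(Backend::Wgpu);
    }
    if is_wasm32 && webgl_feature {
        backends.push(Backend::NativeWebGl);
    }
    backends
}
```

Under this model, a wasm32 build that disables the default `wgpu` feature and enables `webgl` ends up with only the native WebGL renderer.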

Warnings

A runtime warning has been added. It triggers once, when either renderer is instantiated, if both of the following are true:

  • wgpu with its WebGL backend is active.
  • The WebGlRenderer is also active.

The warning is:

Both WebGL and wgpu with the "webgl" feature are enabled.
For optimal performance and binary size on web targets, use only the dedicated WebGL renderer.
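
One common way to get "warn only once" behaviour is a process-wide atomic flag. The sketch below assumes that approach; `warn_if_both_active` and the `WARNED` flag are hypothetical names, and the function returns the message instead of logging it so that the caller can route it to whatever logging facility is in use.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Set to true after the warning has been produced once.
static WARNED: AtomicBool = AtomicBool::new(false);

/// The warning text quoted in the PR description.
pub const WARNING: &str = "Both WebGL and wgpu with the \"webgl\" feature are enabled.\n\
For optimal performance and binary size on web targets, use only the dedicated WebGL renderer.";

/// Returns the warning the first time both renderers are active, and
/// `None` on every other call. Intended to be called from each renderer's
/// constructor (an assumption for this sketch).
pub fn warn_if_both_active(
    wgpu_webgl_active: bool,
    native_webgl_active: bool,
) -> Option<&'static str> {
    // `swap` returns the previous value, so only the first caller that
    // reaches it with both renderers active sees `false`.
    if wgpu_webgl_active && native_webgl_active && !WARNED.swap(true, Ordering::Relaxed) {
        Some(WARNING)
    } else {
        None
    }
}
```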

Screen recording

Note

The screen recording below is slightly stale – I've since changed the background to be dark so the white text scene can be read.

[Screen recording: webgl_native]

Left side is native_webgl example (using native WebGL2)
Right side is the existing webgl example which uses wgpu with the webgl feature flag.

Test plan

To scope down this PR, there are no automated tests for the renderer except for the single browser test introduced in the example. The shader compilation has some unit tests.

This PR was manually tested via the new native webgl example: cargo run_wasm -p native_webgl. This example can be tested against the original cargo run_wasm -p wgpu_webgl.

Risks

The only risk I'm uncertain about is the addition of the wgpu feature flag, which is a default feature. Could this be a breaking change for users who specify "no default features"? They would have to add the wgpu feature explicitly. This seems minor.

Followup work

This PR is huge because it implements all the existing vello_hybrid features in the WebGL backend. It also includes build-time shader compilation. To keep this change from becoming completely impenetrable, I'm splitting the test infrastructure into a separate change. This PR must be manually tested in the interim.

The example has been added to CI so that it must compile and run.

@ajakubowicz-canva ajakubowicz-canva force-pushed the ajakubowicz-webgl-refinement branch from feb47e7 to dbcd9cd Compare May 15, 2025 05:45
@ajakubowicz-canva ajakubowicz-canva force-pushed the ajakubowicz-webgl-refinement branch 20 times, most recently from 59e824d to eb2c76a Compare May 16, 2025 01:25
@ajakubowicz-canva ajakubowicz-canva marked this pull request as ready for review May 16, 2025 02:03
@ajakubowicz-canva ajakubowicz-canva requested a review from taj-p May 16, 2025 02:03
@taj-p taj-p self-requested a review May 16, 2025 06:11

@taj-p taj-p left a comment

Initial comments on other parts of code. I ran the demos (native webgl, webgl, and winit). They look great 🥳

@ajakubowicz-canva ajakubowicz-canva force-pushed the ajakubowicz-webgl-refinement branch 2 times, most recently from 71c9a45 to a3f6cf3 Compare May 19, 2025 02:03
@ajakubowicz-canva ajakubowicz-canva force-pushed the ajakubowicz-webgl-refinement branch 3 times, most recently from 987f7de to ef903a9 Compare May 19, 2025 03:31
@ajakubowicz-canva

Note: the cargo doc CI issue is related to a recent nightly toolchain release. I have a separate mitigation PR: #1014

@ajakubowicz-canva ajakubowicz-canva requested a review from taj-p May 19, 2025 05:42

@taj-p taj-p left a comment

LGTM - Amazing work 🚀

@ajakubowicz-canva ajakubowicz-canva force-pushed the ajakubowicz-webgl-refinement branch from 27f98bf to 25ea383 Compare May 20, 2025 00:15
 - Add warning if both webgl features are active.
 - Add todos
 - Add new example to CI.
@ajakubowicz-canva ajakubowicz-canva changed the title vello_hybrid: Add native WebGL backend vello_hybrid: add native WebGL backend May 20, 2025
@ajakubowicz-canva ajakubowicz-canva added this pull request to the merge queue May 20, 2025
Merged via the queue into main with commit 04ac809 May 20, 2025
17 checks passed
@ajakubowicz-canva ajakubowicz-canva deleted the ajakubowicz-webgl-refinement branch May 20, 2025 05:36
@linebender linebender deleted a comment from Crimitorii May 20, 2025
github-merge-queue bot pushed a commit that referenced this pull request May 20, 2025
…1016)

# Context

Addresses comment
#1011 (comment)
from @DJMcNab . Replaces `include_str!` with a direct import from
`vello_sparse_shaders` to get access to the wgsl source code.

### Changes

- Adds a `wgsl` module to the build-time module generated by
`vello_sparse_shaders`.

This allows `render_strips.wgsl` to be accessed via:
`vello_sparse_shaders::wgsl::RENDER_STRIPS` instead of a cross-package
`include_str!`.

- Adds a `glsl` feature flag to `vello_sparse_shaders` so the default
compiled module does not include `glsl` unless the `glsl` feature is
used. This makes builds much leaner for wgsl without glsl.

### Test plan

Reviewed generated code, and tested both examples manually:
 - `cargo run_wasm -p wgpu_webgl --release --port 8001`
 - `cargo run_wasm -p native_webgl --release --port 8000`

Finally, CI is very thorough.