-
Notifications
You must be signed in to change notification settings - Fork 370
feat: create trait definitions for model and streamable model #833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
parzivale
wants to merge
14
commits into
cloudflare:main
Choose a base branch
from
parzivale:ai-traits
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 13 commits
Commits
Show all changes
14 commits
Select commit
Hold shift + click to select a range
8b9022f
Added inital support for AI traits
parzivale d61679d
Merge remote-tracking branch 'upstream/main' into ai-traits
parzivale 3607ee4
added test case for new ai feature
parzivale ac6f9fa
Merge remote-tracking branch 'upstream/main' into ai-traits
parzivale 0a1989f
feat: added ai model input and output structs
parzivale 442eff7
fix/feat: moved to js backed types rather than rust structs
parzivale 5654e9a
feat: added rolescoped chat interface
parzivale baecae1
tweak: moved example to use worker type
parzivale 4080689
fix: potentially fixes ci errors, accidental extra slash
parzivale 85d2b86
Merge remote-tracking branch 'upstream/main' into ai-traits
parzivale 1ce8ff3
feat: moved to typed array for chat input
parzivale 5b5310f
fix: cleaned up unused modules now that the RoleScopedChatInputArray …
parzivale af555fb
fix: moved away from using reference to self in associated methods
parzivale 409b221
feat: added typed array builder
parzivale File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| use worker::{ | ||
| models::llama_4_scout_17b_16e_instruct::Llama4Scout17b16eInstruct, | ||
| worker_sys::AiTextGenerationInput, Env, Request, Response, Result, | ||
| }; | ||
|
|
||
| use crate::SomeSharedData; | ||
|
|
||
| const AI_TEST: &str = "AI_TEST"; | ||
|
|
||
| #[worker::send] | ||
| pub async fn simple_ai_text_generation( | ||
| _: Request, | ||
| env: Env, | ||
| _data: SomeSharedData, | ||
| ) -> Result<Response> { | ||
| let ai = env | ||
| .ai(AI_TEST)? | ||
| .run::<Llama4Scout17b16eInstruct>( | ||
| AiTextGenerationInput::new() | ||
| .set_prompt("What is the answer to life the universe and everything?"), | ||
| ) | ||
| .await?; | ||
| Response::ok(ai.get_response().unwrap_or_default()) | ||
| } | ||
|
|
||
| #[worker::send] | ||
| pub async fn streaming_ai_text_generation( | ||
| _: Request, | ||
| env: Env, | ||
| _data: SomeSharedData, | ||
| ) -> Result<Response> { | ||
| let stream = env | ||
| .ai(AI_TEST)? | ||
| .run_streaming::<Llama4Scout17b16eInstruct>( | ||
| AiTextGenerationInput::new() | ||
| .set_prompt("What is the answer to life the universe and everything?"), | ||
| ) | ||
| .await?; | ||
|
|
||
| Response::from_stream(stream) | ||
| } |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| import { describe, expect, test } from "vitest"; | ||
| import { mf, mfUrl } from "./mf"; | ||
|
|
||
| async function runTest() { | ||
| let normal_response = await mf.dispatchFetch(`${mfUrl}ai`); | ||
| expect(normal_response.status).toBe(200); | ||
|
|
||
| let streaming_response = await mf.dispatchFetch(`${mfUrl}ai/streaming`); | ||
| expect(streaming_response.status).toBe(200); | ||
| } | ||
| describe("ai", runTest); |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| pub mod typed_array; |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than putting these on the array, let's put these on the message builder itself.
So we would rather use directly:
We may yet go either way on array generics - wrapper types or not. I don't want to assume one direction yet though. We definitely want builders at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually a generic array builder would be a useful pattern I suppose, but the trick with this would be constructing a builder API that can in theory support optimized creation of the whole array in a single buffer serialization (as an optimization, we don't actually have to do it yet).
That is, something more like:
where the
push_buildermethod onTypedArray<T>is works forT impl Builderfor some definition of a builder trait?Or something along those lines. Let me know if that make sense where I'm trying to go with this?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll create a generic array builder. I'm a little confused with the single buffer serialization optimization as the items in the array are jsvalues and are not serializeable
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also currently the generic array isn't really a generic array just a helper macro which generates an array with typed helper methods, I can refactor to make a proper generic array. Though that might come with complications as I would need to cast every time an access is made rather than just use wasm-bindgen to define the return type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In due course we will be getting a generic array upstream, just using that term here for now. I understand your macro is a custom type and I'm fine with this approach unflagged with the assumption that the change won't be too destructive down the line.
If we could do a generic array that would in theory be preferable (
TypedArray<T>wrapping Array), but I've set enough requirements on this PR here I feel! Separately I would be interested to dig into your cast concern more as I didn't follow that, and it would help the upstream conversation there.As for the optimization - the idea is that every call from Wasm to JS incurs a performance overhead. If we can batch those calls through an optimization as a single call that should be more performant, turning an array of N values from N calls out to JS to 1. Again not a hard constraint, just sharing the design space thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically just that a Rust builder could theoretically implement such optimizations, but we don't actually need to do them now of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the casting concerns that was due to me misunderstanding what you wanted and assuming you wanted a new array wrapper which looked like this
Which wrapped all the associated methods on array with an extra casting step to T.
Regarding the optimization, I see what you mean I got a little confused due to the serialized wording and assumed you wanted something like a serde_wasm_bindgen call as part of the array builder build call.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed an implementation of this, I tried to keep the bulk move to js in mind when constructing the builder, the push functionality to the array builder should probably have some refinement but I wanted to get it in front of your eyes first. Finally I would like to move the typedArray macro over to a proc macro before merging as I'm reaching for dirtier and dirtier hacks to avoid polluting the name space (that's why the exported typed array is referenced as RoleScopedChatInputArray::RoleScopedChatInputArray