Classification & text extraction with LLM and genai in MediaPipe

### MediaPipe Solution (you are using)

genai

### Programming language

TypeScript

### Are you willing to contribute it

Yes

### Describe the feature and the current behaviour/state

Use LLMs for more than text generation by supporting classification and text extraction use cases.

Currently, only plain text generation is supported.

### Will this change the current API? How?

_No response_

### Who will benefit with this feature?

_No response_

### Please specify the use cases for this feature

It would be great if the MediaPipe API could be extended so that LLMs (such as Gemma 3) can be used for tasks beyond text generation, especially models that are fine-tuned to follow instructions (e.g. `gemma-3-270m-it`).

Rough changes needed to support the text extraction and classification use cases:

- Text extraction: Enable structured output via a supplied JSON schema.
- Classification: Expose the probabilities of the next token. See https://ai.google.dev/gemma/docs/agile_classifiers for more details.
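
To illustrate the classification item, here is a minimal sketch of how next-token scores could be turned into class probabilities. It assumes the API returned raw logits keyed by token string, which MediaPipe does not expose today. The `TokenLogits` type, the `classifyFromLogits` function, and the example values are all hypothetical.

```typescript
// Hypothetical input: raw next-token logits keyed by token string.
// This is an assumption about what the requested feature could return;
// it is not part of the current MediaPipe API.
type TokenLogits = Record<string, number>;

// Softmax restricted to the candidate label tokens, so the resulting
// probabilities sum to 1 across the labels only.
function classifyFromLogits(
  logits: TokenLogits,
  labels: string[],
): Record<string, number> {
  const pairs = labels.map((label) => {
    const logit = logits[label];
    if (logit === undefined) {
      throw new Error(`No logit for label token "${label}"`);
    }
    return { label, logit };
  });

  // Subtract the maximum before exponentiating for numerical stability.
  const max = Math.max(...pairs.map((p) => p.logit));
  const total = pairs.reduce((sum, p) => sum + Math.exp(p.logit - max), 0);

  const result: Record<string, number> = {};
  for (const p of pairs) {
    result[p.label] = Math.exp(p.logit - max) / total;
  }
  return result;
}

// Made-up logits for a yes/no question; "maybe" is ignored because it is
// not one of the candidate labels.
const probs = classifyFromLogits(
  { yes: 2.0, no: 0.0, maybe: -1.0 },
  ["yes", "no"],
);
console.log(probs);
```

Normalizing over the label tokens only keeps the scores comparable between prompts, at the cost of hiding how much probability mass the model put on tokens outside the label set.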

This is especially useful because one LLM can serve multiple tasks, so an edge device does not have to load a separate model for each task. This matters most in a web context, where resources are scarce.
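
Until schema-constrained output exists, the text extraction use case listed above can be approximated with the same text-generation model: prompt for JSON and validate the result. The sketch below assumes a generic `Generate` function (for example, a thin wrapper around the existing `generateResponse` call) and a deliberately tiny, hypothetical `FlatSchema` type instead of full JSON Schema. All names are illustrative.

```typescript
// Minimal, hypothetical subset of JSON Schema: flat objects with primitive fields.
type FieldType = "string" | "number" | "boolean";
interface FlatSchema {
  properties: Record<string, FieldType>;
  required: string[];
}

// Hypothetical stand-in for any prompt-to-text generator.
type Generate = (prompt: string) => Promise<string>;

function buildExtractionPrompt(text: string, schema: FlatSchema): string {
  return (
    "Extract the following fields as a single JSON object.\n" +
    `Fields: ${JSON.stringify(schema.properties)}\n` +
    `Text: ${text}\nJSON:`
  );
}

// Parses the model output and checks it against the schema.
// Throws if the output is not a JSON object or a field is missing or mistyped.
function parseAgainstSchema(
  raw: string,
  schema: FlatSchema,
): Record<string, unknown> {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Model output is not a JSON object");
  }
  const obj = parsed as Record<string, unknown>;
  for (const key of schema.required) {
    if (!(key in obj)) {
      throw new Error(`Missing required field "${key}"`);
    }
  }
  for (const key of Object.keys(schema.properties)) {
    if (key in obj && typeof obj[key] !== schema.properties[key]) {
      throw new Error(`Field "${key}" has the wrong type`);
    }
  }
  return obj;
}

async function extract(
  generate: Generate,
  text: string,
  schema: FlatSchema,
): Promise<Record<string, unknown>> {
  const raw = await generate(buildExtractionPrompt(text, schema));
  return parseAgainstSchema(raw, schema);
}

// Usage with a fake generator that returns a fixed answer.
const personSchema: FlatSchema = {
  properties: { name: "string", age: "number" },
  required: ["name", "age"],
};
const fakeGenerate: Generate = async () => '{"name": "Ada", "age": 36}';
extract(fakeGenerate, "Ada is 36 years old.", personSchema).then((result) =>
  console.log(result),
);
```

Validation after the fact only detects malformed output. It cannot prevent it, which is why schema-constrained decoding inside the library would be the more robust solution.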

### Any Other info

_No response_

Classification & text extraction with LLM and genai in MediaPipe #6194