Enable easy AI workflows

When experimenting with AI use cases and comparing Panel to dedicated frameworks like Gradio, I can see that in some use cases we simply have to spend too much time on building data apps. We spend time on

- understanding how to transform the input provided by Panel widgets into something that is ready for our model 
- understanding how to transform the output of our model into something that is possible to display with Panel
- figuring out that there is actually not a widget or pane supporting our use case

I believe this is the case for any domain, not just AI. But in AI there are dedicated frameworks that make it very clear where our weak points are. An improvement in the AI domain would benefit any domain.

## Examples

For example

- I've just improved the `Audio` pane so that it supports Torch tensors and more dtypes. This makes things much easier for users because their data format is supported out of the box. It's a bit similar to `hvplot`, where we support many data formats out of the box. We could also just tell our users to figure out how to convert their data to NumPy arrays before using hvplot. But we don't. We want to make things easy.
- The `FileInput` widget is there, but it does not look great and is not an easy-to-use drag-and-drop area. On top of that, we should make it much, much easier for our users to get the uploaded file converted to a `text`, `dataframe`, `audio`, `video` or similar object ready for use. Right now our users have to spend a lot of time on this task, while it should not be their focus. Their focus should be on putting together the input widgets and output panes into a nice layout and then experimenting with their model.
- Our users cannot easily experiment with speech recognition because we don't have an `AudioRecorder` widget they can use to easily record spoken text.

## Solution

Below I will develop an overview of strong and weak points.

## File Input Widget

- [ ] We need a better-looking drag-and-drop `FileInput` widget. A widget that signals you have to `click` and `upload` is simply old school, and the workflow is too slow for AI experimentation.
- [ ] Provide easy-to-use functionality to output to the most used formats. If the user uploads a text file, it should be easy to get the text. If the user uploads a CSV file, it should be very simple to get the `DataFrame`. The same goes for any other format like audio, video etc. These could be methods on the `FileInput`. Besides dedicated methods like `.to_text`, `.to_dataframe`, `.to_audio` and `.to_video`, we should also have a magic method `.to_object` that just outputs the best guess of something that can be displayed correctly with `pn.panel`. This would make it easy to build more general AI apps where the user can provide many types of input to the AI model or AI agent.
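
To make the second item concrete, here is a minimal sketch of what a best-guess conversion could look like. It is written as a standalone helper rather than a method, and `to_object` and `TEXT_SUFFIXES` are hypothetical names, not part of Panel today. It assumes a single uploaded file, UTF-8 encoded text, and that dispatching on the file extension is good enough; `value` and `filename` correspond to the existing `FileInput.value` (bytes) and `FileInput.filename` parameters.

```python
import io
from pathlib import Path

# Hypothetical set of extensions treated as plain text (an assumption).
TEXT_SUFFIXES = {".txt", ".md", ".json"}


def to_object(value: bytes, filename: str):
    """Return a best-guess Python object for an uploaded file.

    Hypothetical helper: this is not an existing Panel API.
    """
    suffix = Path(filename).suffix.lower()
    if suffix in TEXT_SUFFIXES:
        # Assumes the upload is UTF-8 encoded.
        return value.decode("utf-8")
    if suffix == ".csv":
        # pandas is imported lazily so text uploads work without it.
        import pandas as pd

        return pd.read_csv(io.BytesIO(value))
    # Unknown format: hand back the raw bytes unchanged.
    return value


# Intended usage with an existing widget (single-file upload):
#   file_input = pn.widgets.FileInput()
#   obj = to_object(file_input.value, file_input.filename)
```

Audio and video branches are left out because the right target object for them is exactly what this issue needs to decide.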

An alternative to a do-it-all `FileInput` widget could be dedicated input widgets like `TabularInput`, `AudioInput` and `VideoInput` onto which the user can drag and drop their file for quick experimentation. Gradio does this. For its `Audio` widget the source can be either `upload` or `microphone`.

## Inputs

| domain |  done | existing widgets | missing widgets | comment |
| ---------|--------|-------------------|-------------------|------------|
| audio    |  [ ]       | `FileInput`          | `AudioRecorder`   |                  | 
| image   |  [ ]       | `FileInput`          | `CanvasDraw`/ `Paint` |           |
| tabular  |  [ ]       | `FileInput`, `Tabulator` | | It should be easier to get the `DataFrame` from the `FileInput`. |
| video    |  [ ]       | `FileInput`, `VideoStream` | `VideoRecorder`   | It is not clear to me if it is possible to use the `VideoStream` as a video recorder |

MORE IS COMING

## Outputs

COMING UP

## Examples

We should systematically provide examples of simple input-output style apps for the most used domains. For example `text-to-speech`, `image-to-text`, ..., `text-to-image`, `speech-to-text` etc.
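
As a rough idea of the shape such an example could take, here is a minimal text-to-text sketch. `fake_model` and `build_app` are hypothetical names, and `fake_model` is only a stand-in (it upper-cases the input) for a real model call; the Panel calls used are the existing `pn.widgets.TextInput`, `pn.bind` and `pn.Column`.

```python
def fake_model(text: str) -> str:
    """Stand-in for a real model (an assumption): upper-cases the input."""
    return text.upper()


def build_app():
    """Build a one-input, one-output app around `fake_model`."""
    # Imported lazily so the stand-in model can be used without Panel.
    import panel as pn

    pn.extension()
    text_input = pn.widgets.TextInput(name="Input", placeholder="Type something")
    # pn.bind re-runs fake_model whenever the widget value changes;
    # the Column renders the bound function's return value.
    return pn.Column(text_input, pn.bind(fake_model, text_input))


# To serve, put this in a script run with `panel serve`:
#   build_app().servable()
```

The other domains would follow the same pattern, with the input widget and the output pane swapped for the ones listed in the tables above.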

## Additional Context

I started working on the above in 2022 in the separate project https://github.com/marcskovmadsen/paithon. But I got distracted by other things.

I'm also working on the [transformers agent ui](https://github.com/awesome-panel/transformers-agent-ui). An app like that will also be a good benchmark for how well Panel supports AI workflows, as in principle users should be able to work with any input and output media.

Enable easy AI workflows #4861
