Add Media Plugin Support (GIFs, WebDAV, etc.) #2500

Open
mozzwald wants to merge 1 commit into HeliBorg:main from mozzwald:media-intent

@mozzwald

This PR adds a generic media plugin system to HeliBoard.

The goal is to allow optional media sources (GIF search, WebDAV/Nextcloud browsers, etc.) without requiring internet permission or bundled online services inside the keyboard itself.

The keyboard stays offline-only by default, while separate plugin apps handle network access, searching, downloading, authentication, and media browsing. HeliBoard can now discover external apps that expose themselves as media providers using the new plugin intent/API.

Plugins provide:

  • media search/browsing UI data
  • thumbnails/previews
  • content URIs for selected media
  • metadata/mime types

HeliBoard handles:

  • displaying the picker inside the keyboard UI
  • commitContent() into the target app when supported
  • fallback sharing behavior for apps that don't properly support rich content insertion

Keeping the picker inside the IME was important because launching external Activities caused focus loss in apps, which breaks media insertion. To open the picker, go to the emoji keyboard and tap the new image/media icon on the far right.
Screenshot_20260517-120244

Media plugin support must be explicitly enabled by the user in Settings >> Preferences under Emoji (this seemed a logical fit since the icon is on the emoji toolbar). After installing a plugin, the user must also enable it under "Manage media plugins". The Public Media fallback option was added because I'm using GrapheneOS, and its default SMS/MMS app currently does not accept private content URIs (more info). When enabled, this fallback creates a temporary MediaStore image that any app can access. Since this makes the media accessible to other apps on the device, the feature is optional and disabled by default.
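
The two-level opt-in described above (a global Media Plugins setting plus per-plugin approval) can be sketched roughly as follows. This is a minimal plain-Java illustration, not the PR's actual code: the class name `PluginGate`, the method `usableProviders`, and the example package names are all hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch of the opt-in gating; names are assumptions, not HeliBoard code. */
final class PluginGate {

    /**
     * Returns the provider packages the picker may talk to.
     *
     * @param mediaPluginsEnabled the global "Media Plugins" setting (off by default)
     * @param discovered          packages found via provider discovery
     * @param userApproved        packages the user enabled under "Manage media plugins"
     */
    static Set<String> usableProviders(boolean mediaPluginsEnabled,
                                       Set<String> discovered,
                                       Set<String> userApproved) {
        Set<String> usable = new LinkedHashSet<>();
        if (!mediaPluginsEnabled) {
            return usable; // global switch off: no plugin is contacted at all
        }
        for (String pkg : discovered) {
            if (userApproved.contains(pkg)) {
                usable.add(pkg); // installed AND explicitly approved
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        Set<String> discovered = Set.of("org.example.gifplugin", "org.example.webdavplugin");
        Set<String> approved = Set.of("org.example.gifplugin");
        System.out.println(usableProviders(true, discovered, approved));
        System.out.println(usableProviders(false, discovered, approved));
    }
}
```

With this shape, an installed but unapproved plugin is never bound, and turning the global setting off disables every media entry point at once.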

Screenshot_20260517-120307

I also put together two proof-of-concept plugins for testing:

Screenshot_20260517-115654 Screenshot_20260517-115606

And the keyboard branch itself:

The WebDAV plugin was mainly built as a proof of concept to show that the API works for more than just GIF search.

This is likely draft-PR territory, but the overall architecture is working well and I wanted the maintainers to take a look to make sure it's the right direction.

Tested so far with:

  • GIF insertion (Klipy, Tenor)
  • image insertion from remote WebDAV storage (Nextcloud instance)
  • apps supporting commitContent()
  • fallback send/share behavior for less compatible apps (GrapheneOS SMS/MMS app)

Would appreciate feedback on:

  • API shape
  • naming
  • discovery flow
  • security model around plugin trust/approval

Introduce a generic AIDL media provider contract so external plugins can supply image and video media without adding network permissions to HeliBoard. Add provider discovery, search/browse support, thumbnail preview handling, and a hosted in-keyboard picker so IME focus is preserved for commitContent insertion.

Add media insertion plumbing with MIME and size validation, rich-content commitContent support, and ACTION_SEND fallbacks for apps that do not accept rich content. Keep public MediaStore export as an explicit opt-in final fallback for targets that reject private content URIs, while skipping private URI attempts for known-bad targets such as AOSP/Graphene Messaging.
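
The fallback order described in this commit message could look roughly like the sketch below. It is plain Java with hypothetical names (`InsertionPlanner`, `Step`, `plan`). It assumes that known-bad targets skip both `commitContent()` and the private-URI send, since both hand over a private content URI. On Android, the editor's accepted types would come from `EditorInfo.contentMimeTypes`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the insertion fallback order; not the PR's actual code. */
final class InsertionPlanner {

    enum Step { COMMIT_CONTENT, SEND_PRIVATE_URI, SEND_PUBLIC_MEDIASTORE }

    /**
     * @param editorMimeTypes       MIME patterns the focused editor accepts (may be empty)
     * @param mediaMimeType         MIME type of the selected media, e.g. "image/gif"
     * @param rejectsPrivateUris    true for targets known to reject private content URIs
     * @param publicFallbackEnabled the opt-in public MediaStore fallback setting
     * @return the steps to try, in order; empty if nothing is allowed
     */
    static List<Step> plan(List<String> editorMimeTypes, String mediaMimeType,
                           boolean rejectsPrivateUris, boolean publicFallbackEnabled) {
        List<Step> steps = new ArrayList<>();
        if (!rejectsPrivateUris) {
            if (accepts(editorMimeTypes, mediaMimeType)) {
                steps.add(Step.COMMIT_CONTENT);   // rich content insertion
            }
            steps.add(Step.SEND_PRIVATE_URI);     // ACTION_SEND with a private URI
        }
        if (publicFallbackEnabled) {
            steps.add(Step.SEND_PUBLIC_MEDIASTORE); // explicit opt-in, last resort
        }
        return steps;
    }

    /** Matches exact types and wildcard patterns such as "image/*" or the catch-all pattern. */
    static boolean accepts(List<String> patterns, String mimeType) {
        for (String p : patterns) {
            if (p.equals("*/*") || p.equals(mimeType)) {
                return true;
            }
            if (p.endsWith("/*") && mimeType.startsWith(p.substring(0, p.length() - 1))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(plan(List.of("image/*"), "image/gif", false, false));
        System.out.println(plan(List.of(), "image/gif", true, true));
    }
}
```

Returning an ordered list keeps the policy separate from the Android calls that would carry out each step, which makes the order easy to review.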

Gate all media entry points behind the Media Plugins setting, default media plugins and public MediaStore fallback to off, expose minimal settings, and add only the emoji-toolbar share entry point. HeliBoard remains permission-neutral with no internet or storage/media permissions added.
@Helium314

Collaborator

I have to say sorry, but it's going to take quite a while for me to review this (I don't have enough time for HeliBoard, and the gesture typing project is taking most of it).
So there is no generic way of doing this, but you need plugins specifically for HeliBoard? Do you know of other keyboards using a similar system?

@mozzwald

Author

I have to say sorry, but it's going to take quite a while for me to review this (I don't have enough time for HeliBoard, and the gesture typing project is taking most of it).

Understood

So there is no generic way of doing this, but you need plugins specifically for HeliBoard? Do you know of other keyboards using a similar system?

The plugin API here is HeliBoard-specific but is intentionally small to prevent HeliBoard from needing any more permissions. This plugin was designed and restricted to work with HeliBoard, but it certainly could be adapted for use with other apps/keyboards if they supported it.

I do not know of another open-source Android keyboard with the same or a similar media-provider plugin API. AnySoftKeyboard has an external add-on model, but that is mainly for language packs, layouts, dictionaries, and themes rather than GIF/image/media providers. FlorisBoard mentions "Integrated extension support", but it's still evolving and I could only find examples of theme add-ons.

@Helium314

Collaborator

This plugin was designed and restricted to work with HeliBoard, but it certainly could be adapted for use with other apps/keyboards if they supported it.

What do you mean by restricted to work with HeliBoard? Does it require a specific package name, or the HeliBoard file provider?
I think it would be good if it weren't necessarily hard-wired to HeliBoard, but I don't have experience with communication between apps...
By the way, I noticed you use AIDL files. How does an app work with them? I'm asking because it was also mentioned in #1232, and the "everything you need" link to such files is absolutely useless for me.

It would be great if FlorisBoard would at least consider supporting this. Would it require a lot of work on your side to make the plugins compatible, or would this already be possible?

Keeping the picker inside the IME was important because launching external Activities caused focus loss in apps, which breaks media insertion. To open the picker, go to the emoji keyboard and tap the new image/media icon on the far right.

I remember we also had some discussion of inside-IME vs activity for the emoji search, with both not being ideal...
I'll try to give you feedback as requested, but it might come in slowly... anything specific you'd like me to start with?

@mozzwald

Author

The current HeliBoard-specific restriction is mostly intentional while testing, not something I consider a permanent design requirement. The example plugins are currently targeted at HeliBoard’s draft contract because I needed something concrete to build against. Some package/action names and the copied AIDL files are HeliBoard-specific right now, but the overall idea doesn't have to stay that way.

It may be worth making this a more generic media-provider contract that other keyboards could support too. I did post a link to the PR on a FlorisBoard GIF-related issue, and at least one person there thought it was a good idea. I don't think that was a developer response, though.

My understanding of AIDL is that it is Android's typed IPC layer. In this case, the AIDL file defines the service methods HeliBoard can call on a plugin, such as capability discovery, search/browse, and requesting media content. The method signatures use Bundles, and then the plugin contract defines which keys and values are expected inside those Bundles.
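
To make the "typed methods, Bundle payloads" idea concrete, here is a plain-Java stand-in that uses `Map<String, Object>` where the real contract would use `android.os.Bundle` over AIDL. Only the method name `discoverCapabilities` appears in this thread. The other names, the key strings, and the example URIs are hypothetical, and the in-process `FakeProvider` replaces what would really be a bound service in a separate plugin app.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a Bundle-keyed provider contract; not the PR's AIDL. */
final class MediaContractSketch {

    // The contract documents which keys each Bundle carries (hypothetical keys).
    static final String KEY_SUPPORTS_SEARCH = "supports_search";
    static final String KEY_QUERY = "query";
    static final String KEY_RESULT_URIS = "result_uris";

    /** Methods the host (keyboard) can call on a plugin. */
    interface MediaProvider {
        Map<String, Object> discoverCapabilities();
        Map<String, Object> search(Map<String, Object> request);
    }

    /** Toy plugin that "searches" a fixed list of content URIs. */
    static final class FakeProvider implements MediaProvider {
        private final List<String> uris = List.of(
                "content://example.plugin/media/cat.gif",
                "content://example.plugin/media/dog.gif");

        @Override
        public Map<String, Object> discoverCapabilities() {
            Map<String, Object> caps = new HashMap<>();
            caps.put(KEY_SUPPORTS_SEARCH, true);
            return caps;
        }

        @Override
        public Map<String, Object> search(Map<String, Object> request) {
            String query = (String) request.getOrDefault(KEY_QUERY, "");
            List<String> matches = new ArrayList<>();
            for (String uri : uris) {
                if (uri.contains(query)) {
                    matches.add(uri);
                }
            }
            Map<String, Object> reply = new HashMap<>();
            reply.put(KEY_RESULT_URIS, matches);
            return reply;
        }
    }

    /** Host side: check capabilities first, then build the request payload. */
    @SuppressWarnings("unchecked")
    static List<String> hostSearch(MediaProvider provider, String query) {
        if (!Boolean.TRUE.equals(provider.discoverCapabilities().get(KEY_SUPPORTS_SEARCH))) {
            return List.of();
        }
        Map<String, Object> request = new HashMap<>();
        request.put(KEY_QUERY, query);
        return (List<String>) provider.search(request).get(KEY_RESULT_URIS);
    }

    public static void main(String[] args) {
        System.out.println(hostSearch(new FakeProvider(), "cat"));
    }
}
```

Because the method signatures stay fixed and only the documented keys change, new optional fields can be added without regenerating the interface on either side.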

If this were generalized for multiple keyboards, the contract name/action should probably be neutral instead of HeliBoard-specific, and it should include some kind of API version so features can be added later without breaking older hosts or plugins. So basically, HeliBoard, FlorisBoard, or any other keyboard would need to agree on the same contract shape and version.
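
One possible way to handle the API version, sketched in plain Java: each side advertises the range of contract versions it supports, and the host picks the highest version both understand. The class and method names are hypothetical, and the thread does not settle on this scheme.

```java
/** Hypothetical version negotiation between a keyboard (host) and a media plugin. */
final class ApiVersionSketch {

    /** Returned when the two version ranges do not overlap. */
    static final int INCOMPATIBLE = -1;

    /**
     * Picks the highest contract version supported by both sides.
     *
     * @return the agreed version, or INCOMPATIBLE if the ranges do not overlap
     */
    static int negotiate(int hostMin, int hostMax, int pluginMin, int pluginMax) {
        int low = Math.max(hostMin, pluginMin);
        int high = Math.min(hostMax, pluginMax);
        return low <= high ? high : INCOMPATIBLE;
    }

    public static void main(String[] args) {
        // A newer host talking to an older plugin falls back to the plugin's version.
        System.out.println(negotiate(1, 3, 1, 2));
        // A plugin that requires a newer contract than the host offers is rejected.
        System.out.println(negotiate(1, 1, 2, 3));
    }
}
```

The plugin's range could be reported through the capability-discovery call, so an incompatible plugin can be hidden in the settings screen instead of failing later.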

Keeping the picker inside the IME does add extra UI/code on the keyboard side, so I understand that tradeoff. The reason I went that route is that launching an external picker Activity was much cleaner for the plugin, but it caused focus loss in some target apps and made insertion unreliable. If another keyboard supported the same plugin contract, it would still need to provide its own in-IME picker UI so the target editor keeps focus and commitContent() can work reliably.

I'm not married to this exact architecture. I started it because I personally wanted GIF/WebDAV media support, but I also wanted to avoid adding internet permission or bundled network services to HeliBoard itself. That seemed like a non-starter for upstream, so this was my attempt at keeping the keyboard offline while still allowing optional media sources through separate apps.

To be clear, I see this PR more as a working prototype/proposal than me trying to define and own a permanent media-plugin standard. I built the GIF and WebDAV plugins mainly to prove the direction works. If HeliBoard or other keyboards are interested in this approach, I’d expect the API shape, naming, versioning, and long-term contract details to be refined by the projects that would actually support it.

For review, maybe the best place to start is the concept/API shape rather than all the implementation details: whether AIDL seems acceptable, whether the search/browse/content methods make sense, and whether this should remain HeliBoard-specific or become a more generic keyboard media-provider contract.

@Helium314

Collaborator

Starting with neutral names would be good, because only aiming for heliboard seems like an unnecessary restriction. Would be great if other keyboards could also add support for the plugins this way. For FlorisBoard the issue is still open, FUTO apparently decided against this (futo-org/android-keyboard#293 (comment)).
So it would just need a neutral variant of com.heliboard.intent.MEDIA_PROVIDER and helium314.keyboard.latin.media?
And maybe some API version matching, e.g. in discoverCapabilities or just a version int in the bundles? Or should the version rather be in the method names?

The approach looks good to me, both the use of AIDL and the methods in the interface.

I'm not 100% sure about the GPL license though, because FlorisBoard, which is Apache-2.0, should be able to include the AIDL files. You didn't specify it explicitly, but the licenses in HeliBoard and the sample apps point to GPL 3.0.

Would appreciate feedback on:

  • API shape

I think it's good, as you said it works well (tested WebDAV only).

  • naming

As discussed, I'd prefer neutral naming not specific to HeliBoard.

  • discovery flow

What do you mean? Querying the providers in the settings screen?

  • security model around plugin trust/approval

Here I'm not sure. At least at first glance, I think requiring each plugin to be enabled manually could be enough. What are your considerations here?

Other than that, I'm not sure how much damage a bad plugin could do. The plugins only receive data intended for them, so this is also fine. Sending manipulated images might happen, but as far as I understand, that's something the OS should deal with (as long as we just pass them to the ImageView, and then to the app).
Slightly related: I actually had a crash when testing due to a large image: Canvas: trying to draw too large(192000000bytes) bitmap.
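
A common way to avoid this kind of crash is to downsample before decoding. The sketch below shows only the arithmetic, in plain Java with hypothetical names and a hypothetical 16 MiB preview budget. On Android, the image bounds would typically be read with `BitmapFactory.Options.inJustDecodeBounds` and the result applied through `inSampleSize`. The real decoder may round dimensions slightly differently, so treat the byte counts as estimates.

```java
/** Hypothetical sketch: choose a power-of-two sample size that fits a byte budget. */
final class BitmapBudget {

    /** ARGB_8888 uses 4 bytes per pixel. */
    static final int BYTES_PER_PIXEL = 4;

    /** Estimated decoded size when every dimension is divided by the sample size. */
    static long decodedBytes(int width, int height, int sample) {
        long w = Math.max(1, width / sample);
        long h = Math.max(1, height / sample);
        return w * h * BYTES_PER_PIXEL;
    }

    /** Smallest power-of-two sample size whose estimated decoded size fits maxBytes. */
    static int sampleSizeFor(int width, int height, long maxBytes) {
        if (width <= 0 || height <= 0 || maxBytes < BYTES_PER_PIXEL) {
            throw new IllegalArgumentException("invalid dimensions or budget");
        }
        int sample = 1;
        while (decodedBytes(width, height, sample) > maxBytes) {
            sample *= 2; // halve each dimension per step
        }
        return sample;
    }

    public static void main(String[] args) {
        // 8000 x 6000 at 4 bytes per pixel is the 192000000 bytes from the crash message.
        long budget = 16L * 1024 * 1024; // hypothetical budget for an in-keyboard preview
        System.out.println(decodedBytes(8000, 6000, 1));
        System.out.println(sampleSizeFor(8000, 6000, budget));
    }
}
```

For the 8000 x 6000 case, a sample size of 4 brings the estimate down to 12,000,000 bytes, which fits the assumed budget.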

That definitely!

@Helium314

Collaborator

I added image functionality to clipboard history. Maybe we can make use of this, e.g. allow copying media from plugins to the internal clipboard, or avoid duplicating paste functionality.
