Skip to content

Drop support for flm_args#2015

Open
superm1 wants to merge 1 commit into
mainfrom
superm1/SWSPLAT-24182
Open

Drop support for flm_args#2015
superm1 wants to merge 1 commit into
mainfrom
superm1/SWSPLAT-24182

Conversation

@superm1
Copy link
Copy Markdown
Member

@superm1 superm1 commented May 26, 2026

Arguments are passed without validation into flm which can cause unintended arguments to override behavior in flm.

There are two options to solve this:

  1. Drop all flm argument support
  2. Whitelist individual arguments that should be passed down.

This aims for the former.

Arguments are passed without validation into `flm` which can cause
unintended arguments to override behavior in `flm`.

There are two options to solve this:
1. Drop all flm argument support
2. Whitelist individual arguments that should be passed down.

This aims for the former.
@fl0rianr
Copy link
Copy Markdown
Collaborator

Given the broader backend-args discussion in #2018, I think this is probably the right security area to tighten.

One thing I’m trying to understand: should flm_args be treated differently from the other *_args surfaces?

For llama.cpp / whisper.cpp / sd.cpp, we’re discussing whether to preserve expert args with stronger validation / expanded deny-listing because a strict allow-list can become a breaking maintenance burden. This PR takes the stricter route for FLM and removes the surface entirely.

That may be the right call if FLM does not have a meaningful set of safe tuning args, or if its argument semantics make validation too brittle. If so, maybe worth stating that explicitly in the PR description so the security rationale is clear and this doesn’t read as an accidental feature drop.

Otherwise, an alternative consistent with the #2018 direction could be to keep flm_args but block Lemonade-managed / dangerous flags like host, port, mode/model selection, path/file-loading flags, and add regression coverage for override attempts.

@superm1
Copy link
Copy Markdown
Member Author

superm1 commented May 26, 2026

Well the difference here is that flm doesn't have meaningful tuning args to pass in the first place though.

So our deny list would pretty much cover them all wouldn't it?

So that's why I lean tear it all out.

Copy link
Copy Markdown
Collaborator

@fl0rianr fl0rianr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thanks for clarifying. I’m good with removing flm_args for now.

Given that FLM does not currently have many meaningful safe tuning args beyond the options Lemonade already manages, removing the surface seems preferable to maintaining a near-complete deny-list.

I would just prefer that we don’t treat this as a permanent policy decision against FLM-specific tuning. If FLM later has a cleaner integration story, especially on Linux where it can be managed more like the other Lemonade backends, I think it should be fine to reintroduce explicit safe options or a narrower expert surface as needed.

Longer term, I’d like FLM not to remain a special case in the backend model. But that’s not a reason to block this PR now.

@bitgamma
Copy link
Copy Markdown
Member

from flm help:

  --prefill-chunk-len arg (=-1)    Set prefill chunk length
  -r [ --img-pre-resize ] arg (=2) Pre-resize the image, 0: original size, 1: 
                                   height = 480, 2: height = 720, 3: height = 
                                   1080, 4: height = 1440
  -s [ --socket ] arg (=10)        Set the maximum number of socket connections allowed (for serve command)
  -q [ --q-len ] arg (=10)         Set number of max npu queue length (for serve command)

are none of these really relevant? especially the first two look like something that could be tuned

@fl0rianr
Copy link
Copy Markdown
Collaborator

These are good points. As I said this is not like a final target situation, a quick security fix. And truly an arg as prefill-chunk-len is great, maybe some of it can be hard-coded in the meanwhile.

@superm1
Copy link
Copy Markdown
Member Author

superm1 commented Jun 3, 2026

I wasn't aware of some of those; some of them might make sense to leave as tunable.
But I'm starting to feel like the real issue is that we allow tuning and loading new models from the same API interface you use to interact with an LLM.

It's one of those don't 💩 where you eat type of things. I'm feeling like we should be having separate or at least segregated configuration. We should be defaulting to a randomly generated API key, not defaulting to open. There should be a different API key for accessing the LLM from configuring lemond.

So yes; I feel like this is step 1 for security to tear out argument support; but it's part of a bigger story that lemond and lemonade need to be locked down by default.

@bitgamma
Copy link
Copy Markdown
Member

bitgamma commented Jun 5, 2026

using a random API key generated during installation will generate friction, but I also think it is the way to go at least for the admin endpoints. No idea how this can be done without the user just thinking the software is broken though.

There should be a different API key for accessing the LLM from configuring lemond.

There is already LEMONADE_ADMIN_API_KEY for this, but if it is not set, it defaults to the "normal" API key (or empty).

Another layer of protection is that lemonade config set only works if the request comes on the local interface, not remotely (not even LAN)

I feel like this is step 1 for security to tear out argument support

Tearing out argument support is throwing the baby out with the bath water imho.

@superm1
Copy link
Copy Markdown
Member Author

superm1 commented Jun 5, 2026

using a random API key generated during installation will generate friction, but I also think it is the way to go at least for the admin endpoints. No idea how this can be done without the user just thinking the software is broken though.

Maybe this needs some sort of of onboarding flow.

Another layer of protection is that lemonade config set only works if the request comes on the local interface, not remotely (not even LAN)

There is also this: #2110

Tearing out argument support is throwing the baby out with the bath water imho.

When you start to chain vulnerabilities due to design decisions of flexibility you sometimes need to.

@github-actions github-actions Bot added engine::flm FastFlowLM backend (NPU); multi-modal LLM/ASR/embeddings/reranking enhancement New feature or request labels Jun 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

engine::flm FastFlowLM backend (NPU); multi-modal LLM/ASR/embeddings/reranking enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants