Add Lium-backed inference for sampling (opt-in, no sampling logic changes) #165
Conversation
- affine/lium_backend.py: provision Lium pod, host minimal OpenAI-style server, expose via public port or SSH tunnel
- tasks: in AFFINE_USE_LIUM mode, route Evaluator payload base_url to Lium server; relax CHUTES_API_KEY requirement
- miners: bypass Chutes lookups/filters when AFFINE_USE_LIUM=1

No changes to sampling/pre/post logic; only the inference backend is swapped when enabled.
Pull Request Overview
This PR introduces an opt-in Lium GPU backend for model inference while maintaining backward compatibility with the existing Chutes infrastructure. When AFFINE_USE_LIUM=1 is set, the system provisions ephemeral GPU pods on Lium, starts an OpenAI-compatible model server, and routes sampling requests through either public URLs or SSH tunnels.
Key Changes
- Adds affine/lium_backend.py with pod provisioning, model server deployment, and SSH tunnel management
- Modifies affine/tasks.py to conditionally route inference requests to Lium servers and relax API key requirements
- Updates affine/miners.py to bypass Chutes validation checks when operating in Lium mode
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 22 comments.
| File | Description |
|---|---|
| affine/lium_backend.py | New module implementing Lium pod provisioning, inline OpenAI-compatible server creation, SSH tunnel management, and resource cleanup |
| affine/tasks.py | Conditional routing to Lium backend via ensure_model_server() and relaxed environment variable requirements in Lium mode |
| affine/miners.py | Bypass Chutes metadata validation (chute name, revision, affine prefix checks) when AFFINE_USE_LIUM=1 is enabled |
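For orientation, here is a minimal sketch of the opt-in routing the PR describes, using the ModelServer and ensure_model_server names from the diff. The ModelServer field defaults, the stubbed provisioning, and the route_payload helper name are assumptions for illustration, not code from the PR.

```python
import os
from dataclasses import dataclass

@dataclass
class ModelServer:
    # Assumed shape: the diff only shows callers using .base_url and .pod (with .pod.id).
    base_url: str
    pod: object = None

def ensure_model_server(model: str) -> ModelServer:
    # Stand-in for affine.lium_backend.ensure_model_server: the real function provisions
    # a Lium pod and starts an OpenAI-compatible server; a placeholder keeps this runnable.
    return ModelServer(base_url="http://127.0.0.1:8000/v1")

def route_payload(payload: dict, miner) -> dict:
    # Opt-in switch: only AFFINE_USE_LIUM=1 changes where inference requests are sent.
    if os.getenv("AFFINE_USE_LIUM") == "1":
        srv = ensure_model_server(miner.model)
        payload["base_url"] = srv.base_url
    else:
        # Default Chutes path, unchanged (URL pattern taken from the diff below).
        payload["base_url"] = f"https://{miner.slug}.chutes.ai/v1"
    return payload
```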
```python
pod = _LIUM.wait_ready(pod_dict, timeout=900)
if not pod:
    raise RuntimeError("Lium pod failed to become ready")
```

Copilot AI (Nov 13, 2025)
[nitpick] The 900-second (15-minute) timeout for wait_ready is quite long and could cause the application to hang for extended periods if pod provisioning fails. Consider one or more of the following:
- Reducing the timeout to a more reasonable value (e.g., 300 seconds / 5 minutes)
- Adding logging to inform users that pod provisioning is in progress
- Making the timeout configurable via an environment variable
Additionally, the error message "Lium pod failed to become ready" doesn't provide details about why it failed, making debugging difficult.
Suggested change:

```python
timeout = int(os.getenv("AFFINE_LIUM_POD_TIMEOUT", "300"))
print(f"Provisioning Lium pod for model '{model}' (timeout: {timeout}s)...")
pod = _LIUM.wait_ready(pod_dict, timeout=timeout)
if not pod:
    pod_info = pod_dict if pod_dict else {}
    raise RuntimeError(
        f"Lium pod failed to become ready after {timeout}s. "
        f"Pod info: {pod_info}"
    )
```
```python
ssh_cmd = [
    "ssh",
    "-i", key,
    "-N",
    "-o", "StrictHostKeyChecking=no",
```

Copilot AI (Nov 13, 2025)
[nitpick] Security concern: The SSH command uses -o StrictHostKeyChecking=no, which disables host key verification. This makes the connection vulnerable to man-in-the-middle attacks. While this may be acceptable for ephemeral pods with dynamic IPs, consider documenting this security trade-off or providing an option to use a known_hosts file if the Lium infrastructure uses stable host keys.
Suggested change:

```python
# SECURITY NOTE: By default, disables SSH host key verification for ephemeral pods.
# If your infrastructure uses stable host keys, set AFFINE_KNOWN_HOSTS_PATH to a
# known_hosts file to enable verification and mitigate MITM risks.
known_hosts_path = os.environ.get("AFFINE_KNOWN_HOSTS_PATH")
ssh_cmd = [
    "ssh",
    "-i", key,
    "-N",
]
if known_hosts_path:
    ssh_cmd += [
        "-o", f"UserKnownHostsFile={known_hosts_path}",
        "-o", "StrictHostKeyChecking=yes",
    ]
else:
    ssh_cmd += [
        "-o", "StrictHostKeyChecking=no",
    ]
ssh_cmd += [
```
```python
if os.getenv("AFFINE_USE_LIUM") == "1":
    return {}
```

Copilot AI (Nov 13, 2025)
[nitpick] When AFFINE_USE_LIUM=1, this method returns an empty dictionary and skips the CHUTES_API_KEY validation entirely. However, if the Lium backend is being used alongside Chutes (e.g., in a mixed environment), or if the env_vars are needed for other purposes beyond authentication, this could cause issues. Consider documenting this behavior or providing a more targeted exemption that only skips the specific validation that's not needed for Lium mode.
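No concrete suggestion was attached here; a more targeted exemption might look like the sketch below. The env_vars function name, its surrounding logic, and the error wording are assumptions for illustration only, since the rest of the method is not shown in this diff.

```python
import os

def env_vars() -> dict:
    # Hypothetical targeted exemption: still return whatever credentials exist,
    # but only skip the hard CHUTES_API_KEY requirement when Lium mode is active.
    lium_mode = os.getenv("AFFINE_USE_LIUM") == "1"
    chutes_key = os.getenv("CHUTES_API_KEY")
    if not chutes_key and not lium_mode:
        raise RuntimeError("CHUTES_API_KEY must be set unless AFFINE_USE_LIUM=1")
    env = {}
    if chutes_key:
        env["CHUTES_API_KEY"] = chutes_key
    return env
```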
```python
from affine.lium_backend import ensure_model_server
srv = ensure_model_server(miner.model)
payload["base_url"] = srv.base_url
```

Copilot AI (Nov 13, 2025)
[nitpick] The import statement from affine.lium_backend import ensure_model_server is inside the conditional block, which is good for avoiding import errors when Lium is not used. However, if ensure_model_server raises an exception during execution, the error handling context may be unclear. Consider adding a try-except block with a descriptive error message:
```python
if os.getenv("AFFINE_USE_LIUM") == "1":
    try:
        from affine.lium_backend import ensure_model_server
        srv = ensure_model_server(miner.model)
        payload["base_url"] = srv.base_url
    except Exception as e:
        raise RuntimeError(f"Failed to initialize Lium backend: {e}") from e
else:
    payload["base_url"] = f"https://{miner.slug}.chutes.ai/v1"
```
```python
def ensure_model_server(model: str) -> ModelServer:
    """Return a live server for model, provisioning if necessary."""
```

Copilot AI (Nov 13, 2025)
Missing input validation for the model parameter. If model is empty or None, it could lead to accessing the wrong cached server or creating an invalid cache entry. Consider adding validation at the start of the function:
```python
if not model:
    raise ValueError("model parameter cannot be empty")
```
```python
def _gen(prompt: str, max_new: int, temp: float) -> str:
    encoded = tok(prompt, return_tensors="pt").to(device)
    with torch.inference_mode():
        out = model.generate(
            **encoded,
            do_sample=(temp > 0),
            temperature=max(1e-5, temp),
            max_new_tokens=max_new,
            pad_token_id=tok.pad_token_id,
        )
    return tok.decode(out[0], skip_special_tokens=True)
```

Copilot AI (Nov 13, 2025)
The inline server code has a potential issue with the generation logic for decoder-only models. For causal language models, the decode operation will include the input prompt in the output, but the code doesn't strip it. This means the response will contain both the prompt and the completion, which may not match the expected OpenAI API behavior. Consider updating the decoder logic:
```python
def _gen(prompt: str, max_new: int, temp: float) -> str:
    encoded = tok(prompt, return_tensors="pt").to(device)
    input_length = encoded.input_ids.shape[1]
    with torch.inference_mode():
        out = model.generate(
            **encoded,
            do_sample=(temp > 0),
            temperature=max(1e-5, temp),
            max_new_tokens=max_new,
            pad_token_id=tok.pad_token_id,
        )
    # For causal models, skip the input tokens
    decoded = tok.decode(out[0][input_length:], skip_special_tokens=True)
    return decoded
```

```python
def _choose_executor() -> ExecutorInfo:
    """Choose a Lium executor, preferring GPU types from AFFINE_LIUM_GPU."""
    prefs = (os.getenv("AFFINE_LIUM_GPU") or "").split(",")
    prefs = [p.strip().upper() for p in prefs if p.strip()]
    exs = _LIUM.ls()
    if prefs:
        for p in prefs:
            for e in exs:
                if e.gpu_type.upper().startswith(p):
                    return e
    # Prefer docker-in-docker for easier server setups
    for e in exs:
        if getattr(e, "docker_in_docker", False):
            return e
    return exs[0]
```
Copilot AI (Nov 13, 2025)
The executor list could be empty, which would cause an IndexError on line 61 (return exs[0]). Add validation to handle this case:
```python
if not exs:
    raise RuntimeError("No Lium executors available")
# Prefer docker-in-docker for easier server setups
for e in exs:
    if getattr(e, "docker_in_docker", False):
        return e
return exs[0]
```

```python
import atexit
import os
import random
import shlex
```
Copilot AI (Nov 13, 2025)
Import of 'shlex' is not used.
Suggested change: delete the unused import shlex line.
```python
for port, proc in list(_SSH_TUNNELS.items()):
    try:
        proc.terminate()
    except Exception:
```

Copilot AI (Nov 13, 2025)
'except' clause does nothing but pass and there is no explanatory comment.
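No suggestion was attached here; a brief comment analogous to the one proposed for pod deletion below would address it. The wording is a suggestion, not from the PR:

```python
except Exception:
    # Best-effort cleanup: the tunnel process may already have exited.
```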
```python
for model, srv in list(_MODEL_TO_SERVER.items()):
    try:
        _LIUM._request("DELETE", f"/pods/{srv.pod.id}")
    except Exception:
```

Copilot AI (Nov 13, 2025)
'except' clause does nothing but pass and there is no explanatory comment.
Suggested change:

```python
except Exception:
    # Ignore errors during pod deletion; best-effort cleanup.
```
Route sampling inference to ephemeral Lium GPU pods while keeping sampling, pre- and post-processing unchanged.
- New affine/lium_backend.py: provisions a pod, starts a minimal OpenAI-style model server, and exposes it via a public port or an SSH tunnel (a hypothetical server sketch follows this list).
- affine/tasks.py: when AFFINE_USE_LIUM=1, sets base_url to the Lium server and relaxes the CHUTES_API_KEY requirement.
- affine/miners.py: bypasses Chutes lookups/filters in Lium mode only.
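For reference, a minimal sketch of what such an OpenAI-style completions endpoint could look like is shown below, assuming FastAPI/uvicorn and a generate() callable like the _gen() helper quoted in the review. The PR's actual server code is not reproduced here, so the endpoint path, field names, and port are illustrative assumptions.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

def generate(prompt: str, max_new: int, temp: float) -> str:
    # Placeholder: on the pod this would call the loaded model (cf. _gen above).
    return prompt

class CompletionRequest(BaseModel):
    model: str = ""
    prompt: str = ""
    max_tokens: int = 256
    temperature: float = 0.0

@app.post("/v1/completions")
def completions(req: CompletionRequest) -> dict:
    # OpenAI-style completions response shape.
    text = generate(req.prompt, req.max_tokens, req.temperature)
    return {
        "object": "text_completion",
        "model": req.model,
        "choices": [{"index": 0, "text": text, "finish_reason": "stop"}],
    }

# Run on the pod with, e.g.: uvicorn server:app --host 0.0.0.0 --port 8000
```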
Usage:
export LIUM_API_KEY=...
export AFFINE_USE_LIUM=1
Optional: export AFFINE_LIUM_GPU="H200,A100"
Run existing commands unchanged (e.g., python -m affine.cli runner).
Compatibility: Default behavior unchanged (Chutes path). Lium is strictly opt-in via env var.