diff --git a/docs/lemonade/server_integration.md b/docs/lemonade/server_integration.md
index 61a5dfd4..f678ac92 100644
--- a/docs/lemonade/server_integration.md
+++ b/docs/lemonade/server_integration.md
@@ -10,6 +10,27 @@ The first part of this guide contains instructions that are common for both inte
 ## General Instructions
+### Identifying Existing Installation
+
+To identify if Lemonade Server is installed on a system, you can use the `lemonade-server` CLI command, which is added to PATH when using our installer. This is a reliable method to:
+- Verify if the server is installed.
+- Check which version is currently available by running the command below.
+
+```
+lemonade-server --version
+```
+
+>Note: The `lemonade-server` CLI command is added to PATH when using the Windows Installer (Lemonade_Server_Installer.exe). For Linux users or Python development environments, the command `lemonade-server-dev` is available when installing via pip.
+
+### Checking Server Status
+
+To identify whether or not the server is running anywhere on the system, you may use the `status` command of `lemonade-server`.
+
+```
+lemonade-server status
+```
+
+This command will return either `Server is not running` or `Server is running on port <PORT>`.
 ### Identifying Compatible Devices
@@ -54,8 +75,18 @@ Please note that the Server Installer is only available on Windows. Apps that in
 Some apps might prefer to be responsible for installing and managing Lemonade Server on behalf of the user. This part of the guide includes steps for installing and running Lemonade Server so that your users don't have to install Lemonade Server separately.
 Definitions:
-- "Silent installation" refers to an automatic command for installing Lemonade Server without running any GUI or prompting the user for any questions. It does assume that the end-user fully accepts the license terms, so be sure that your own application makes this clear to the user.
 - Command line usage allows the server process to be launched programmatically, so that your application can manage starting and stopping the server process on your user's behalf.
+- "Silent installation" refers to an automatic command for installing Lemonade Server without running any GUI or prompting the user for any questions. It does assume that the end-user fully accepts the license terms, so be sure that your own application makes this clear to the user.
+
+### Command Line Invocation
+
+This command line invocation starts the Lemonade Server process so that your application can connect to it via REST API endpoints. To start the server, simply run the command below.
+
+```bash
+lemonade-server serve
+```
+
+You can also run the server as a background process using a subprocess or any preferred method.
 ### Silent Installation
@@ -97,37 +128,3 @@ The available modes are the following:
 * `Qwen-1.5-7B-Chat-Hybrid`
 * `DeepSeek-R1-Distill-Llama-8B-Hybrid`
 * `DeepSeek-R1-Distill-Qwen-7B-Hybrid`
-
-### Command Line Invocation
-
-Command line invocation starts the Lemonade Server process so that your application can connect to it via REST API endpoints.
-
-#### Foreground Process
-
-These steps will open the Lemonade Server in a terminal window that is visible to users. The user can exit the server by closing the window.
-
-In a `cmd.exe` terminal:
-
-```bash
-conda run --no-capture-output -p INSTALL_DIR\lemonade_server\lemon_env lemonade serve
-```
-
-Where `INSTALL_DIR` is the installation path of `lemonade_server`.
-
-For example, if you used the default installation directory and your username is USERNAME:
-
-```bash
-C:\Windows\System32\cmd.exe /C conda run --no-capture-output -p C:\Users\USERNAME\AppData\Local\lemonade_server\lemon_env lemonade serve
-```
-
-#### Background Process
-
-This command will open the Lemonade Server without opening a window. Your application needs to manage terminating the process and any child processes it creates.
-
-In a powershell terminal:
-
-```powershell
-$serverProcess = Start-Process -FilePath "C:\Windows\System32\cmd.exe" -ArgumentList "/C conda run --no-capture-output -p INSTALL_DIR\lemonade_server\lemon_env lemonade serve" -RedirectStandardOutput lemonade_out.txt -RedirectStandardError lemonade_err.txt -PassThru -NoNewWindow
-```
-
-Where `INSTALL_DIR` is the installation path of `lemonade_server`.
diff --git a/installer/Installer.nsi b/installer/Installer.nsi
index 0195e8c7..877e10c6 100644
--- a/installer/Installer.nsi
+++ b/installer/Installer.nsi
@@ -95,7 +95,11 @@ SectionIn RO ; Read only, always installed
   # Pack turnkeyml repo into the installer
   # Exclude hidden files (like .git, .gitignore) and the installation folder itself
-  File /r /x nsis.exe /x installer /x .* /x *.pyc /x docs /x examples /x utilities ..\*.* run_server.bat
+  File /r /x nsis.exe /x installer /x .* /x *.pyc /x docs /x examples /x utilities ..\*.* lemonade_server.bat
+
+  # Create bin directory and move lemonade_server.bat there
+  CreateDirectory "$INSTDIR\bin"
+  Rename "$INSTDIR\lemonade_server.bat" "$INSTDIR\bin\lemonade_server.bat"
   DetailPrint "- Packaged repo"
@@ -196,7 +200,18 @@ SectionIn RO ; Read only, always installed
   DetailPrint "*** INSTALLATION COMPLETED ***"
   # Create a shortcut inside $INSTDIR
-  CreateShortcut "$INSTDIR\lemonade-server.lnk" "$SYSDIR\cmd.exe" "/C conda run --no-capture-output -p $INSTDIR\$LEMONADE_CONDA_ENV lemonade serve" "$INSTDIR\img\favicon.ico"
+  CreateShortcut "$INSTDIR\lemonade-server.lnk" "$INSTDIR\bin\lemonade_server.bat" "serve --keep-alive" "$INSTDIR\img\favicon.ico"
+
+  ; Add bin folder to system PATH
+  DetailPrint "- Adding bin directory to system PATH..."
+
+  ; Get the current PATH value from the registry
+  ReadRegStr $0 HKLM "SYSTEM\CurrentControlSet\Control\Session Manager\Environment" "PATH"
+
+  ; Add bin folder (containing 'lemonade-server') to path while avoiding duplicate entries
+  ExecWait 'setx PATH "$INSTDIR\bin;$0" -m'
+
+  DetailPrint "- Successfully updated system PATH"
   Goto end
@@ -298,7 +313,7 @@ SubSectionEnd
 Section "-Add Desktop Shortcut" ShortcutSec
   ; Create a desktop shortcut that passes the conda environment name as a parameter
-  CreateShortcut "$DESKTOP\lemonade-server.lnk" "$INSTDIR\run_server.bat" "$LEMONADE_CONDA_ENV" "$INSTDIR\img\favicon.ico"
+  CreateShortcut "$DESKTOP\lemonade-server.lnk" "$INSTDIR\bin\lemonade_server.bat" "serve --keep-alive" "$INSTDIR\img\favicon.ico"
 SectionEnd
@@ -551,5 +566,7 @@ Function .onInit
     ${EndIf}
   ${EndIf}
+  ; Call onSelChange to ensure initial model selection state is correct
+  Call .onSelChange
 FunctionEnd
\ No newline at end of file
diff --git a/installer/lemonade_server.bat b/installer/lemonade_server.bat
new file mode 100644
index 00000000..06186805
--- /dev/null
+++ b/installer/lemonade_server.bat
@@ -0,0 +1,33 @@
+@echo off
+setlocal enabledelayedexpansion
+set CONDA_ENV=lemon_env
+
+REM --keep-alive is only used by this batch script to make sure that, if the server fails to open, we don't close the terminal right away.
+REM Check for --keep-alive argument and remove it from arguments passed to CLI
+set KEEP_ALIVE=0
+set ARGS=
+for %%a in (%*) do (
+    if /I "%%a"=="--keep-alive" (
+        set KEEP_ALIVE=1
+    ) else (
+        set ARGS=!ARGS! %%a
+    )
+)
+
+REM Change to parent directory where conda env and bin folders are located
+pushd "%~dp0.."
+
+REM Run the Python CLI script through conda, passing filtered arguments
+call conda run --no-capture-output -p "%CD%\%CONDA_ENV%" lemonade-server-dev !ARGS!
+popd
+
+REM Error handling: Show message and pause if --keep-alive was specified
+if %ERRORLEVEL% neq 0 (
+    if %KEEP_ALIVE%==1 (
+        echo.
+        echo An error occurred while running Lemonade Server.
+        echo Please check the error message above.
+        echo.
+        pause
+    )
+)
diff --git a/setup.py b/setup.py
index ea49faee..6ddbc9f7 100644
--- a/setup.py
+++ b/setup.py
@@ -9,7 +9,10 @@
     version=version,
     description="TurnkeyML Tools and Models",
     author_email="turnkeyml@amd.com",
-    package_dir={"": "src", "turnkeyml_models": "models"},
+    package_dir={
+        "": "src",
+        "turnkeyml_models": "models",
+    },
     packages=[
         "turnkeyml",
         "turnkeyml.tools",
@@ -30,6 +33,7 @@
         "turnkeyml_models.torchvision",
         "turnkeyml_models.transformers",
         "lemonade_install",
+        "lemonade_server",
     ],
     install_requires=[
         "invoke>=2.0.0",
@@ -109,6 +113,7 @@
             "turnkey-llm=lemonade:lemonadecli",
             "lemonade=lemonade:lemonadecli",
             "lemonade-install=lemonade_install:installcli",
+            "lemonade-server-dev=lemonade_server.cli:main",
         ]
     },
     python_requires=">=3.8, <3.12",
diff --git a/src/lemonade/tools/serve.py b/src/lemonade/tools/serve.py
index c9f449e1..2f904bfc 100644
--- a/src/lemonade/tools/serve.py
+++ b/src/lemonade/tools/serve.py
@@ -242,8 +242,8 @@ def parser(add_help: bool = True) -> argparse.ArgumentParser:
     def run(
         self,
-        cache_dir: str,
-        checkpoint: str,
+        cache_dir: str = DEFAULT_CACHE_DIR,
+        checkpoint: str = None,
         max_new_tokens: int = DEFAULT_MAX_NEW_TOKENS,
         port: int = DEFAULT_PORT,
         log_level: str = DEFAULT_LOG_LEVEL,
diff --git a/src/lemonade_server/cli.py b/src/lemonade_server/cli.py
new file mode 100644
index 00000000..667d48b2
--- /dev/null
+++ b/src/lemonade_server/cli.py
@@ -0,0 +1,127 @@
+import argparse
+import sys
+import os
+import psutil
+
+
+def serve(args):
+    """
+    Execute the serve command
+    """
+
+    # Check if Lemonade Server is already running
+    running_on_port = get_server_port()
+    if running_on_port is not None:
+        print(
+            (
+                f"Lemonade Server is already running on port {running_on_port}\n"
+                "Please stop the existing server before starting a new instance."
+            ),
+        )
+        return
+
+    # Otherwise, start the server
+    print("Starting Lemonade Server...")
+    from lemonade.tools.serve import Server
+
+    server = Server()
+    server.run()
+
+
+def version():
+    """
+    Print the version number
+    """
+    from turnkeyml import __version__ as version_number
+
+    print(f"Lemonade Server version is {version_number}")
+
+
+def status():
+    """
+    Print the status of the server
+    """
+    port = get_server_port()
+    if port is None:
+        print("Server is not running")
+    else:
+        print(f"Server is running on port {port}")
+
+
+def is_lemonade_server(pid):
+    """
+    Check whether or not a given PID corresponds to a Lemonade server
+    """
+    try:
+        process = psutil.Process(pid)
+        while True:
+            if process.name() in [  # Windows
+                "lemonade-server-dev.exe",
+                "lemonade-server.exe",
+                "lemonade.exe",
+            ] or process.name() in [  # Linux
+                "lemonade-server-dev",
+                "lemonade-server",
+                "lemonade",
+            ]:
+                return True
+            if not process.parent():
+                return False
+            process = process.parent()
+    except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
+        return False
+    return False
+
+
+def get_server_port() -> "int | None":  # quoted: `int | None` fails at import before Python 3.10
+    """
+    Get the port that Lemonade Server is running on
+    """
+    # Go over all processes that have a port open
+    for process in psutil.process_iter(["pid", "name"]):
+        try:
+            connections = process.net_connections()
+            for conn in connections:
+                if conn.status == "LISTEN":
+                    if is_lemonade_server(process.info["pid"]):
+                        return conn.laddr.port
+        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
+            continue
+
+    return None
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Serve LLMs on CPU, GPU, and NPU.",
+        usage=argparse.SUPPRESS,
+    )
+
+    # Add version flag
+    parser.add_argument(
+        "-v", "--version", action="store_true", help="Show version number"
+    )
+
+    # Create subparsers for commands
+    subparsers = parser.add_subparsers(
+        title="Available Commands", dest="command", metavar=""
+    )
+
+    # Serve commands
+    serve_parser = subparsers.add_parser("serve", help="Start server")
+    status_parser = subparsers.add_parser("status", help="Check if server is running")
+
+    args = parser.parse_args()
+
+    if args.version:
+        version()
+    elif args.command == "serve":
+        serve(args)
+    elif args.command == "status":
+        status()
+    elif args.command == "help" or not args.command:
+        parser.print_help()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/src/turnkeyml/version.py b/src/turnkeyml/version.py
index 7f8a859d..0237e6dc 100644
--- a/src/turnkeyml/version.py
+++ b/src/turnkeyml/version.py
@@ -1 +1 @@
-__version__ = "6.1.1"
+__version__ = "6.1.2"
diff --git a/test/lemonade/server_cli.py b/test/lemonade/server_cli.py
new file mode 100644
index 00000000..75b3a59b
--- /dev/null
+++ b/test/lemonade/server_cli.py
@@ -0,0 +1,101 @@
+"""
+Usage: python server_cli.py
+
+This will launch the lemonade server and test the CLI.
+
+If you get the `ImportError: cannot import name 'TypeIs' from 'typing_extensions'` error:
+    1. pip uninstall typing_extensions
+    2. pip install openai
+"""
+
+import unittest
+import subprocess
+import asyncio
+import socket
+import time
+from threading import Thread
+import sys
+import io
+import httpx
+from server import kill_process_on_port, PORT
+from turnkeyml import __version__ as version_number
+
+try:
+    from openai import OpenAI, AsyncOpenAI
+except ImportError as e:
+    raise ImportError("You must `pip install openai` to run this test", e)
+
+
+class Testing(unittest.IsolatedAsyncioTestCase):
+    def setUp(self):
+        """
+        Start lemonade server process
+        """
+        print("\n=== Starting new test ===")
+
+        # Ensure we kill anything using the test port before and after the test
+        kill_process_on_port(PORT)
+        self.addCleanup(kill_process_on_port, PORT)
+
+    def test_001_version(self):
+        result = subprocess.run(
+            ["lemonade-server-dev", "--version"], capture_output=True, text=True
+        )
+
+        # Check that the stdout ends with the version number (some apps rely on this)
+        assert result.stdout.strip().endswith(
+            version_number
+        ), f"Expected stdout to end with '{version_number}', but got: '{result.stdout}'"
+
+    def test_002_serve_and_status(self):
+
+        # First, ensure we can correctly detect that the server is not running
+        result = subprocess.run(
+            ["lemonade-server-dev", "status"],
+            capture_output=True,
+            text=True,
+        )
+        assert (
+            result.stdout == "Server is not running\n"
+        ), f"{result.stdout} {result.stderr}"
+
+        # Now, start the server
+        process = subprocess.Popen(
+            ["lemonade-server-dev", "serve"],
+            stdout=subprocess.PIPE,
+            stderr=subprocess.PIPE,
+            text=True,
+        )
+
+        # Wait for the server to start by checking the port
+        start_time = time.time()
+        while True:
+            if time.time() - start_time > 60:
+                raise TimeoutError("Server failed to start within 60 seconds")
+            try:
+                conn = socket.create_connection(("localhost", PORT))
+                conn.close()
+                break
+            except socket.error:
+                time.sleep(1)
+
+        # Wait a few more seconds after the port is available
+        time.sleep(20)
+
+        # Now, ensure we can correctly detect that the server is running
+        result = subprocess.run(
+            ["lemonade-server-dev", "status"],
+            capture_output=True,
+            text=True,
+        )
+        assert (
+            result.stdout == f"Server is running on port {PORT}\n"
+        ), f"Expected stdout to end with '{PORT}', but got: '{result.stdout}' {result.stderr}"
+
+        # Close the server
+        process.terminate()
+        process.wait()
+
+
+if __name__ == "__main__":
+    unittest.main()
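
The docs hunk above describes `--version` and `status` as the way for an application to detect an existing installation and a running server. A minimal sketch of how a client might consume those two commands follows. It assumes the output strings stay exactly as printed by `cli.py` in this change, and that the command is discoverable on PATH under one of the two names mentioned in the docs. The helper names (`find_cli`, `parse_version`, `parse_status`, `query`) are hypothetical and not part of Lemonade.

```python
import re
import shutil
import subprocess
from typing import Optional

# Assumption: the CLI is exposed as `lemonade-server` (Windows installer)
# or `lemonade-server-dev` (pip install), as stated in the docs hunk.
CLI_CANDIDATES = ("lemonade-server", "lemonade-server-dev")


def find_cli() -> Optional[str]:
    """Return the full path of the first candidate found on PATH, else None."""
    for name in CLI_CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None


def parse_version(stdout: str) -> Optional[str]:
    """Return the last whitespace-separated token of the `--version` output.

    The test in this change only guarantees that stdout ends with the
    version number, so nothing else about the line is relied upon.
    """
    tokens = stdout.split()
    return tokens[-1] if tokens else None


def parse_status(stdout: str) -> Optional[int]:
    """Return the port from 'Server is running on port N', else None."""
    match = re.search(r"Server is running on port (\d+)", stdout)
    return int(match.group(1)) if match else None


def query(cli_path: str, *args: str) -> str:
    """Run the CLI with the given arguments and return its stdout."""
    result = subprocess.run([cli_path, *args], capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    cli = find_cli()
    if cli is None:
        print("Lemonade Server CLI not found on PATH")
    else:
        print("version:", parse_version(query(cli, "--version")))
        print("port:", parse_status(query(cli, "status")))
```

Keeping the parsing separate from the subprocess call lets the string handling be checked without a server installed. A client that gets `None` from `parse_status` could then launch `serve` as a background process, as the docs suggest.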