
Client Server Protocol

Brad Colbert edited this page Jun 16, 2025 · 1 revision

YAIL Protocol Specification

Overview

YAIL (yail.py) is a Python server that streams images over a custom TCP protocol. It supports static files, camera frames, AI-generated images, and internet image search results. It is designed for low-latency binary image transmission to retro-hardware clients, optionally with palette data for advanced graphics modes.

The protocol is simple, line-oriented for command input, and binary for image output. It supports multiple graphics modes, error reporting, and runtime configuration.


Connection and Session

  • Transport: TCP socket (default port 5556)
  • Client/Server Model: The server listens for connections. Each client connection is handled in a dedicated thread.
  • Timeout: 5 minutes (300 seconds) per connection.
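The client side of this setup can be sketched as follows. This is a minimal illustration, not code from yail.py: the connect helper and the constant names are hypothetical, and mirroring the server's 300-second timeout on the client socket is an assumption about what a client might reasonably do.

```python
import socket

YAIL_PORT = 5556          # default port listed above
SESSION_TIMEOUT = 300.0   # server-side timeout in seconds, mirrored here by assumption


def connect(host: str, port: int = YAIL_PORT,
            timeout: float = SESSION_TIMEOUT) -> socket.socket:
    """Open a TCP connection to a YAIL server (hypothetical helper)."""
    # create_connection resolves the host, connects, and applies the timeout
    # to the returned socket's later send/recv calls as well.
    return socket.create_connection((host, port), timeout=timeout)
```

A client would call connect("example-host") and then exchange commands and packets on the returned socket; the host name here is a placeholder.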

Supported Commands

Clients communicate with the server by sending text commands. Each command is a space-separated string terminated by a newline (\n or \r\n). The server typically responds by streaming image data, sending a status message, or both.
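The command framing described above can be sketched in a few lines. The function names are hypothetical, and UTF-8 is an assumed encoding, since this page does not state one for commands.

```python
def encode_command(name: str, *args: str) -> bytes:
    """Join a command and its arguments with spaces and append a newline."""
    line = " ".join((name,) + args)
    if "\r" in line or "\n" in line:
        raise ValueError("a command must fit on a single line")
    return line.encode("utf-8") + b"\n"   # assumed encoding: UTF-8


def split_command(raw: bytes):
    """Split a received line into (command, remainder); accepts LF or CRLF."""
    text = raw.decode("utf-8").rstrip("\r\n")
    name, _, rest = text.partition(" ")
    return name, rest
```

For example, encode_command("search", "cats") yields the bytes for search cats followed by a newline, and split_command accepts either line ending.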

Command List

Command                        Description
video                          Capture and stream a frame from the camera.
search <term>                  Search for images using DuckDuckGo and stream a random result.
gen <model> <prompt>           Generate an image with the specified model and prompt, then stream it.
gen <prompt>                   Generate an image from the prompt with the default model, then stream it.
gen-gemini <prompt>            Generate an image using Gemini and stream it.
files                          Stream a random image from the loaded local files.
next                           Repeat the last operation (e.g., next image in current mode).
gfx <mode>                     Set the graphics mode (8, 9, or 16 for VBXE).
openai-config <param> <value>  Set runtime options for OpenAI generation (model, size, quality, etc.).
quit                           End the session and close the connection.

Command Details

  • video: Captures a single frame from the camera and sends it as an image.
  • search <term>: Uses DuckDuckGo to find images related to <term>, then streams one at random.
  • gen <model> <prompt> and gen <prompt>: Uses an AI backend (OpenAI DALL-E, Gemini) to generate an image from the prompt, then streams it.
  • files: Streams a random image from a pre-loaded list of files (configured at server start).
  • next: In search, video, files, or generate mode, repeats the last operation with a new image or frame.
  • gfx <mode>: Adjusts graphics mode for image transmission (see below).
  • openai-config <param> <value>: Alters AI generation settings at runtime. Supported params:
    • model: e.g., dall-e-3, dall-e-2
    • size: 1024x1024, 1792x1024, 1024x1792
    • quality: standard, hd
    • style: vivid, natural
    • system_prompt: sets the system prompt for image generation.
  • quit: Closes the connection.
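As an illustration of the openai-config parameters listed above, a client could check values before sending them. This is a sketch under assumptions: the helper name is hypothetical, the model values are given only as examples so they are not validated, and whether the server itself rejects other values is not stated here.

```python
# Values listed above for the parameters that have a fixed set of choices.
KNOWN_CHOICES = {
    "size": ("1024x1024", "1792x1024", "1024x1792"),
    "quality": ("standard", "hd"),
    "style": ("vivid", "natural"),
}
# Parameters that are free-form, or whose values are listed only as examples.
FREE_FORM = ("model", "system_prompt")


def openai_config_line(param: str, value: str) -> str:
    """Build an openai-config command line after a client-side sanity check."""
    if param in KNOWN_CHOICES:
        if value not in KNOWN_CHOICES[param]:
            raise ValueError(f"unsupported {param}: {value!r}")
    elif param not in FREE_FORM:
        raise ValueError(f"unknown parameter: {param!r}")
    return f"openai-config {param} {value}\n"
```

Checking on the client side avoids a round trip for obviously invalid values, at the cost of having to keep the table in step with the server.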

Graphics Modes

The protocol supports several display modes; the mode affects how images are encoded and transmitted:

Mode Name   Value  Resolution           Notes
GRAPHICS_8  2      320 x 220            Monochrome dithered
GRAPHICS_9  4      320 x 220            16-level grayscale
VBXE        16     640 x 480 (palette)  256-color palette
  • The mode can be changed with the gfx command.
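The mode values in the table can be captured in a small enumeration. This is a sketch: the class and dictionary names are hypothetical. Treating the Value column as the byte carried in the packet header, and mapping the gfx arguments 8, 9, and 16 onto the modes by name, are both assumptions rather than something this page states outright.

```python
from enum import IntEnum


class GraphicsMode(IntEnum):
    """Mode values from the table above (assumed to be the header byte)."""
    GRAPHICS_8 = 2    # 320 x 220, monochrome dithered
    GRAPHICS_9 = 4    # 320 x 220, 16-level grayscale
    VBXE = 16         # 640 x 480, 256-color palette


# Assumption: the argument of the gfx command selects the mode by name.
GFX_ARGUMENTS = {
    "8": GraphicsMode.GRAPHICS_8,
    "9": GraphicsMode.GRAPHICS_9,
    "16": GraphicsMode.VBXE,
}


def describe_mode(header_byte: int) -> str:
    """Return the mode name for a header byte; raises ValueError if unknown."""
    return GraphicsMode(header_byte).name
```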

Image Transmission Protocol

After a command that triggers image output, the server streams a binary packet with the following structure:

Standard YAIL Image Packet

Offset  Size     Description
0       3 bytes  Version, e.g., [1, 1, 0]
3       1 byte   Graphics mode (see table above)
4       1 byte   Memory block type (e.g., 3 for image block)
5       2 bytes  Image data size (little-endian)
7       N bytes  Image data (packed bits or 4-bit shades)
  • GRAPHICS_8: Sends a dithered monochrome bitmap, packed as bits.
  • GRAPHICS_9: Sends a grayscale image, packed as two 4-bit pixels per byte.
  • VBXE: Includes additional palette data (see below).
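The standard packet layout above maps directly onto a struct format string. The following is a minimal parsing sketch, assuming the whole packet is already in memory; StandardPacket and parse_standard_packet are hypothetical names, not part of yail.py.

```python
import struct
from typing import NamedTuple, Tuple


class StandardPacket(NamedTuple):
    version: Tuple[int, ...]
    mode: int
    block_type: int
    image: bytes


# "<" = little-endian with no padding: 3-byte version, mode byte,
# block-type byte, 2-byte image size. Total: 7 bytes.
HEADER = struct.Struct("<3sBBH")


def parse_standard_packet(data: bytes) -> StandardPacket:
    """Split a standard YAIL image packet into its header fields and payload."""
    if len(data) < HEADER.size:
        raise ValueError("packet shorter than the 7-byte header")
    version, mode, block_type, size = HEADER.unpack_from(data)
    image = data[HEADER.size:HEADER.size + size]
    if len(image) != size:
        raise ValueError("image data is truncated")
    return StandardPacket(tuple(version), mode, block_type, image)
```

The payload is returned still packed; unpacking bits or 4-bit pixels depends on the graphics mode and is left out of this sketch.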

VBXE (Palette) Mode

Offset  Size     Description
0       3 bytes  Version [1, 4, 0]
3       1 byte   Graphics mode (16)
4       1 byte   Number of memory blocks (2)
5       1 byte   Block type (PALETTE_BLOCK = 0x06)
6       4 bytes  Palette size (little-endian)
10      N bytes  Palette data (RGB triplets)
...     1 byte   Block type (IMAGE_BLOCK = 0x07)
...     4 bytes  Image data size (little-endian)
...     N bytes  Image data (indexed by palette)
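Because each block carries its own type and 4-byte size, a VBXE packet can be walked block by block. The sketch below assumes the full packet is in memory and that blocks follow one another with no padding, as the table suggests; the function names are hypothetical.

```python
import struct

PALETTE_BLOCK = 0x06
IMAGE_BLOCK = 0x07

PREAMBLE = struct.Struct("<3sBB")   # version, graphics mode, number of blocks
BLOCK_HEADER = struct.Struct("<BI")  # block type, 4-byte little-endian size


def parse_block_packet(data: bytes):
    """Return (version, mode, [(block_type, block_bytes), ...])."""
    version, mode, count = PREAMBLE.unpack_from(data, 0)
    offset = PREAMBLE.size
    blocks = []
    for _ in range(count):
        block_type, size = BLOCK_HEADER.unpack_from(data, offset)
        offset += BLOCK_HEADER.size
        body = data[offset:offset + size]
        if len(body) != size:
            raise ValueError("block data is truncated")
        blocks.append((block_type, body))
        offset += size
    return tuple(version), mode, blocks


def palette_colors(palette: bytes):
    """Split palette data into (R, G, B) tuples."""
    if len(palette) % 3:
        raise ValueError("palette length is not a multiple of 3")
    return [tuple(palette[i:i + 3]) for i in range(0, len(palette), 3)]
```

Each byte of the image block is then an index into the list returned by palette_colors.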

Error Reporting

If an error occurs, the server sends an error packet:

Offset  Size     Description
0       3 bytes  Version [1, 4, 0]
3       1 byte   Graphics mode
4       1 byte   Number of memory blocks (1)
5       1 byte   Block type (ERROR_BLOCK = 0xFF)
6       4 bytes  Error message length (little-endian)
10      N bytes  UTF-8 error message
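The error packet has a fixed 10-byte header followed by the message, so both directions fit in a few lines. This is a sketch with hypothetical function names; it follows the table above and is not taken from yail.py.

```python
import struct
from typing import Optional

ERROR_BLOCK = 0xFF
# version (3 bytes), mode, block count, block type, 4-byte length = 10 bytes
ERROR_HEADER = struct.Struct("<3sBBBI")


def build_error_packet(message: str, mode: int) -> bytes:
    """Encode an error packet as laid out in the table above."""
    body = message.encode("utf-8")
    header = ERROR_HEADER.pack(bytes([1, 4, 0]), mode, 1, ERROR_BLOCK, len(body))
    return header + body


def read_error_message(packet: bytes) -> Optional[str]:
    """Return the error text, or None if this is not an error packet."""
    if len(packet) < ERROR_HEADER.size:
        return None
    _version, _mode, _count, block_type, length = ERROR_HEADER.unpack_from(packet)
    if block_type != ERROR_BLOCK:
        return None
    start = ERROR_HEADER.size
    return packet[start:start + length].decode("utf-8")
```

A client can call read_error_message on a received block packet first and fall back to image handling when it returns None.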

Text Responses

  • For non-image or status responses, the server replies with text prefixed by OK: or ERROR: and terminated with \r\n.
  • Example: OK: Image size set to 1024x1024\r\n
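A client can split such a status line into a success flag and a message. The helper below is a hypothetical sketch; UTF-8 decoding is an assumption, since this page does not state an encoding for text responses.

```python
from typing import Tuple


def parse_status_line(raw: bytes) -> Tuple[bool, str]:
    """Turn b'OK: ...\\r\\n' or b'ERROR: ...\\r\\n' into (ok, message)."""
    text = raw.decode("utf-8")          # assumed encoding
    if not text.endswith("\r\n"):
        raise ValueError("status line is not terminated with CRLF")
    text = text[:-2]
    for prefix, ok in (("OK:", True), ("ERROR:", False)):
        if text.startswith(prefix):
            return ok, text[len(prefix):].strip()
    raise ValueError("status line has neither an OK: nor an ERROR: prefix")
```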

HTTP Detection

If the server detects an HTTP client (request starts with GET, POST, etc.), it replies with:

HTTP/1.1 403 Forbidden
Content-Type: text/plain
Content-Length: 11

Not Allowed

and closes the connection.
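The check and the reply can be sketched as below. The function names are hypothetical. The method list beyond GET and POST is an assumption, since the text above only says "etc.", and how yail.py actually performs the check is not shown here.

```python
# GET and POST are named above; the remaining methods are assumed.
HTTP_METHODS = (b"GET", b"POST", b"HEAD", b"PUT", b"DELETE", b"OPTIONS", b"PATCH")


def looks_like_http(first_line: bytes) -> bool:
    """True if the first token of the line is an HTTP method."""
    parts = first_line.split(None, 1)
    return bool(parts) and parts[0] in HTTP_METHODS


def forbidden_response() -> bytes:
    """Build the 403 reply shown above, using CRLF line endings as HTTP requires."""
    body = b"Not Allowed"
    head = ("HTTP/1.1 403 Forbidden\r\n"
            "Content-Type: text/plain\r\n"
            "Content-Length: %d\r\n"
            "\r\n" % len(body))
    return head.encode("ascii") + body
```

Computing Content-Length from the body keeps the header correct if the message text is ever changed.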


Session Modes

The server tracks each client's current mode (e.g., search, generate, video, files) so that the next command can act appropriately, for example by fetching the next image or regenerating from the last prompt.
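One way to picture this per-client state is a small object that remembers the last image command and replays it for next. This is purely illustrative: ClientState, COMMAND_MODES, and the command-to-mode mapping are hypothetical and do not describe how yail.py stores its state.

```python
from typing import Optional, Tuple

# Hypothetical mapping from image commands to the modes named above.
COMMAND_MODES = {
    "search": "search",
    "gen": "generate",
    "gen-gemini": "generate",
    "video": "video",
    "files": "files",
}


class ClientState:
    """Remembers the last image command so that `next` can repeat it."""

    def __init__(self) -> None:
        self.mode: Optional[str] = None
        self.last_line: Optional[str] = None

    def resolve(self, line: str) -> Tuple[str, str]:
        """Return (mode, command line to run) for an incoming command line."""
        name = line.split(" ", 1)[0]
        if name == "next":
            if self.mode is None or self.last_line is None:
                raise ValueError("nothing to repeat yet")
            return self.mode, self.last_line
        if name not in COMMAND_MODES:
            raise ValueError(f"not an image command: {name!r}")
        self.mode = COMMAND_MODES[name]
        self.last_line = line
        return self.mode, line
```

Keeping one such object per connection matches the idea that each client has independent state.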


Example Session

  1. Connect via TCP to server port 5556.
  2. Send: search cats\n
    • Receive: Binary image data (YAIL packet).
  3. Send: next\n
    • Receive: Another random "cat" image.
  4. Send: gfx 16\n
    • Receive: OK: ...
  5. Send: gen-gemini "dog in space"\n
    • Receive: Binary image data.
  6. Send: quit\n
    • Connection closed.
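Steps 2 and 3 of this session receive a binary packet over a TCP stream, where a single recv call may return only part of it. The sketch below shows one way a client might read a complete standard packet. The function names are hypothetical, and it covers only the 7-byte-header format used before gfx 16 is sent, not the VBXE layout.

```python
import socket
import struct


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes, looping because recv may return fewer."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("connection closed mid-packet")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_standard_image(sock: socket.socket):
    """Read one standard packet and return (graphics mode, image bytes)."""
    header = recv_exact(sock, 7)
    _version, mode, _block_type, size = struct.unpack("<3sBBH", header)
    return mode, recv_exact(sock, size)
```

After sending search cats, a client would call read_standard_image on the same socket to obtain the image payload.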

Implementation Notes

  • The server supports runtime configuration via both environment variables and command-line arguments.
  • Image generation is handled via pluggable backends (OpenAI, Gemini).
  • Local images are loaded at startup from a specified directory or list.
  • The server supports multiple simultaneous clients, each with independent state.

Extending the Protocol

  • New commands can be added for additional image sources or AI models.
  • Graphics modes and packet formats can be extended for new hardware or color depths.
