Optional. This guide is for the Petals distributed-inference demo (server + client, text generation). It does not cover the main product — the browser LLM State Viz (hidden-state 3D trajectory). For that, run uvicorn viz_server:app --reload and open http://localhost:8000. Petals does not expose hidden states to the client.
This guide will walk you through creating a video or animated GIF demonstration of the Enhanced Petals for Apple Silicon project.
To get that "Disney-like" animation or an MP4, you'll want to record your screen while performing the steps below. Here are some popular tools:
- macOS: QuickTime Player (comes built-in; File > New Screen Recording)
- Cross-platform: OBS Studio (free, open-source, and powerful)
- Other tools: Kap (GIFs on macOS)

After recording, you can edit the footage to add annotations, zoom in on important details, and create a polished, distributable video.
Follow these steps to showcase the key features of your project.
- Clone the Repository: If you haven't already, clone your project:
# Replace 'terrafying' with your actual GitHub username/organization if different
git clone https://github.com/terrafying/petals-2-metal new-patella && cd new-patella
- Install Dependencies: Ensure all dependencies are installed:
pip install -r requirements.txt
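Before recording, it can help to confirm that PyTorch can actually see the Metal (MPS) backend, since the server step later highlights MPS acceleration. The following is a minimal preflight sketch. It assumes `torch` is among the packages installed from `requirements.txt`, which this guide does not state explicitly; the function name `describe_mps_support` is made up for illustration.

```python
import platform


def describe_mps_support() -> str:
    """Return a short, human-readable summary of MPS availability."""
    try:
        import torch  # assumption: installed via requirements.txt
    except ImportError:
        return "torch not installed"

    mps = getattr(torch.backends, "mps", None)
    if mps is None:
        return "torch has no MPS backend (version too old)"
    if not mps.is_built():
        return "torch was built without MPS support"
    if not mps.is_available():
        return "MPS built but not available on this machine"
    return "MPS available"


if __name__ == "__main__":
    # Example output shape: "<system> <machine>: <summary>"
    print(f"{platform.system()} {platform.machine()}: {describe_mps_support()}")
```

If the summary is anything other than "MPS available", the server may fall back to another device, and the MPS confirmation you want to highlight may not appear in the recording.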
- Open a new terminal window.
- Navigate to your project directory (`petals-metal`, or `new-patella` if you cloned with the command above).
- Run the local server script:
./run-local.sh
- What to look for (and highlight in your recording):
- Messages about system resource checks.
- Logs indicating detection of existing Petals swarms or creation of a new one.
- Confirmation that the server is using MPS (Metal Performance Shaders) acceleration.
- The server advertising itself on the local network.
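To find the moments worth highlighting, you could save the server output to a file and tag the interesting lines afterwards. The sketch below is one way to do that. The keyword patterns are guesses based on the list above, not the actual log format of `run-local.sh`, so adjust them after looking at real output. The script name `highlight_logs.py` and the log file name are hypothetical.

```python
import re
import sys

# Hypothetical keyword groups; tune these to what run-local.sh really prints.
HIGHLIGHTS = {
    "resources": re.compile(r"memory|cpu|resource", re.IGNORECASE),
    "swarm": re.compile(r"swarm", re.IGNORECASE),
    "mps": re.compile(r"\bmps\b|metal", re.IGNORECASE),
    "discovery": re.compile(r"mdns|bonjour|advertis", re.IGNORECASE),
}


def classify(line: str) -> list[str]:
    """Return the names of all highlight groups whose pattern matches the line."""
    return [name for name, pattern in HIGHLIGHTS.items() if pattern.search(line)]


def annotate(lines):
    """Yield each line, prefixed with its tags when any group matches."""
    for line in lines:
        tags = classify(line)
        prefix = f"[{','.join(tags)}] " if tags else ""
        yield prefix + line


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python highlight_logs.py server.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        sys.stdout.writelines(annotate(log_file))
```

One way to produce the log file is `./run-local.sh 2>&1 | tee server.log`, which keeps the output visible in the terminal while you record.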
- Open a second new terminal window (keep the server running in the first one).
- Navigate to your project directory (`petals-metal`).
- Run the `demo_client.py` script:
python demo_client.py
- What to look for (and highlight in your recording):
- The client successfully connecting to the swarm/server.
- The generated text output (e.g., "Hello, how are you?" followed by the model's response).
- The client gracefully closing the connection.
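If you plan to add text overlays later, a timestamped transcript of the client run makes it easier to line annotations up with the video. The sketch below wraps any command and records how many seconds after launch each output line appeared. It only relies on the `python demo_client.py` command from this guide; the script name `timestamp_run.py` is hypothetical.

```python
import subprocess
import sys
import time
from typing import Iterator


def run_with_timestamps(cmd: list[str]) -> Iterator[tuple[float, str]]:
    """Run cmd and yield (seconds_since_start, output_line) pairs as they arrive."""
    start = time.monotonic()
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so log lines are captured too
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield time.monotonic() - start, line.rstrip("\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python timestamp_run.py python -u demo_client.py
    for elapsed, line in run_with_timestamps(sys.argv[1:]):
        print(f"[{elapsed:7.2f}s] {line}")
```

Passing `-u` to the child Python process disables its output buffering, so the timestamps reflect when each line was actually printed instead of when a buffer was flushed.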
Once you have your screen recording, consider these to enhance it:
- Speed up slow parts: Condense parts where you're just typing or waiting for processes.
- Zoom and Pan: Focus on key terminal outputs or code sections.
- Annotations/Text Overlays: Explain what's happening or highlight important messages from the scripts.
- Clear Narration or Background Music: Depending on your style.
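The "speed up" tip and the GIF export can also be done from the command line with ffmpeg instead of a video editor. The sketch below builds an ffmpeg command that speeds up a recording and converts it to a GIF. It assumes ffmpeg is installed and on your `PATH`; the default speed, frame rate, and width are arbitrary starting values, and the file names in the usage comment are placeholders.

```python
import shlex
import subprocess
import sys


def build_gif_command(src: str, dst: str, speed: float = 2.0,
                      fps: int = 12, width: int = 960) -> list[str]:
    """Build an ffmpeg command that speeds up src and writes it to dst as a GIF."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    filters = ",".join([
        f"setpts=PTS/{speed}",              # speed > 1 shortens the clip
        f"fps={fps}",                       # lower frame rate keeps the GIF small
        f"scale={width}:-1:flags=lanczos",  # resize, preserving aspect ratio
    ])
    # -y overwrites dst if it exists; -an drops the audio track.
    return ["ffmpeg", "-y", "-i", src, "-an", "-vf", filters, dst]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python make_gif.py recording.mov demo.gif
    command = build_gif_command(sys.argv[1], sys.argv[2])
    print(shlex.join(command))
    subprocess.run(command, check=True)
```

Keeping the command construction separate from the `subprocess.run` call lets you print and inspect the exact ffmpeg invocation before running it on a long recording.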
Good luck creating your distributable masterpiece!
This scenario helps demonstrate the "hidden symmetry" of your system – its ability to self-organize from the "chaos" of individual nodes appearing on the network. It showcases the zero-config discovery and auto-swarm formation features.
Objective: To show multiple server nodes discovering each other, forming a swarm, and a client connecting to this distributed intelligence.
- You'll ideally need two machines on the same local network to run two Petals server nodes simultaneously.
- If running on a single machine, you would need to ensure your `run-local.sh` script and Petals configuration can support multiple instances (e.g., by using different ports or other isolating mechanisms). This guide assumes the multi-machine setup for clarity; adapt as needed for a single-machine simulation.
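For the single-machine case, one small piece of the puzzle is picking ports that are not already in use. The sketch below asks the operating system for distinct free TCP ports. How (or whether) `run-local.sh` accepts a port setting is not covered in this guide, so treat passing these values to the script as something you would need to wire up yourself.

```python
import socket


def find_free_ports(count: int, host: str = "127.0.0.1") -> list[int]:
    """Ask the OS for `count` distinct TCP ports that are currently unused."""
    sockets = []
    try:
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind((host, 0))  # port 0 lets the OS choose a free port
            sockets.append(sock)
        # All sockets are still open here, so the ports are guaranteed distinct.
        return [sock.getsockname()[1] for sock in sockets]
    finally:
        for sock in sockets:
            sock.close()


if __name__ == "__main__":
    print(find_free_ports(2))
```

The ports are released before the function returns, so another process could in principle grab one before your server starts. For a demo on a quiet machine this race is unlikely to matter.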
- Start the First Server Node:
  - On `Machine A` (or your first terminal instance):
    # In your petals-metal directory
    ./run-local.sh
  - Observe: Note the logs indicating it has started, potentially creating a new swarm if it's the first one.
- Start the Second Server Node:
  - On `Machine B` (or your second terminal instance, configured for a separate server):
    # In your petals-metal directory
    ./run-local.sh
  - Observe (The "Symmetry" Emerges):
    - Watch the logs on both `Machine A` and `Machine B`.
    - You should see messages related to mDNS/Bonjour service discovery.
    - Look for indications that `Machine B` has discovered the swarm created by `Machine A` (or vice-versa if `Machine B` was slightly faster to fully initialize its swarm logic) and has joined it.
    - This is the "hidden symmetry": nodes automatically finding each other and organizing into a functional unit.
- Connect the Client to the Swarm:
  - On a third machine, or one of the server machines (if its resources allow and it doesn't interfere with the server), run the client:
    # In your petals-metal directory
    python demo_client.py
  - Observe:
    - The client should connect to the swarm. It doesn't need to know which specific server to talk to initially; the swarm handles distributing the work.
    - The client successfully generates text, demonstrating the collective power of the organized swarm.
- Zero-Configuration Discovery: No manual IP addresses or port configurations were needed for the servers to find each other.
- Automatic Swarm Formation: The nodes intelligently formed a cooperative group.
- Distributed Service: The client can leverage the combined resources of the swarm.
- Robustness (Conceptual): While harder to demo predictably without more sophisticated tooling, this setup is the foundation for a system that can be resilient to nodes joining or leaving. You can narrate how, if one server node were to drop (gracefully), the client or new clients could still potentially be serviced by the remaining nodes.
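When narrating the robustness point, a small client-side retry wrapper can make the idea concrete. The sketch below is a generic retry with exponential backoff. It is not how Petals or this fork handles failover internally; `fn` stands in for whatever call your client makes (for example, a text-generation request), and the broad `except Exception` is only acceptable for a demo.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T], attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on any exception with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1x, 2x, 4x, ... base_delay
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the delays can be skipped or recorded when trying the wrapper out, instead of actually waiting.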
This advanced demonstration really brings to life the "enhanced" part of your Petals fork, focusing on the intelligent local networking and self-organization capabilities.