
Ray Production Applications

This repository contains production-grade applications built using Ray, a distributed computing framework for Python.

Project Structure

  • text_ml.py: Main implementation file containing the Ray Serve text-processing/ML functionality (see the sketch below)
  • text_ml_client.py: Client code that interacts with the deployed Ray application
  • serve_config.yaml: Configuration file for the Ray Serve deployment
  • __init__.py: Package initialization file. This is required; without it, deployment keeps failing with ModuleNotFoundError.
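The sketch below illustrates the general shape of a Ray Serve text app like text_ml.py; it is not the repo's exact code, and the class name, model, and payload key are placeholders.

```python
# Minimal sketch of the pattern a file like text_ml.py follows.
# The class/model/payload names here are hypothetical, not the repo's code.
from starlette.requests import Request
from ray import serve


@serve.deployment(num_replicas=1, ray_actor_options={"num_cpus": 0.5})
class TextProcessor:
    def __init__(self):
        # Load the model once per replica, not once per request.
        self.model = lambda text: text.upper()  # stand-in for a real ML model

    async def __call__(self, request: Request) -> str:
        payload = await request.json()
        return self.model(payload["text"])


# `serve build text_ml:app -o serve_config.yaml` looks for a
# module-level `app` binding like this one.
app = TextProcessor.bind()
```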

Prerequisites

  • Python 3.9+
  • Ray (latest version)

Setup Instructions

  1. Clone the repository
  2. Create and activate the conda environment:
conda create -n ray_examples_env python=3.9
conda activate ray_examples_env
  3. Install dependencies:
pip install -r requirements.txt
If you are not using requirements.txt, install Ray directly: pip install ray

Deployment

CPU Deployment

To deploy the application using Ray Serve:

  1. Build the deployment configuration:
serve build text_ml:app -o serve_config.yaml
  2. Start the Ray cluster:
ray start --head
  3. Deploy the application:
serve deploy serve_config.yaml
  4. Check deployment status:
serve status
  5. Run the client (a minimal example is sketched below):
python text_ml_client.py
Note: serve status should report the application as RUNNING, with each deployment HEALTHY, before the client will return output.
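Ray Serve listens on http://localhost:8000 by default. A minimal client along the lines of text_ml_client.py might look like the following; the route ("/") and payload key ("text") are assumptions, so check text_ml_client.py for the real request contract.

```python
# Hypothetical client sketch; route and payload shape are assumptions.
import requests

response = requests.post(
    "http://localhost:8000/",  # Ray Serve's default HTTP address
    json={"text": "Ray makes distributed Python simple."},
)
response.raise_for_status()
print(response.text)
```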

CPU Autoscaling Deployment

  1. Build the deployment configuration with CPU autoscaling (see the config sketch after these steps):
serve build text_ml:app -o serve_config_cpu_autoscalling.yaml
  2. Start the Ray cluster:
ray start --head
  3. Deploy with CPU autoscaling:
serve deploy serve_config_cpu_autoscalling.yaml
  4. Check deployment status:
serve status
  5. Monitor the deployment in the Ray Dashboard at http://localhost:8265 (started automatically by ray start --head)
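Autoscaling is configured per deployment. In the Python API it looks like the sketch below, and serve build writes the equivalent fields into the YAML config. The numbers are illustrative, not the values in serve_config_cpu_autoscalling.yaml; also note the key is target_ongoing_requests on recent Ray versions, while older releases call it target_num_ongoing_requests_per_replica.

```python
# Illustrative autoscaling settings -- numbers are not taken from the repo.
from ray import serve


@serve.deployment(
    autoscaling_config={
        "min_replicas": 1,  # keep at least one replica warm
        "max_replicas": 4,  # cap total CPU consumption under load
        # Scale up once replicas average more than this many in-flight requests.
        "target_ongoing_requests": 2,
    },
    ray_actor_options={"num_cpus": 0.5},
)
class TextProcessor:
    async def __call__(self, request) -> str:
        return "ok"


app = TextProcessor.bind()
```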

Authentication Configuration

To deploy and use the application with authentication:

  1. Deploy the auth-enabled configuration:
serve deploy serve_config_auth_cpu_autoscalling.yaml
  2. Check deployment status:
serve status
  3. Run the auth-enabled client (sketched below):
python text_ml_auth_client.py
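The auth client must attach its key on every call. A minimal sketch follows; the x-api-key header name and the key value are placeholders, so check text_ml_auth_client.py for the real contract.

```python
# Hypothetical auth client sketch; header name and key are placeholders.
import requests

API_KEY = "your-api-key-here"  # must match a key listed in text_ml_auth.py

response = requests.post(
    "http://localhost:8000/",
    headers={"x-api-key": API_KEY},
    json={"text": "authenticated request"},
)
print(response.status_code, response.text)
```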

The auth-enabled deployment uses the following configuration:

  1. API Key Validation:

    • API keys are validated against a predefined list in text_ml_auth.py
    • Only valid API keys can access the service
  2. Authentication Flow:

    • Clients must include their API key in the request headers
    • The API key is validated before the request is processed (see the sketch after this list)
    • Rate limits are enforced per API key
    • TODO: Add an in-memory database, general rate limiting, and per-API-key rate limiting
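The sketch below shows one way to implement header-based key validation inside a Serve deployment. The key set and the x-api-key header name are illustrative; the real key list lives in text_ml_auth.py.

```python
# Sketch of API-key validation inside a Serve deployment (illustrative).
from ray import serve
from starlette.requests import Request
from starlette.responses import JSONResponse

VALID_API_KEYS = {"key-abc", "key-def"}  # placeholders for text_ml_auth.py's list


@serve.deployment
class AuthedTextProcessor:
    async def __call__(self, request: Request):
        api_key = request.headers.get("x-api-key")
        if api_key not in VALID_API_KEYS:
            # Reject before doing any model work.
            return JSONResponse({"error": "invalid API key"}, status_code=401)
        payload = await request.json()
        return JSONResponse({"result": payload["text"].upper()})


app = AuthedTextProcessor.bind()
```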

Load Testing

To test the autoscaling behavior:

  1. Install k6:
brew install k6
  2. Run the load test:
k6 run loadtest.js
  3. Monitor the autoscaling behavior:
    • Use serve status to check replica counts
    • View the Ray Dashboard at http://localhost:8265
    • Check the number of replicas scaling up and down based on load

The load test will help verify that the autoscaling configuration is working correctly, with replicas being added and removed based on the incoming request load.
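If k6 is not available, a crude pure-Python load generator can exercise the autoscaler too. This is a substitute for loadtest.js, not a port of it, and the endpoint and payload are the same assumptions as in the client sketch above.

```python
# Crude Python load generator -- a stand-in for loadtest.js, useful for
# watching replica counts change in `serve status`.
from concurrent.futures import ThreadPoolExecutor

import requests


def hit(_):
    # Each call is one request against the locally served app.
    r = requests.post("http://localhost:8000/", json={"text": "load"})
    return r.status_code


with ThreadPoolExecutor(max_workers=32) as pool:
    # 2,000 requests from 32 concurrent workers; tune both to your machine.
    codes = list(pool.map(hit, range(2000)))

print({code: codes.count(code) for code in set(codes)})
```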

Development Commands

Debugging

The ray debug CLI does not take a script argument. Instead, place a breakpoint() inside a Ray task or actor, run the script normally, then attach from a second terminal:

ray debug
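A minimal sketch of that workflow, independent of text_ml.py, is below. On recent Ray releases the legacy CLI debugger may need to be enabled explicitly (e.g. via RAY_DEBUG=legacy where the newer debugger is the default); check the Ray debugging docs for your version.

```python
# Drop a breakpoint() inside a remote function, run this script, then
# attach from another terminal with `ray debug`.
import ray

ray.init()


@ray.remote
def process(text: str) -> str:
    breakpoint()  # execution pauses here until a debugger attaches
    return text.upper()


print(ray.get(process.remote("hello ray")))
```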

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

Acknowledgments

  • Ray Core Team
  • Ray Serve Team
  • Ray ML Team
