Commit ee27863

test course gen and structure
Signed-off-by: Max Pumperla <max.pumperla@googlemail.com>
1 parent 10aafe5 commit ee27863

File tree

52 files changed: +89891 −0 lines changed


courses/workloads/PyTorch_Lightning/00_workload/thumbnail.png renamed to courses/workloads/PyTorch_Lightning/thumbnail.png

File renamed without changes.

courses/workloads/Ray_Data_Batch_Inference/00_workload/00_lesson/lesson.html

Lines changed: 7538 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Batch Inference with Ray Data\n",
    "© 2025, Anyscale. All Rights Reserved\n",
    "\n",
    "💻 **Launch Locally**: You can run this notebook locally.\n",
    "\n",
    "🚀 **Launch on Cloud**: Consider running this notebook on a Ray cluster (click [here](http://console.anyscale.com/register) to start one on Anyscale).\n",
    "\n",
    "This example shows how to do batch inference with Ray Data.\n",
    "\n",
    "Batch inference with Ray Data lets you efficiently generate predictions from machine learning models on large datasets by processing many data points at once. Instead of running inference one row at a time, which is slow and resource-inefficient, batch inference uses vectorized computation and parallelism to maximize throughput. This is especially useful with modern deep learning models, which are optimized for batch processing on CPUs, GPUs, or Apple Silicon devices.\n",
    "\n",
    "The typical workflow begins by loading your dataset, such as a public dataset from Hugging Face, into a Ray Dataset. Ray Data can automatically partition the data for parallel processing, or you can repartition it explicitly to control the number of data blocks. Once the data is loaded, you define a callable class (such as a text embedding model) that loads the machine learning model in its constructor and implements a `__call__` method to process each batch. Ray Data's `map_batches` API then applies this callable to each batch, with options to control concurrency and resource allocation (e.g., the number of GPUs).\n",
    "\n",
    "This approach lets you spin up multiple concurrent model instances, each processing different batches of data in parallel, which yields a significant speedup on large datasets. After inference, you can materialize the results, inspect the output, and shut down the Ray cluster to free up resources. Batch inference with Ray Data is scalable, flexible, and integrates seamlessly with modern ML workflows, making it a powerful tool for production and research environments alike."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Outline\n",
    "\n",
    "**In this notebook, we go through a typical ML batch inference workflow:**\n",
    "\n",
    "- Architecture\n",
    "- Import libraries\n",
    "- Load a public dataset from Hugging Face and move it into the Ray object store\n",
    "- Batch inference class\n",
    "  - Create a Ray actor class to load an ML model. In this example, we use the SentenceTransformer library from Hugging Face to load a sentence embedding model.\n",
    "- Create batches of data to run inference on\n",
    "- Deploying at scale\n",
    "- Inference on the entire dataset\n",
    "- Out-of-memory errors\n",
    "- Summary"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
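
The lesson text above describes the callable-class-plus-`map_batches` pattern without showing it. Here is a minimal sketch of that pattern; the model name, dataset, column name, batch size, and concurrency are illustrative assumptions, not taken from this commit.

import ray
from datasets import load_dataset
from sentence_transformers import SentenceTransformer


class TextEmbedder:
    """Callable class: loads the model once per replica, then embeds whole batches."""

    def __init__(self):
        # Model choice is an assumption for illustration.
        self.model = SentenceTransformer("all-MiniLM-L6-v2")

    def __call__(self, batch):
        # `batch` arrives as a dict of NumPy arrays; "text" is an assumed column name.
        batch["embedding"] = self.model.encode(list(batch["text"]))
        return batch


# Load a public Hugging Face dataset into a Ray Dataset.
ds = ray.data.from_huggingface(load_dataset("ag_news", split="train[:1%]"))

# Apply the callable to each batch; `concurrency` controls how many model
# replicas run in parallel (pass num_gpus=1 to pin each replica to a GPU).
embedded = ds.map_batches(TextEmbedder, batch_size=64, concurrency=2)

print(embedded.take(1))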
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
1+
---
2+
theme: seriph
3+
background: /slides_background.png
4+
class: text-center
5+
drawings:
6+
persist:
7+
# slide transition: https://sli.dev/guide/animations.html#slide-transitions
8+
# transition: fade
9+
# enable MDC Syntax: https://sli.dev/features/mdc
10+
mdc: true
11+
# duration of the presentation
12+
duration: 15min
13+
addons:
14+
- fancy-arrow
15+
- slidev-addon-tldraw
16+
- slidev-component-spotlight
17+
- slidev-component-poll
18+
- slidev-addon-typst
19+
---
20+
21+
22+
# Batch Inference with Ray Data
23+
24+
---
25+
26+
# Batch Inference with Ray Data
27+
© 2025, Anyscale. All Rights Reserved
28+
29+
💻 **Launch Locally**: You can run this notebook locally.
30+
31+
🚀 **Launch on Cloud**: Think about running this notebook on a Ray Cluster (Click [here](http://console.anyscale.com/register) to easily start a Ray cluster on Anyscale)
32+
33+
This example shows how to do batch inference with Ray Data.
34+
35+
Batch inference with Ray Data enables you to efficiently generate predictions from machine learning models on large datasets by processing multiple data points at once. Instead of running inference on one row at a time, which can be slow and resource-inefficient, batch inference leverages vectorized computation and parallelism to maximize throughput. This is especially useful when working with modern deep learning models, which are optimized for batch processing on CPUs, GPUs, or Apple Silicon devices.
36+
37+
The typical workflow begins by loading your dataset—such as a public dataset from Hugging Face—into a Ray Dataset. Ray Data can automatically partition the data for parallel processing, or you can repartition it explicitly to control the number of data blocks. Once the data is loaded, you define a callable class (such as a text embedding model) that loads the machine learning model in its constructor and implements a `__call__` method to process each batch. Ray Data’s `map_batches` API is then used to apply this callable to each batch of data, with options to control concurrency and resource allocation (e.g., number of GPUs).
38+
39+
This approach allows you to spin up multiple concurrent model instances, each processing different batches of data in parallel. The result is a significant speedup in inference time, especially for large datasets. After inference, you can materialize the results, inspect the output, and shut down the Ray cluster to free up resources. Batch inference with Ray Data is scalable, flexible, and integrates seamlessly with modern ML workflows, making it a powerful tool for production and research environments alike.
40+
41+
---
42+
43+
### Outline
44+
45+
<b>In this notebook, we go through a typical ML batch inference workflow:</b>
46+
47+
48+
Architecture
49+
Import Libraries
50+
Load a public dataset from Hugging Face and move it into Ray Data object store.
51+
Batch Inference Class
52+
- Create a Ray actor class to load a ML model. In this example, we use SentenceTransformer library from Hugging Face to load a sentence embedding model.
53+
Create batches of data to do inference.
54+
Deploying at Scale
55+
Inference on the entire dataset
56+
Out of memory errors
57+
Summary
58+
</ul>
59+
</div>
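
The outline's "Deploying at scale" and "Out-of-memory errors" items come down to a few knobs on the same API. A second hypothetical sketch follows, assuming a Parquet dataset on S3 and the same illustrative embedding model; the path, block count, batch size, and GPU settings are assumptions, not taken from this commit.

import ray
from sentence_transformers import SentenceTransformer


class TextEmbedder:
    def __init__(self):
        # Same illustrative model as the notebook sketch above.
        self.model = SentenceTransformer("all-MiniLM-L6-v2")

    def __call__(self, batch):
        # "text" is an assumed column name.
        batch["embedding"] = self.model.encode(list(batch["text"]))
        return batch


# Hypothetical input location; any Ray Data source works here.
ds = ray.data.read_parquet("s3://my-bucket/reviews/")

# Repartition explicitly to control the number of blocks, and hence the
# degree of parallelism across the cluster.
ds = ds.repartition(200)

embedded = ds.map_batches(
    TextEmbedder,
    batch_size=16,   # smaller batches lower peak memory if replicas hit OOM
    concurrency=4,   # four concurrent model replicas
    num_gpus=1,      # dedicate one GPU to each replica
)

embedded.materialize()  # execute the pipeline and pin results in the object store
ray.shutdown()          # free cluster resources when done

Shrinking `batch_size` is usually the first fix for out-of-memory errors, since peak memory per replica scales with the number of rows it holds at once.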
