Commit 72d4845

Metaflow dev stack (#144)
* add a local dev stack doc
* improve dev stack docs
* fix a typo
1 parent 54e60ff commit 72d4845

5 files changed, +90 -21 lines changed

docs/getting-started/devstack.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
1+
import ReactPlayer from 'react-player'
2+
3+
# Setting Up the Dev Stack
4+
5+
You can start writing and running flows just by installing Metaflow locally with
6+
`pip install metaflow`. However, its true power lies in its integration with underlying
7+
infrastructure, which allows you to
8+
9+
- [run tasks in the cloud at any scale](/scaling/remote-tasks/introduction),
10+
- [visualize and observe them in a UI](/metaflow/visualizing-results),
11+
- [deploy them in a highly available production orchestrator](/production/introduction),
12+
- and compose reactive systems with [event-triggered flows](/production/event-triggering).
13+
14+
All of these features require an infrastructure stack that needs to be configured to work
15+
with Metaflow. In production settings, this infrastructure runs in your cloud account -
16+
as [described on this page](/getting-started/infrastructure) - but you may want to test the
17+
full stack locally first.
18+
19+
Metaflow comes with a single-command script, `metaflow-dev`, which sets up a complete
20+
development stack for you locally on top of [Minikube](https://minikube.sigs.k8s.io/docs/),
21+
including a local metadata service, a database, and the [Metaflow UI](https://github.com/Netflix/metaflow-ui).
22+
The stack allows you to [test scaling with `@kubernetes`](/scaling/remote-tasks/kubernetes),
23+
[deployment on Argo Workflows](/production/scheduling-metaflow-flows/scheduling-with-argo-workflows),
24+
as well as [event-triggering](/production/event-triggering).
25+
26+
## When to use `metaflow-dev`
27+
28+
The `metaflow-dev` stack comes in handy in a few scenarios:
29+
30+
1. It allows you to **test the full functionality of Metaflow** before [deploying it in your cloud account](/getting-started/infrastructure).
31+
32+
2. You can use it **in your CI/CD workflows to test flows** in a fully isolated, ephemeral environment.
33+
34+
3. If you want to **contribute extensions for Metaflow**, or make changes to core Metaflow, the stack
35+
provides you with a complete development and testing environment.
36+
37+
## How to set up the dev stack
38+
39+
Setting up the stack is straightforward:
40+
41+
1. Install Metaflow with `pip install metaflow`.
42+
2. Ensure that [you have Docker installed](https://docs.docker.com/desktop/).
43+
3. Run `metaflow-dev up`.
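
The prerequisites in the steps above can be checked before starting the stack. The following is a minimal sketch, not part of Metaflow: the helper `missing_commands` is a hypothetical name, and it assumes that the `docker` and `metaflow-dev` executables are expected on your `PATH` once steps 1 and 2 are done.

```python
import shutil

# Executables the setup steps rely on. The names are assumptions for this
# sketch: `docker` from a Docker install, `metaflow-dev` from `pip install metaflow`.
REQUIRED_COMMANDS = ("docker", "metaflow-dev")


def missing_commands(required=REQUIRED_COMMANDS, which=shutil.which):
    """Return the required commands that cannot be found on PATH."""
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    for name in missing:
        print(f"missing prerequisite: {name}")
    if not missing:
        print("prerequisites found; next step: metaflow-dev up")
```

The check only confirms that the executables exist. It does not verify that the Docker daemon is running.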
44+
45+
The `metaflow-dev` command downloads and installs Minikube. After this, it uses [Tilt](https://tilt.dev/) to deploy
46+
and expose [all components required by Metaflow](/internals/technical-overview) inside Minikube.
47+
48+
After the deployment completes, keep the shell running `metaflow-dev up` open, as it maintains the port
49+
forwards the stack needs. In a separate shell, run
50+
`metaflow-dev shell`. This opens a session with a Metaflow configuration that points at the local stack.
51+
You can now use the shell to develop, run, and deploy Metaflow flows!
52+
53+
You can navigate to the Tilt UI, linked in the console output, to find links to the Metaflow and Argo Workflows UIs.
54+
You can find direct links to the UI in the Metaflow output as well.
55+
56+
### The dev stack in action
57+
58+
Watch this short video (no sound) for a quick setup-to-usage walkthrough:
59+
60+
<ReactPlayer controls url="https://www.youtube.com/watch?v=nPtqj72hfKU" />
61+
<br/>
62+
63+
The video covers:
64+
65+
- Setting up the dev stack
66+
- Observing the stack through the Tilt UI
67+
- Using the stack to execute and monitor runs
68+
- Running at scale with `@kubernetes`
69+
- Inspecting results in a notebook, accessing metadata
70+
- Deploying to Argo Workflows
71+
- Tearing down the stack
72+
73+
## Using the dev stack in a CI/CD pipeline
74+
75+
The dev stack is lightweight enough to run in small CI/CD worker nodes, including those provided by GitHub Actions. You
76+
can use the stack to run integration tests for flows in a fully isolated, ephemeral environment.
77+
78+
Take a look at [this example repository](https://github.com/outerbounds/gha-metaflow/) and
79+
[a GitHub Actions config](https://github.com/outerbounds/gha-metaflow/blob/main/.github/workflows/metaflow.yml) for
80+
a template that you can easily apply in your own setup.
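
An integration test step in such a pipeline could be sketched as follows. This is not taken from the example repository: the helper names `run_command` and `run_flows` are hypothetical, and the script assumes it is executed with a Metaflow configuration that points at the dev stack (for example, from a `metaflow-dev shell` session). It runs each flow with `--with kubernetes` and reports the ones that fail.

```python
import subprocess
import sys


def run_command(flow_file):
    """Build the command that runs one flow with every step on Kubernetes."""
    return [sys.executable, flow_file, "run", "--with", "kubernetes"]


def run_flows(flow_files, runner=subprocess.run):
    """Run each flow in turn and return the files whose run exited non-zero."""
    failed = []
    for flow_file in flow_files:
        result = runner(run_command(flow_file))
        if result.returncode != 0:
            failed.append(flow_file)
    return failed


if __name__ == "__main__":
    # Flow files are passed as command-line arguments.
    failures = run_flows(sys.argv[1:])
    for flow_file in failures:
        print(f"flow failed: {flow_file}")
    if failures:
        # A non-zero exit code marks the CI job as failed.
        sys.exit(1)
```

Assuming the script is saved as `run_flow_tests.py`, a CI job could call it as `python run_flow_tests.py flows/first.py flows/second.py`, where the flow paths are placeholders.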
81+

docs/getting-started/infrastructure.md

Lines changed: 4 additions & 19 deletions
@@ -1,13 +1,10 @@
11

22
# Deploying Infrastructure for Metaflow
33

4-
While you can [get started with Metaflow easily](/getting-started/install) on your
5-
laptop, the main benefits of Metaflow lie in its ability to [scale out to external
6-
compute clusters](/scaling/introduction) and to [deploy to production-grade workflow
7-
orchestrators](/production/introduction). To benefit from these features, you need to
8-
configure Metaflow and the infrastructure behind it appropriately. A separate guide,
9-
[Metaflow Resources for Engineers](https://docs.outerbounds.com/engineering/welcome/) covers
10-
everything related to such deployments. This page provides a quick overview.
4+
Use [the local dev stack](/getting-started/devstack) to explore how Metaflow integrates
5+
with underlying infrastructure. When you are ready for a production deployment, you will need
6+
to set up infrastructure in your own cloud account, as detailed on this page. For further
7+
information, see [Metaflow Resources for Engineers](https://docs.outerbounds.com/engineering/welcome/).
118

129
## Supported infrastructure components
1310

@@ -16,13 +13,6 @@ Since modern data science / ML applications are powered by a number of interconn
1613
illustrated below ([Why? See here](/introduction/why-metaflow)). You can see logos of
1714
all supported systems which you can use to enable each layer.
1815

19-
Consider this illustration as a menu that allows you to build your own pizza: You get to
20-
customize your own crust, sauce, toppings, and cheese. You can make the choices based on
21-
your existing business infrastructure and the requirements and preferences of your
22-
organization. Fortunately, Metaflow provides a consistent API for all these
23-
combinations, so you can even change the choices later without having to rewrite your
24-
flows.
25-
2616
<object style={{width: 700}} type="image/svg+xml"
2717
data="/assets/infra-stack.svg"></object>
2818

@@ -193,8 +183,3 @@ This stack incurs a typical maintenance overhead of an GKE-based Kubernetes clus
193183
which shouldn't add much burden if your organization uses GKE already.
194184

195185

196-
---
197-
198-
If you are unsure about the stacks, just run `pip install metaflow` to install the local
199-
stack and move on to [the tutorials](/getting-started/tutorials). Flows you create will
200-
work without changes on any of these stacks.

docs/getting-started/install.md

Lines changed: 3 additions & 2 deletions
@@ -16,8 +16,9 @@ Sandbox](https://docs.outerbounds.com/sandbox/).
1616

1717
:::
1818

19-
20-
Now you are ready to get your hands dirty with the [Tutorials](tutorials/).
19+
Now you are ready to get your hands dirty with the [Tutorials](tutorials/). Or, if you want
20+
to take a step further and test the full power of Metaflow, you can [easily set up a
21+
Minikube-based dev stack](/getting-started/devstack) locally.
2122

2223
## Upgrading Metaflow
2324

docs/index.md

Lines changed: 1 addition & 0 deletions
@@ -21,6 +21,7 @@ Metaflow makes it easy to build and manage real-life data science, AI, and ML pr
2121
## Getting Started
2222

2323
- [Installing Metaflow locally](getting-started/install)
24+
- [Setting Up the Dev Stack](getting-started/devstack) *New*
2425
- [Deploying Infrastructure for Metaflow](getting-started/infrastructure)
2526
- [Quickstart Tutorial](getting-started/tutorials/)
2627

sidebars.js

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@ const sidebars = {
3030
label: "Getting Started",
3131
items: [
3232
"getting-started/install",
33+
"getting-started/devstack",
3334
"getting-started/infrastructure",
3435
{
3536
type: "category",
