Commit 9dfe18d

Author: Adi
Merge branch 'development' of github.com:v3io/tutorials

# Conflicts:
#   demos/image-classification/README.md
#   getting-started/dask-cluster.ipynb

2 parents 38ae365 + 95932f5

14 files changed: +3530 −2120 lines

README.md

Lines changed: 7 additions & 6 deletions

```diff
@@ -8,11 +8,11 @@
 - [Deploying Models to Production](#deploying-models-to-production)
 - [Visualization, Monitoring, and Logging](#visualization-monitoring-and-logging)
 - [End-to-End Use-Case Applications](#end-to-end-use-case-applications)
-- [Smart Stock Trading](demos/stocks/01-gen-demo-data.ipynb)
+- [Image Classification](demos/image-classification/01-image-classification.ipynb)
 - [Predictive Infrastructure Monitoring](demos/netops/01-generator.ipynb)
-- [Image Recognition](demos/image-classification/keras-cnn-dog-or-cat-classification.ipynb)
 - [Natural Language Processing (NLP)](demos/nlp/nlp-example.ipynb)
 - [Stream Enrichment](demos/stream-enrich/stream-enrich.ipynb)
+- [Smart Stock Trading](demos/stocks/01-gen-demo-data.ipynb)
 - [Jupyter Notebook Basics](#jupyter-notebook-basics)
 - [Creating Virtual Environments in Jupyter Notebook](#creating-virtual-environments-in-jupyter-notebook)
 - [Updating the Tutorial Notebooks](#update-notebooks)
@@ -28,11 +28,12 @@ The Iguazio Data Science Platform (**"the platform"**) is a fully integrated and
 The platform incorporates the following components:
 
 - A data science workbench that includes Jupyter Notebook, integrated analytics engines, and Python packages
-- Real-time dashboards based on Grafana
+- Model management with experiment tracking and automated pipeline capabilities
 - Managed data and machine-learning (ML) services over a scalable Kubernetes cluster
 - A real-time serverless functions framework — Nuclio
 - An extremely fast and secure data layer that supports SQL, NoSQL, time-series databases, files (simple objects), and streaming
 - Integration with third-party data sources such as Amazon S3, HDFS, SQL databases, and streaming or messaging protocols
+- Real-time dashboards based on Grafana
 
 <br><img src="assets/images/igz-self-service-platform.png" alt="Self-service data science platform" width="650"/><br>
 
@@ -115,7 +116,7 @@ When your model is ready, you can train it in Jupyter Notebook or by using scala
 You can find model-training examples in the platform's tutorial Jupyter notebooks:
 
 - The [NetOps demo](demos/netops/03-training.ipynb) tutorial demonstrates predictive infrastructure monitoring using scikit-learn.
-- The [image-classification demo](demos/image-classification/infer.ipynb) tutorial demonstrates image recognition using TensorFlow and Keras.
+- The [image-classification demo](demos/image-classification/01-image-classification.ipynb) tutorial demonstrates image recognition using TensorFlow and Horovod with MLRun.
 
 If you're a beginner, you might find the following ML guide useful &mdash; [Machine Learning Algorithms In Layman's Terms](https://towardsdatascience.com/machine-learning-algorithms-in-laymans-terms-part-1-d0368d769a7b).
 
@@ -165,11 +166,11 @@ For information on how to create Grafana dashboards to monitor and visualize dat
 Iguazio provides full end-to-end use-case applications that demonstrate how to use the Iguazio Data Science Platform and related tools to address data science requirements for different industries and implementations.
 The applications are provided in the **demos** directory of the platform's tutorial Jupyter notebooks and cover the following use cases; for more detailed descriptions, see the demos README ([notebook](demos/README.ipynb) / [Markdown](demos/README.md)):
 
-- <a id="stocks-use-case-app"></a>**Smart stock trading** ([**stocks**](demos/stocks/read-stocks.ipynb)) &mdash; the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualizing the data on a Grafana dashboard.
+- <a id="image-recog-use-case-app"></a>**Image recognition** ([**image-classification**](demos/image-classification/01-image-classification.ipynb)) &mdash; the application builds and trains an ML model that identifies (recognizes) and classifies images by using Keras, TensorFlow, and scikit-learn.
 - <a id="netops-use-case-app"></a>**Predictive infrastructure monitoring** ([**netops**](demos/netops/01-generator.ipynb)) &mdash; the application builds, trains, and deploys a machine-learning model for analyzing and predicting failure in network devices as part of a network operations (NetOps) flow. The goal is to identify anomalies for device metrics &mdash; such as CPU, memory consumption, or temperature &mdash; which can signify an upcoming issue or failure.
-- <a id="image-recog-use-case-app"></a>**Image recognition** ([**image-classification**](demos/image-classification/keras-cnn-dog-or-cat-classification.ipynb)) &mdash; the application builds and trains an ML model that identifies (recognizes) and classifies images by using Keras, TensorFlow, and scikit-learn.
 - <a id="nlp-use-case-app"></a>**Natural language processing (NLP)** ([**nlp**](demos/nlp/nlp-example.ipynb)) &mdash; the application processes natural-language textual data &mdash; including spelling correction and sentiment analysis &mdash; and generates a Nuclio serverless function that translates any given text string to another (configurable) language.
 - <a id="stream-enrich-use-case-app"></a>**Stream enrichment** ([**stream-enrich**](demos/stream-enrich/stream-enrich.ipynb)) &mdash; the application demonstrates a typical stream-based data-engineering pipeline, which is required in many real-world scenarios: data is streamed from an event streaming engine; the data is enriched, in real time, using data from a NoSQL table; the enriched data is saved to an output data stream and then consumed from this stream.
+- <a id="stocks-use-case-app"></a>**Smart stock trading** ([**stocks**](demos/stocks/read-stocks.ipynb)) &mdash; the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualizing the data on a Grafana dashboard.
 
 <a id="jupyter-notebook-basics"></a>
 ## Jupyter Notebook Basics
```

demos/README.ipynb

Lines changed: 29 additions & 24 deletions

```diff
@@ -14,11 +14,11 @@
     "**In This Document**\n",
     "\n",
     "- [Overview](#overview)\n",
-    "- [Stock Trading](#stocks-demo)\n",
+    "- [Image Classification](#image-classification-demo)\n",
     "- [Predictive Infrastructure Monitoring](#netops-demo)\n",
-    "- [Image Recognition](#image-classification-demo)\n",
     "- [Natural Language Processing (NLP)](#nlp-demo)\n",
-    "- [Stream Enrichment](#stream-enrich-demo)"
+    "- [Stream Enrichment](#stream-enrich-demo)\n",
+    "- [Stock Trading](#stocks-demo)"
    ]
   },
   {
@@ -35,16 +35,20 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "<a id=\"stocks-demo\"></a>\n",
-    "## Smart Stock Trading\n",
+    "<a id=\"image-classification-demo\"></a>\n",
+    "## Image Classification\n",
     "\n",
-    "The [**stocks**](stocks/01-gen-demo-data.ipynb) demo demonstrates a smart stock-trading application: \n",
-    "the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualizing the data on a Grafana dashboard.\n",
+    "The [**image-classification**](image-classification/01-image-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images.\n",
     "\n",
-    "- The stock data is read from Twitter by using the [TwythonStreamer](https://twython.readthedocs.io/en/latest/usage/streaming_api.html) Python wrapper to the Twitter Streaming API, and saved to TSDB and NoSQL tables in the platform.\n",
-    "- Sentiment analysis is done by using the [TextBlob](https://textblob.readthedocs.io/) Python library for natural language processing (NLP).\n",
-    "- The analyzed data is visualized as graphs on a [Grafana](https://grafana.com/grafana) dashboard, which is created from the Jupyter notebook code.\n",
-    "  The data is read from both the TSDB and NoSQL stock tables."
+    "The example uses TensorFlow, Horovod, and Nuclio to demonstrate an end-to-end solution for image classification.\n",
+    "It consists of four MLRun and Nuclio functions:\n",
+    "\n",
+    "1. Import an image archive from S3 to the cluster file system.\n",
+    "2. Tag the images based on their name structure.\n",
+    "3. Run distributed training using TensorFlow, Keras, and Horovod.\n",
+    "4. Automate the deployment of a Nuclio model-serving function (from a [notebook](nuclio-serving-tf-images.ipynb) and from a [Dockerfile](./inference-docker)).\n",
+    "\n",
+    "The example also demonstrates an [automated pipeline](mlrun_mpijob_pipe.ipynb) using MLRun and Kubeflow Pipelines."
    ]
   },
   {
@@ -67,28 +71,29 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "<a id=\"image-classification-demo\"></a>\n",
-    "## Image Recognition\n",
+    "<a id=\"nlp-demo\"></a>\n",
+    "## Natural Language Processing (NLP)\n",
     "\n",
-    "The [**image-classification**](image-classification/keras-cnn-dog-or-cat-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images.\n",
+    "The [**nlp**](nlp/nlp-example.ipynb) demo demonstrates natural language processing (NLP): the application processes natural-language textual data &mdash; including spelling correction and sentiment analysis &mdash; and generates a Nuclio serverless function that translates any given text string to another (configurable) language.\n",
     "\n",
-    "- The data is collected by downloading images of dogs and cats from the Iguazio sample data-set AWS bucket.\n",
-    "- The training data for the ML model is prepared by using [pandas](https://pandas.pydata.org/) DataFrames to build a predecition map.\n",
-    "  The data is visualized by using the [Matplotlib](https://matplotlib.org/) Python library.\n",
-    "- An image recognition and classification ML model that identifies the animal type is built and trained by using [Keras](https://keras.io/), [TensorFlow](https://www.tensorflow.org/), and [scikit-learn](https://scikit-learn.org) (a.k.a. sklearn)."
+    "- The textual data is collected and processed by using the [TextBlob](https://textblob.readthedocs.io/) Python NLP library. The processing includes spelling correction and sentiment analysis.\n",
+    "- A serverless function that translates text to another language, which is configured in an environment variable, is generated by using the [Nuclio](https://nuclio.io/) framework."
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "<a id=\"nlp-demo\"></a>\n",
-    "## Natural Language Processing (NLP)\n",
+    "<a id=\"stocks-demo\"></a>\n",
+    "## Smart Stock Trading\n",
     "\n",
-    "The [**nlp**](nlp/nlp-example.ipynb) demo demonstrates natural language processing (NLP): the application processes natural-language textual data &mdash; including spelling correction and sentiment analysis &mdash; and generates a Nuclio serverless function that translates any given text string to another (configurable) language.\n",
+    "The [**stocks**](stocks/01-gen-demo-data.ipynb) demo demonstrates a smart stock-trading application: \n",
+    "the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualizing the data on a Grafana dashboard.\n",
     "\n",
-    "- The textual data is collected and processed by using the [TextBlob](https://textblob.readthedocs.io/) Python NLP library. The processing includes spelling correction and sentiment analysis.\n",
-    "- A serverless function that translates text to another language, which is configured in an environment variable, is generated by using the [Nuclio](https://nuclio.io/) framework."
+    "- The stock data is read from Twitter by using the [TwythonStreamer](https://twython.readthedocs.io/en/latest/usage/streaming_api.html) Python wrapper to the Twitter Streaming API, and saved to TSDB and NoSQL tables in the platform.\n",
+    "- Sentiment analysis is done by using the [TextBlob](https://textblob.readthedocs.io/) Python library for natural language processing (NLP).\n",
+    "- The analyzed data is visualized as graphs on a [Grafana](https://grafana.com/grafana) dashboard, which is created from the Jupyter notebook code.\n",
+    "  The data is read from both the TSDB and NoSQL stock tables."
    ]
   },
   {
@@ -128,5 +133,5 @@
   }
  },
  "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
}
```
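Step 2 of the image-classification flow described in this diff tags images based on their name structure. The notebook code itself is not shown in this commit, so the following is a hypothetical sketch that assumes a `<label>.<id>.jpg` naming convention (common in dogs-vs-cats style data sets); the function name is illustrative:

```python
# Hypothetical sketch of tagging images by file-name structure
# (assumes names like "cat.123.jpg"; not the actual notebook code).
from pathlib import Path

def tag_by_name(filenames):
    """Map each image file name to a label taken from its first dot-separated field."""
    labels = {}
    for name in filenames:
        label = Path(name).name.split(".")[0]  # "cat.123.jpg" -> "cat"
        labels[name] = label
    return labels

files = ["cat.0.jpg", "dog.1.jpg", "cat.2.jpg"]
print(tag_by_name(files))
```

Deriving labels from file names this way avoids maintaining a separate annotation file for the training set.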

demos/README.md

Lines changed: 25 additions & 21 deletions

```diff
@@ -1,30 +1,33 @@
-
 # End-to-End Platform Use-Case Application Demos
 
 **In This Document**
 
 - [Overview](#overview)
-- [Stock Trading](#stocks-demo)
+- [Image Classification](#image-classification-demo)
 - [Predictive Infrastructure Monitoring](#netops-demo)
-- [Image Recognition](#image-classification-demo)
 - [Natural Language Processing (NLP)](#nlp-demo)
 - [Stream Enrichment](#stream-enrich-demo)
+- [Stock Trading](#stocks-demo)
 
 <a id="overview"></a>
 ## Overview
 
 The **demos** tutorials directory contains full end-to-end use-case applications that demonstrate how to use the Iguazio Data Science Platform ("the platform") and related tools to address data science requirements for different industries and implementations.
 
-<a id="stocks-demo"></a>
-## Smart Stock Trading
+<a id="image-classification-demo"></a>
+## Image Classification
 
-The [**stocks**](stocks/read-stocks.ipynb) demo demonstrates a smart stock-trading application:
-the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualizing the data on a Grafana dashboard.
+The [**image-classification**](image-classification/01-image-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images.
 
-- The stock data is read from Twitter by using the [TwythonStreamer](https://twython.readthedocs.io/en/latest/usage/streaming_api.html) Python wrapper to the Twitter Streaming API, and saved to TSDB and NoSQL tables in the platform.
-- Sentiment analysis is done by using the [TextBlob](https://textblob.readthedocs.io/) Python library for natural language processing (NLP).
-- The analyzed data is visualized as graphs on a [Grafana](https://grafana.com/grafana) dashboard, which is created from the Jupyter notebook code.
-  The data is read from both the TSDB and NoSQL stock tables.
+The example uses TensorFlow, Horovod, and Nuclio to demonstrate an end-to-end solution for image classification.
+It consists of four MLRun and Nuclio functions:
+
+1. Import an image archive from S3 to the cluster file system.
+2. Tag the images based on their name structure.
+3. Run distributed training using TensorFlow, Keras, and Horovod.
+4. Automate the deployment of a Nuclio model-serving function (from a [notebook](nuclio-serving-tf-images.ipynb) and from a [Dockerfile](./inference-docker)).
+
+The example also demonstrates an [automated pipeline](mlrun_mpijob_pipe.ipynb) using MLRun and Kubeflow Pipelines.
 
 <a id="netops-demo"></a>
 ## Predictive Infrastructure Monitoring
@@ -37,16 +40,6 @@ The goal is to identify anomalies for device metrics &mdash; such as CPU, memory
 - The data is generated by using an open-source generator tool that was written by Iguazio.
   This generator enables users to customize the metrics, data range, and many other parameters, and prepare a data set that's suitable for other similar use cases.
 
-<a id="image-classification-demo"></a>
-## Image Recognition
-
-The [**image-classification**](image-classification/keras-cnn-dog-or-cat-classification.ipynb) demo demonstrates image recognition: the application builds and trains an ML model that identifies (recognizes) and classifies images.
-
-- The data is collected by downloading images of dogs and cats from the Iguazio sample data-set AWS bucket.
-- The training data for the ML model is prepared by using [pandas](https://pandas.pydata.org/) DataFrames to build a predecition map.
-  The data is visualized by using the [Matplotlib](https://matplotlib.org/) Python library.
-- An image recognition and classification ML model that identifies the animal type is built and trained by using [Keras](https://keras.io/), [TensorFlow](https://www.tensorflow.org/), and [scikit-learn](https://scikit-learn.org) (a.k.a. sklearn).
-
 <a id="nlp-demo"></a>
 ## Natural Language Processing (NLP)
 
@@ -55,6 +48,17 @@ The [**nlp**](nlp/nlp-example.ipynb) demo demonstrates natural language processi
 - The textual data is collected and processed by using the [TextBlob](https://textblob.readthedocs.io/) Python NLP library. The processing includes spelling correction and sentiment analysis.
 - A serverless function that translates text to another language, which is configured in an environment variable, is generated by using the [Nuclio](https://nuclio.io/) framework.
 
+<a id="stocks-demo"></a>
+## Smart Stock Trading
+
+The [**stocks**](stocks/01-gen-demo-data.ipynb) demo demonstrates a smart stock-trading application:
+the application reads stock-exchange data from an internet service into a time-series database (TSDB); uses Twitter to analyze the market sentiment on specific stocks, in real time; and saves the data to a platform NoSQL table that is used for generating reports and analyzing and visualizing the data on a Grafana dashboard.
+
+- The stock data is read from Twitter by using the [TwythonStreamer](https://twython.readthedocs.io/en/latest/usage/streaming_api.html) Python wrapper to the Twitter Streaming API, and saved to TSDB and NoSQL tables in the platform.
+- Sentiment analysis is done by using the [TextBlob](https://textblob.readthedocs.io/) Python library for natural language processing (NLP).
+- The analyzed data is visualized as graphs on a [Grafana](https://grafana.com/grafana) dashboard, which is created from the Jupyter notebook code.
+  The data is read from both the TSDB and NoSQL stock tables.
+
 <a id="stream-enrich-demo"></a>
 ### Stream Enrichment
```
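The stream-enrichment demo referenced in this README joins streamed events, in real time, with reference data from a NoSQL table. The pattern can be illustrated with a dependency-free sketch (a plain dict stands in for the platform NoSQL table; all field and function names are hypothetical):

```python
# Pure-Python sketch of the stream-enrichment pattern: join each incoming
# event with reference data keyed on one of the event's fields.
# (A dict stands in for the platform NoSQL table; names are illustrative.)

def enrich(events, lookup_table, key_field="user_id"):
    enriched = []
    for event in events:
        extra = lookup_table.get(event[key_field], {})
        enriched.append({**event, **extra})  # merged record for the output stream
    return enriched

users = {"u1": {"city": "Haifa"}, "u2": {"city": "Tel Aviv"}}
stream = [{"user_id": "u1", "action": "click"},
          {"user_id": "u2", "action": "view"}]
print(enrich(stream, users))
```

In the actual demo the same join runs inside a serverless function, with the enriched records written to an output stream rather than returned as a list.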

demos/gpu/README.md

Lines changed: 3 additions & 1 deletion

```diff
@@ -1,4 +1,3 @@
-
 # GPU Demos
 
 - [Overview](#gpu-demos-overview)
@@ -15,13 +14,16 @@ The **demos/gpu** directory includes the following:
 - A **horovod** directory with applications that use Uber's [Horovod](https://eng.uber.com/horovod/) distributed deep-learning framework, which can be used to convert a single-GPU TensorFlow, Keras, or PyTorch model-training program to a distributed program that trains the model simultaneously over multiple GPUs.
   The objective is to speed up your model training with minimal changes to your existing single-GPU code and without complicating the execution.
   Horovod code can also run over CPUs with only minor modifications.
+  For more information and examples, see the [Horovod GitHub repository](https://github.com/horovod/horovod).
+
   The Horovod tutorials include the following:
 
   - An image-recognition demo application for execution over GPUs (**image-classification**).
   - A slightly modified version of the GPU image-classification demo application for execution over CPUs (**cpu/image-classification**).
   - Benchmark tests (**benchmark-tf.ipynb**, which executes **tf_cnn_benchmarks.py**).
 
 - A **rapids** directory with applications that use NVIDIA's [RAPIDS](https://rapids.ai/) open-source libraries suite for executing end-to-end data science and analytics pipelines entirely on GPUs.
+
   The RAPIDS tutorials include the following:
 
   - Demo applications that use the [cuDF](https://rapidsai.github.io/projects/cudf/en/latest/index.html) RAPIDS GPU DataFrame library to perform batching and aggregation of data that's read from a Kafka stream, and then write the results to a Parquet file.<br>
```
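The cuDF bullet above describes batching messages from a stream, aggregating them by key, and writing the results out. cuDF performs this groupby-aggregation on the GPU through a pandas-like DataFrame API; the core step can be sketched dependency-free (the message fields and function name are illustrative assumptions, not taken from the demo):

```python
# Dependency-free sketch of the batch-and-aggregate step: group one batch of
# stream messages by key and compute per-key aggregates.
# (cuDF does the equivalent on the GPU via a pandas-like groupby/agg API;
# field names here are illustrative.)
from collections import defaultdict

def aggregate_batch(messages):
    """Group messages by 'symbol' and compute count and mean of 'price'."""
    groups = defaultdict(list)
    for msg in messages:
        groups[msg["symbol"]].append(msg["price"])
    return {sym: {"count": len(p), "mean": sum(p) / len(p)}
            for sym, p in groups.items()}

batch = [{"symbol": "AAPL", "price": 10.0},
         {"symbol": "AAPL", "price": 14.0},
         {"symbol": "GOOG", "price": 20.0}]
print(aggregate_batch(batch))
```

In the demo itself, each aggregated batch is then appended to a Parquet file rather than printed.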
