ML4all is a system that frees users from the burden of machine learning algorithm selection and low-level implementation details.
It uses a new abstraction that is capable of solving most ML tasks and provides a cost-based optimizer on top of the proposed abstraction for choosing the best gradient descent algorithm in a given setting.
Our results show that ML4all is more than two orders of magnitude faster than state-of-the-art systems and can process large datasets that could not be processed before.

More details can be found in our dedicated [SIGMOD publication](https://dl.acm.org/citation.cfm?id=3064042) and
in Wayang's core [system paper](https://sigmodrecord.org/publications/sigmodRecord/2309/pdfs/05_Systems_Beedkar.pdf).

## Abstraction
ML4all abstracts most ML algorithms with seven operators:

- (1) `Transform` receives a data point to transform (e.g., normalize it) and outputs a new data point.

- (2) `Stage` initializes all the required global parameters (e.g., centroids for the k-means algorithm).

- (3) `Compute` performs user-defined computations on the input data point and returns a new data point. For example, it can compute the nearest centroid for each input data point.

- (4) `Update` updates the global parameters based on a user-defined formula. For example, it can update the centroids based on the output computed by the `Compute` operator.

- (5) `Sample` takes as input the size of the desired sample and the data points to sample from and returns a reduced set of sampled data points.

- (6) `Converge` specifies a function that outputs a convergence dataset required for determining whether the iterations should continue or stop.

- (7) `Loop` specifies the stopping condition on the convergence dataset.

Similar to MapReduce, where users need to implement a map and a reduce function, users of ML4all who wish to develop their own algorithm should implement the above interfaces.
The interfaces can be found in `org.apache.wayang.ml4all.abstraction.api`.

Examples for KMeans clustering and stochastic gradient descent can be found in `org.apache.wayang.ml4all.algorithms`.
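
To make the division of labor between the seven operators more concrete, the following is a minimal, self-contained sketch of one-dimensional k-means written as seven plain static methods. It does not use the Wayang API: the class `SevenOperatorSketch`, the method signatures, the seeded shuffle used for sampling, and the choice of the first `k` points as initial centroids are all assumptions made for illustration. The real interfaces in `org.apache.wayang.ml4all.abstraction.api` may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: method names mirror the seven operators described above.
// They are NOT the interfaces in org.apache.wayang.ml4all.abstraction.api.
final class SevenOperatorSketch {

    // (1) Transform: turn a raw record into a numeric data point.
    static double transform(String raw) {
        return Double.parseDouble(raw.trim());
    }

    // (2) Stage: initialize the global parameters
    // (assumption: the first k points serve as initial centroids).
    static double[] stage(List<Double> points, int k) {
        double[] centroids = new double[k];
        for (int i = 0; i < k; i++) {
            centroids[i] = points.get(i);
        }
        return centroids;
    }

    // (3) Compute: return the index of the nearest centroid for one data point.
    static int compute(double point, double[] centroids) {
        int nearest = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (Math.abs(point - centroids[c]) < Math.abs(point - centroids[nearest])) {
                nearest = c;
            }
        }
        return nearest;
    }

    // (4) Update: recompute each centroid as the mean of its assigned points.
    // A centroid with no assigned points keeps its old value.
    static double[] update(List<Double> batch, int[] assignment, double[] old) {
        double[] sum = new double[old.length];
        int[] count = new int[old.length];
        for (int i = 0; i < batch.size(); i++) {
            sum[assignment[i]] += batch.get(i);
            count[assignment[i]]++;
        }
        double[] next = new double[old.length];
        for (int c = 0; c < old.length; c++) {
            next[c] = count[c] == 0 ? old[c] : sum[c] / count[c];
        }
        return next;
    }

    // (5) Sample: return at most `size` points, chosen by a seeded shuffle.
    static List<Double> sample(List<Double> points, int size, long seed) {
        List<Double> copy = new ArrayList<>(points);
        Collections.shuffle(copy, new Random(seed));
        return copy.subList(0, Math.min(size, copy.size()));
    }

    // (6) Converge: produce the convergence value (largest centroid shift).
    static double converge(double[] old, double[] next) {
        double shift = 0.0;
        for (int c = 0; c < old.length; c++) {
            shift = Math.max(shift, Math.abs(next[c] - old[c]));
        }
        return shift;
    }

    // (7) Loop: stopping condition evaluated on the convergence value.
    static boolean loop(double shift, int iteration, double tolerance, int maxIterations) {
        return shift < tolerance || iteration >= maxIterations;
    }

    // Driver that wires the seven operators into one iterative plan.
    static double[] run(List<String> rawRecords, int k, int sampleSize,
                        double tolerance, int maxIterations) {
        List<Double> points = new ArrayList<>();
        for (String raw : rawRecords) {
            points.add(transform(raw));
        }
        double[] centroids = stage(points, k);
        int iteration = 0;
        double shift;
        do {
            List<Double> batch = sample(points, sampleSize, 42L + iteration);
            int[] assignment = new int[batch.size()];
            for (int i = 0; i < batch.size(); i++) {
                assignment[i] = compute(batch.get(i), centroids);
            }
            double[] next = update(batch, assignment, centroids);
            shift = converge(centroids, next);
            centroids = next;
            iteration++;
        } while (!loop(shift, iteration, tolerance, maxIterations));
        return centroids;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("1.0", "1.2", "0.8", "9.0", "9.5", "10.1");
        // A sample size equal to the dataset size gives full-batch k-means.
        System.out.println(Arrays.toString(run(raw, 2, raw.size(), 1e-9, 100)));
    }
}
```

In this sketch, only `compute` and `update` are specific to k-means. Swapping them for a gradient computation and a weight update, and passing a smaller `sampleSize`, would turn the same driver into a mini-batch gradient descent loop, which is the kind of reuse the abstraction is aiming for.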