site/content/howto/spark.markdown
@@ -10,6 +10,8 @@ weight: 1
{{%exurl "Apache Spark""https://spark.apache.org/"%}} is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics, with APIs in Java, Scala, Python, R, and SQL.
Spark can run programs up to 100x faster than Apache Hadoop MapReduce when working in memory, or up to 10x faster on disk.
It can be used to build data applications as a library, or to perform ad-hoc data analysis interactively.
+
+
Spark powers a stack of libraries: Spark SQL with DataFrames and Datasets, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. You can combine these libraries in the same application.
Spark also runs on a laptop, on Apache Hadoop, on Apache Mesos, in standalone mode, or in the cloud. It can access diverse data sources, including HDFS, Apache Cassandra, Apache HBase, and S3.