Docker Hadoop YARN cluster for Spark 2.4.1

Provides a multi-node Hadoop cluster in Docker, with Spark 2.4.1 running on YARN.

Usage

Build

make build

Run

make start

Stop

make stop

Connect to Master Node

make connect
 ---- MASTER NODE ---- 
root@cluster-master:/#

Run Spark applications on the cluster

Once connected to the master node:

spark-shell

spark-shell --master yarn --deploy-mode client

spark-submit

spark-submit --master yarn --deploy-mode [client or cluster] --num-executors 2 --executor-memory 4G --executor-cores 4 --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.1.jar
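
The `[client or cluster]` part of the command above is a placeholder: pass either `client` or `cluster`, not the bracketed text. As a minimal sketch, the hypothetical helper below (`spark_pi_cmd` is not part of this repository) prints the same command with the deploy mode filled in and rejects any other value. It assumes `SPARK_HOME` is set on the master node, as the command above already does.

```shell
# Hypothetical helper (not part of this repository): prints the spark-submit
# command shown above with the deploy mode filled in.
spark_pi_cmd() {
  mode="$1"
  # Only the two deploy modes named in this README are accepted.
  case "$mode" in
    client|cluster) ;;
    *) echo "deploy mode must be client or cluster" >&2; return 1 ;;
  esac
  printf '%s\n' "spark-submit --master yarn --deploy-mode $mode --num-executors 2 --executor-memory 4G --executor-cores 4 --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.1.jar"
}

# Usage on the master node:
#   spark_pi_cmd cluster            # print the command
#   eval "$(spark_pi_cmd cluster)"  # run it
```

Printing the command first makes it easy to check the flags before submitting the job.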

Web UI

  • Get the master node IP:
make master-ip
 ---- MASTER NODE IP ---- 
Master node ip : 172.20.0.4
  • Hadoop cluster Web UI: master-node-ip:8088
  • Spark Web UI: master-node-ip:8080
  • HDFS Web UI: master-node-ip:50070
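
To turn the address list above into URLs you can click, substitute the IP reported by `make master-ip`. The sketch below uses a hypothetical helper, `print_web_uis`, which is not part of this repository. It assumes the UIs are served over plain `http`, which this README does not state.

```shell
# Hypothetical helper (not part of this repository): prints the three Web UI
# URLs for a given master node IP. The http:// scheme is an assumption.
print_web_uis() {
  ip="$1"
  echo "Hadoop cluster: http://$ip:8088"
  echo "Spark: http://$ip:8080"
  echo "HDFS: http://$ip:50070"
}

# Example with the IP shown in the sample output above.
print_web_uis 172.20.0.4
```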