@@ -12,7 +12,7 @@ Metrics can also be printed to stdout.
 ## Recording metrics using the Flight Recorder mode with Stage-level granularity
 To record metrics at the stage execution level granularity, add these configurations to spark-submit:
 ```
---packages ch.cern.sparkmeasure:spark-measure_2.12:0.23
+--packages ch.cern.sparkmeasure:spark-measure_2.13:0.25
 --conf spark.extraListeners=ch.cern.sparkmeasure.FlightRecorderStageMetrics
 ```
 
@@ -25,7 +25,7 @@ The usage is almost the same as for the stage metrics mode described above, just
 The configuration parameters applicable to the Flight Recorder mode with Task granularity are:
 
 ```
---packages ch.cern.sparkmeasure:spark-measure_2.12:0.23
+--packages ch.cern.sparkmeasure:spark-measure_2.13:0.25
 --conf spark.extraListeners=ch.cern.sparkmeasure.FlightRecorderTaskMetrics
 ```
 
@@ -51,7 +51,7 @@ A Python example
 - This runs the pi.py example script
 - collects and saves the metrics to `/tmp/stageMetrics_flightRecorder` in JSON format:
 ```
-bin/spark-submit --master local[*] --packages ch.cern.sparkmeasure:spark-measure_2.12:0.23 \
+bin/spark-submit --master local[*] --packages ch.cern.sparkmeasure:spark-measure_2.13:0.25 \
 --conf spark.extraListeners=ch.cern.sparkmeasure.FlightRecorderStageMetrics \
 examples/src/main/python/pi.py
 
@@ -63,7 +63,7 @@ A Scala example
 - same example as above; in addition, use a custom output filename
 - also print metrics to stdout
 ```
-bin/spark-submit --master local[*] --packages ch.cern.sparkmeasure:spark-measure_2.12:0.23 \
+bin/spark-submit --master local[*] --packages ch.cern.sparkmeasure:spark-measure_2.13:0.25 \
 --class org.apache.spark.examples.SparkPi \
 --conf spark.extraListeners=ch.cern.sparkmeasure.FlightRecorderStageMetrics \
 --conf spark.sparkmeasure.printToStdout=true \
@@ -80,7 +80,7 @@ This example collected metrics with Task granularity.
 (note: source the Hadoop environment before running this)
 ```
 bin/spark-submit --master yarn --deploy-mode cluster \
---packages ch.cern.sparkmeasure:spark-measure_2.12:0.25 \
+--packages ch.cern.sparkmeasure:spark-measure_2.13:0.25 \
 --conf spark.extraListeners=ch.cern.sparkmeasure.FlightRecorderTaskMetrics \
 --conf spark.sparkmeasure.outputFormat=json_to_hadoop \
 --conf spark.sparkmeasure.outputFilename="hdfs://myclustername/user/luca/test/myoutput_$(date +%s).json" \
@@ -90,13 +90,13 @@ examples/src/main/python/pi.py
 hdfs dfs -ls <path>/myoutput_*.json
 ```
 
-Example, use spark-3.3.0, Kubernetes, Scala 2.12 and write output to S3:
+Example: use Spark 4, Kubernetes, Scala 2.13, and write the output to S3:
 (note: export KUBECONFIG=... + set up the Hadoop environment + configure s3a keys in the script)
 ```
 bin/spark-submit --master k8s://https://XXX.XXX.XXX.XXX --deploy-mode client --conf spark.executor.instances=3 \
 --conf spark.executor.cores=2 --executor-memory 6g --driver-memory 8g \
---conf spark.kubernetes.container.image=<registry-URL>/spark:v3.0.0_20190529_hadoop32 \
---packages org.apache.hadoop:hadoop-aws:3.3.2,ch.cern.sparkmeasure:spark-measure_2.12:0.25 \
+--conf spark.kubernetes.container.image=apache/spark \
+--packages org.apache.hadoop:hadoop-aws:3.4.1,ch.cern.sparkmeasure:spark-measure_2.13:0.25 \
 --conf spark.hadoop.fs.s3a.secret.key="YYY..." \
 --conf spark.hadoop.fs.s3a.access.key="ZZZ..." \
 --conf spark.hadoop.fs.s3a.endpoint="https://s3.cern.ch" \
@@ -105,7 +105,7 @@ bin/spark-submit --master k8s://https://XXX.XXX.XXX.XXX --deploy-mode client --c
 --conf spark.sparkmeasure.outputFormat=json_to_hadoop \
 --conf spark.sparkmeasure.outputFilename="s3a://test/myoutput_$(date +%s).json" \
 --class org.apache.spark.examples.SparkPi \
-examples/jars/spark-examples_2.12-3.3.1.jar 10
+examples/jars/spark-examples_2.13-4.4.0.jar 10
 ```
 
 
@@ -115,7 +115,7 @@ To post-process the saved metrics you will need to deserialize objects saved by
 This is an example of how to do that using the supplied helper object ch.cern.sparkmeasure.IOUtils
 
 ```
-bin/spark-shell --packages ch.cern.sparkmeasure:spark-measure_2.12:0.25
+bin/spark-shell --packages ch.cern.sparkmeasure:spark-measure_2.13:0.25
 
 val myMetrics = ch.cern.sparkmeasure.IOUtils.readSerializedStageMetricsJSON("/tmp/stageMetrics_flightRecorder")
 // use ch.cern.sparkmeasure.IOUtils.readSerializedStageMetrics("/tmp/stageMetrics.serialized") for java serialization
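
Since the flight recorder writes the metrics as plain JSON, they can also be post-processed without Spark. The following is a minimal sketch in plain Python, not part of the sparkMeasure API: it assumes the output file contains a JSON array of per-stage records and that the records carry an `executorRunTime` field (check both assumptions against your actual output file).

```python
import json

def total_executor_run_time(stage_records):
    """Sum executorRunTime (milliseconds) over all recorded stage records."""
    return sum(record.get("executorRunTime", 0) for record in stage_records)

# Inline sample standing in for:
#   with open("/tmp/stageMetrics_flightRecorder") as f:
#       stage_records = json.load(f)
sample = json.loads("""
[
  {"stageId": 0, "numTasks": 8, "executorRunTime": 120},
  {"stageId": 1, "numTasks": 4, "executorRunTime": 80}
]
""")

print(total_executor_run_time(sample))  # prints 200
```

The same pattern extends to any other numeric field in the records; for larger outputs, loading the array into a pandas DataFrame may be more convenient.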