This repository was archived by the owner on Dec 5, 2019. It is now read-only.

Add an option to use Python 3.x on clusters and scheduled jobs #597

@robhudson

Description


Currently the default Python on the clusters is Python 2.7. I believe a long-term goal should be to move to Python 3.x, but we may have to wait for Amazon. I'm opening this issue for discussion and tracking.

The latest Amazon EMR base Linux AMI (2017.03) installs Python 2.7 and 3.4. The Amazon EMR docs state:

Python Defaults

Python 3.4 is now installed by default, but Python 2.7 remains the system default. You may configure Python 3.4 as the system default using a bootstrap action, or you can use the configuration API to set the PYSPARK_PYTHON export to /usr/bin/python3.4 in the spark-env classification, which affects the Python version used by PySpark.
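
For reference, here's a minimal sketch of what the configuration-API route could look like when launching a cluster with boto3. The cluster name, release label, and instance settings are placeholders for illustration, not this project's actual launch configuration; the key part is the nested spark-env/export classification setting PYSPARK_PYTHON.

```python
# Sketch only: launch an EMR cluster with PySpark pointed at Python 3.4
# via the spark-env classification. Names and sizes are hypothetical.
import boto3

emr = boto3.client("emr", region_name="us-west-2")

# spark-env -> export -> PYSPARK_PYTHON is the structure the EMR docs describe.
python3_config = [
    {
        "Classification": "spark-env",
        "Configurations": [
            {
                "Classification": "export",
                "Properties": {"PYSPARK_PYTHON": "/usr/bin/python3.4"},
            }
        ],
    }
]

response = emr.run_job_flow(
    Name="python3-cluster",          # hypothetical cluster name
    ReleaseLabel="emr-5.5.0",        # a release whose AMI ships Python 3.4
    Applications=[{"Name": "Spark"}],
    Configurations=python3_config,
    Instances={
        "MasterInstanceType": "m3.xlarge",
        "SlaveInstanceType": "m3.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
```

The other route the docs mention, a bootstrap action, would instead repoint the system default python at python3.4 on each node before applications start.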
