Last active
May 8, 2020 21:24
Bash script for setting up Spark standalone (Ubuntu)
# Run as root, or prefix the apt-get, mv, and ln commands with sudo
apt-get update && apt-get install -y openjdk-8-jdk-headless
SPARK_RELEASE=spark-3.0.0-preview2 # Update as required; releases are listed at https://spark.apache.org/downloads.html
HADOOP_VERSION=hadoop2.7 # Update as required
SPARK_DOWNLOADER_FILENAME=$SPARK_RELEASE-bin-$HADOOP_VERSION
# Download from the Apache archive, which keeps every release, including previews
wget -q "https://archive.apache.org/dist/spark/$SPARK_RELEASE/$SPARK_DOWNLOADER_FILENAME.tgz"
tar -xzf "$SPARK_DOWNLOADER_FILENAME.tgz"
mv "$SPARK_DOWNLOADER_FILENAME" "/opt/$SPARK_RELEASE"
ln -sfn "/opt/$SPARK_RELEASE" /opt/spark # -f and -n let the link be repointed when upgrading
# These exports only last for the current shell; source this script or add them to your shell profile
export SPARK_HOME=/opt/spark
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$SPARK_HOME/bin:$PATH
# To install PySpark, first activate the Python environment in which you want to use Spark
pip install pyspark
pip install -q findspark
# See https://gist.github.com/wilmeragsgh/b2277a8ec41d833e319c513f2c2aa064 for how to test the installation in Python
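The `export` lines in the script only affect the shell that runs them, so a new terminal will not see `SPARK_HOME` or the updated `PATH`. One possible way to make them permanent is sketched below. The helper names `append_once` and `persist_spark_env` are hypothetical (they are not part of Spark or the original script), and the sketch assumes the same `/opt/spark` and Java 8 paths used above.

```shell
#!/usr/bin/env bash
# Sketch: persist the Spark environment variables across shell sessions.
# append_once and persist_spark_env are hypothetical helper names.

# Append a line to a file unless an identical line is already present,
# so running the setup twice does not duplicate entries.
append_once() {
  local line="$1" file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Write the three exports from the setup script into the given profile file.
# Single quotes keep $SPARK_HOME and $PATH unexpanded until the profile is read.
persist_spark_env() {
  local profile="$1"
  append_once 'export SPARK_HOME=/opt/spark' "$profile"
  append_once 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' "$profile"
  append_once 'export PATH=$SPARK_HOME/bin:$PATH' "$profile"
}

# Example (assumes your shell reads ~/.bashrc on startup):
#   persist_spark_env "$HOME/.bashrc"
#   source "$HOME/.bashrc" && spark-submit --version
```

The block only defines functions, so nothing is written until you call `persist_spark_env` with the profile file you actually use. Running `spark-submit --version` afterwards is a quick check that the `PATH` entry resolves to the new installation.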
Some references: