@mhamilton723
Last active May 19, 2018 00:46
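#!/bin/bash
# Replaces the HDP Spark 2 client install with a stock Apache Spark build.
# Abort on the first error: the old install is deleted further down, so a
# failed download must not be allowed to continue.
set -euo pipefail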
newspark="spark-2.3.0"
SPARK_DIR="$(readlink -f "/usr/hdp/current/spark2-client")"
SPARK_CONF_DIR="$(readlink -f "/usr/hdp/current/spark2-client/conf")"
# Strip the trailing /spark2 to get the directory that also holds hadoop and hadoop-yarn
CURRENT_DIR="${SPARK_DIR%/spark2}"
HADOOP_DIR="$CURRENT_DIR/hadoop"
HADOOP_YARN_DIR="$CURRENT_DIR/hadoop-yarn"
## Download & Install Binary
cd "/tmp"
# -f fails on HTTP errors instead of piping an error page into tar; -L follows redirects
curl -fL "http://apache.claz.org/spark/$newspark/$newspark-bin-hadoop2.7.tgz" | tar xzf -
cd "$newspark-bin-hadoop2.7"
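# Drop the bundled Hadoop jars and default conf, then link in the cluster's existing Spark conf
rm -r "jars/hadoop"* "conf"
ln -s "$SPARK_CONF_DIR" "conf"
cd ..
# Swap the old install for the new one at the same resolved path
rm -r "$SPARK_DIR"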
mv "$newspark-bin-hadoop2.7" "$SPARK_DIR"
# Create symlinks
sudo ln -sfn "$SPARK_DIR/yarn/$newspark-yarn-shuffle.jar" \
"$HADOOP_DIR/lib/spark-yarn-shuffle.jar"
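# Quote the paths and pass -n so an existing link is replaced rather than
# a new link being created inside the directory it points to
sudo ln -sfn "$HADOOP_DIR" /usr/hdp/current/hadoop
sudo ln -sfn "$HADOOP_YARN_DIR" /usr/hdp/current/hadoop-yarn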
echo "$newspark installation completed"
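# Optional post-install check. This is a sketch added for illustration and is
# not part of the original script: report_link is a hypothetical helper that
# only prints where a path resolves and never aborts the script.
report_link() {
  local path="$1"
  if [ -e "$path" ]; then
    # -e follows symlinks, so this branch means the link target exists
    echo "ok: $path -> $(readlink -f "$path")"
  else
    echo "missing or dangling: $path"
  fi
}
report_link "/usr/hdp/current/spark2-client"
report_link "${HADOOP_DIR:-}/lib/spark-yarn-shuffle.jar"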
#/usr/bin/anaconda/bin/pip install --upgrade pandas