# Gist by @kevin-m-kent, last active August 2, 2022 11:47
FROM ubuntu:18.04
# Suppress interactive configuration prompts (set before the first apt-get call)
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
&& apt-get -y upgrade \
&& apt-get install --yes \
build-essential \
openjdk-8-jdk \
iproute2 \
bash \
sudo \
coreutils \
procps \
&& /var/lib/dpkg/info/ca-certificates-java.postinst configure \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
# Print the compiler version to confirm build-essential installed correctly
RUN gcc --version
# Install python 3.8 and virtualenv for Spark and Notebooks
RUN apt-get update \
&& apt-get install -y \
python3.8 \
virtualenv \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN apt-get update \
&& apt-get install --yes software-properties-common apt-transport-https \
&& gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 \
&& gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | apt-key add - \
&& add-apt-repository -y "deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu $(lsb_release -cs)-cran40/" \
&& apt-get update \
&& apt-get install --yes \
libssl-dev \
r-base \
r-base-dev \
&& add-apt-repository -r "deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu $(lsb_release -cs)-cran40/" \
&& apt-key del E298A3A825C0D65DFD57CBB651716619E084DAB9 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN virtualenv -p python3.8 --system-site-packages /databricks/python3
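# Optional sanity check, in the same spirit as the gcc version check earlier in this file.
# This is a sketch rather than part of the original image: it assumes virtualenv
# placed the interpreter at /databricks/python3/bin/python and prints its version,
# so the build log shows whether Python 3.8 was picked up.
RUN /databricks/python3/bin/python --version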
# hwriterPlus is used by Databricks to display output in notebook cells.
# hwriterPlus has been removed from CRAN for newer versions of R, so we pin the dependency to the archived version.
# Rserve allows Spark to communicate with a local R process to run R code.
RUN R -e "options(repos = list(MRAN = 'https://mran.microsoft.com/snapshot/2022-04-08', CRAN = 'https://cran.microsoft.com/')); install.packages(c('hwriter', 'TeachingDemos', 'htmltools'), lib = '/usr/lib/R/library')" \
&& R -e "install.packages('https://cran.r-project.org/src/contrib/Archive/hwriterPlus/hwriterPlus_1.0-3.tar.gz', repos=NULL, type='source', lib = '/usr/lib/R/library')"
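# System libraries needed to build the R packages below (builds already run as root, so sudo is not needed)
RUN apt-get update && apt-get install -y \
zlib1g-dev \
pandoc \
make \
libcurl4-openssl-dev \
libssl-dev \
pandoc-citeproc \
libicu-dev \
curl \
libxml2-dev \
python-six \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*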
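# Name the CRAN mirror explicitly so the install does not depend on a preconfigured default
RUN R -e "install.packages('pak', repos = 'https://cloud.r-project.org')"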
RUN R -e "pak::pkg_install(c( 'htmltools', 'tidyverse', 'tidymodels', 'GGally', 'fable', 'tsibble', 'ranger', 'xgboost', 'modeltime', 'timetk'), lib = '/usr/lib/R/library')"
RUN R -e "pak::pkg_install('Rserve', lib = '/databricks/spark/R/lib')"
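# Optional sanity check (a sketch, not part of the original image): fail the build
# early if key packages cannot be loaded. The subset of packages checked here is an
# arbitrary choice. Rserve is looked up in the custom library path it was installed to.
RUN R -e "for (p in c('tidyverse', 'tidymodels', 'xgboost')) stopifnot(requireNamespace(p, quietly = TRUE))" \
&& R -e "stopifnot(requireNamespace('Rserve', lib.loc = '/databricks/spark/R/lib', quietly = TRUE))"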
COPY Rprofile.site /usr/lib/R/etc/Rprofile.site