@wilmeragsgh
Last active May 8, 2020 23:49
Python script for testing pyspark installation
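# May be required if Java or Spark are not already configured in the environment
# (adjust the paths to your installation):
# import os
# os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
# os.environ["SPARK_HOME"] = "/opt/spark"
# May also be required if pyspark cannot be imported directly:
# import findspark
# findspark.init()
# Test installation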
import pyspark
import random
num_samples = 1000000
sc = pyspark.SparkContext(appName="Pi")
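def inside(p):
    # p is the sample index from the RDD; it is ignored, since each call draws its own random point
    x, y = random.random(), random.random()
    # True when the point falls inside the unit quarter circle
    return x*x + y*y < 1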
count = sc.parallelize(range(0, num_samples)).filter(inside).count()
pi = 4 * count / num_samples
print(pi)
sc.stop()
# Expected output: approximately 3.14 (the exact digits vary between runs)
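# Optional cross-check, not part of the original script: a minimal sketch of the same
# Monte Carlo estimate in plain Python, with no Spark involved. If the Spark job above
# fails or prints something far from 3.14, running this can help tell an installation
# problem from a logic problem. The name estimate_pi and its seed parameter are
# hypothetical helpers introduced here for illustration.

```python
import random


def estimate_pi(num_samples, seed=None):
    """Estimate pi by sampling random points in the unit square (no Spark needed)."""
    rng = random.Random(seed)  # local generator, so passing a seed makes runs repeatable
    hits = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1:  # point falls inside the unit quarter circle
            hits += 1
    # The fraction of hits approximates pi / 4
    return 4.0 * hits / num_samples


if __name__ == "__main__":
    print(estimate_pi(100000))
```

# Both versions should print a value close to 3.14; they will not match digit for digit,
# because each run draws different random points.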