@tjworks
Created May 26, 2016 09:06
mongo-spark demo app.java
package com.example;

import java.util.HashMap;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;
import org.bson.Document;

import com.mongodb.spark.api.java.MongoSpark;

public class App {
    public static void main(String[] args) {
        SparkConf sparkCfg = new SparkConf();
        sparkCfg.set("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.input");
        sparkCfg.set("spark.mongodb.output.uri", "mongodb://127.0.0.1/test.output");
        sparkCfg.set("spark.mongodb.input.maxChunkSize", "1");

        JavaSparkContext jsc = new JavaSparkContext(sparkCfg);

        // Ship the lookup data to every executor once, instead of
        // serializing it with each task.
        HashMap<String, Object> calcDependencies = getDependencies();
        Broadcast<HashMap<String, Object>> broadcastedDependencies = jsc.broadcast(calcDependencies);

        // Load documents from test.input, enrich each one with a fare,
        // and write the results back to test.output.
        JavaRDD<Document> rdd = MongoSpark.load(jsc)
                .map(doc -> calculateFare(doc, broadcastedDependencies.value()));

        // save() is an action, so it triggers the computation; no extra
        // collect() is needed (it would re-run the whole pipeline).
        MongoSpark.save(rdd);

        jsc.stop();
    }

    private static HashMap<String, Object> getDependencies() {
        return new HashMap<>(); // placeholder -- load real dependencies before production!
    }

    private static Document calculateFare(Document input, HashMap<String, Object> dependencies) {
        // Trivial example; a real implementation would use the broadcast
        // dependencies to compute the fare.
        input.put("fare", 100);
        return input;
    }
}
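A sketch of how this might be launched with `spark-submit`, assuming the v1.1 connector for Scala 2.10 (adjust the artifact version to match your Spark/Scala build; the jar name below is a made-up example):

```shell
# Build the application jar first, e.g. with `mvn package`.
# --packages pulls the connector and its dependencies from Maven Central,
# which avoids ClassNotFound errors when the connector jar is missing
# from the driver/executor classpath.
spark-submit \
  --class com.example.App \
  --packages org.mongodb.spark:mongo-spark-connector_2.10:1.1.0 \
  target/mongo-spark-demo-1.0-SNAPSHOT.jar
```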
@samjoshuaberchmans commented Jun 30, 2020
Hi, do you have any sample code for loading data from MongoDB?
I am facing a "class not found" issue with it. Let me know if you have any sample code for the same.
I am using the v1.1 Mongo Spark connector.
