samtools mpileup --fasta-ref ../../reference/Homo_sapiens_assembly18_chr1_chrM.small.fasta -A -q 0 -Q 0 NA12878.multichrom.md.bam -o samtools.pileup2
samtools mpileup --fasta-ref ../../reference/Homo_sapiens_assembly18_chr1_chrM.small.fasta -B -x -A -q 0 -Q 0 NA12878.multichrom.md.bam > samtools_x.pileup
sed $'s/\t"/\t\\\\"/g' samtools_x.pileup > samtools_x_esc.pileup
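The `sed` step above puts a backslash before every double quote that directly follows a tab. A minimal sketch of the same substitution on one made-up pileup-like line is given here. The function name `escape_leading_quotes` and the sample line are hypothetical, and the likely motivation (a column starting with `"` confusing a quote-aware TSV reader) is an assumption, not something the snippet states.

```shell
# Hypothetical helper: same substitution as the sed call above, written
# without $'...' quoting so it also runs under a plain POSIX sh.
escape_leading_quotes() {
  tab=$(printf '\t')
  # Replace TAB+" with TAB+\+" everywhere on each input line.
  sed "s/${tab}\"/${tab}\\\\\"/g"
}

# Made-up input line whose fifth column starts with a double quote.
printf 'chr1\t100\tA\t1\t"\tI\n' | escape_leading_quotes
```

Only quotes that open a column are touched; a `"` in the middle of a column is left alone.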
cd /data/samples/CORRIELL/mbi_cwiczenie3
### create sequence dictionary
docker run --rm -it \
  -v /data/samples/CORRIELL/mbi_cwiczenie3:/data \
  broadinstitute/picard \
  CreateSequenceDictionary \
  R=/data/chr1.fa \
  O=/data/chr1.dict
export SPARK_HOME=/data/local/opt/spark-2.4.3-bin-hadoop2.7
rm -rf /data/local/cache/ivy2/repository/cache/org.biodatageeks/bdg-seqtender_2.11/
rm /data/local/cache/ivy2/repository/jars/org.biodatageeks_bdg-seqtender_2.11-0.2-SNAPSHOT.jar
## master local, defaultFS = HDFS
./bin/spark-shell -v \
  --master local \
  --driver-memory 2g \
  --conf "spark.sql.catalogImplementation=in-memory" \
## push to nexus
curl -v --user 'user:pass' --upload-file oap-1.0.0-spark-2.4.3-SNAPSHOT.jar http://zsibio.ii.pw.edu.pl/nexus/repository/maven-snapshots/org/intel/bigdata/oap/1.0.0-spark-2.4.3-SNAPSHOT/oap-1.0.0-spark-2.4.3-SNAPSHOT.jar
## sbt without tests in assembly
sbt 'set test in assembly := {}' clean assembly
export SPARK_HOME=/data/local/opt/spark-2.4.3-bin-hadoop2.7
cd "$SPARK_HOME"
## scala v 2.11 (!!)
./bin/spark-shell -v --master yarn-client --num-executors 20 --driver-memory 2g --executor-memory 2g \
  --jars /tmp/bdg-sequila-acc_2.11-0.1-spark-2.4.3-SNAPSHOT-assembly.jar \
  --conf spark.sql.extensions=org.biodatageeks.sequila.spark.BdgExtensions \
  --conf spark.hadoop.yarn.timeline-service.enabled=false \
  --conf spark.hadoop.hive.metastore.uris=thrift://cdh01.cl.ii.pw.edu.pl:9083 \
unset SPARK_HOME
cd /data/local/opt/spark-2.4.3-bin-hadoop2.7
./bin/spark-shell -v --master yarn-client --num-executors 20 --driver-memory 2g --executor-memory 2g \
  --conf spark.hadoop.yarn.timeline-service.enabled=false \
  --conf spark.hadoop.hive.metastore.uris=thrift://cdh01.cl.ii.pw.edu.pl:9083 \
  --conf spark.driver.extraJavaOptions=-Dhdp.version=3.1.0.0-78 \
  --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=3.1.0.0-78 \
  --conf spark.hadoop.metastore.catalog.default=hive
./spark-shell -v --master yarn-client --driver-memory 1G --executor-memory 2G --executor-cores 2 \
  --jars /tmp/apache-carbondata-1.6.0-SNAPSHOT-bin-spark2.3.2-hadoop2.7.2.jar \
  --conf spark.hadoop.hive.metastore.uris=thrift://cdh01.cl.ii.pw.edu.pl:9083 \
  --conf spark.hadoop.yarn.timeline-service.enabled=false \
  --conf spark.driver.extraJavaOptions=-Dhdp.version=3.1.0.0-78 \
  --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=3.1.0.0-78 \
  --conf spark.hadoop.metastore.catalog.default=hive

import org.apache.spark.sql.SparkSession
docker pull biodatageeks/bdg-sequila:0.5.5-spark-2.4.2-SNAPSHOT
docker run -it --rm \
  -e USERID=$UID -e GROUPID=$(id -g) \
  -v /Users/aga/workplace/data/slice:/data \
  biodatageeks/bdg-sequila:0.5.5-spark-2.4.2-SNAPSHOT \
  depthOfCoverage \
  --master=local --driver-memory=8g \
  -- \
  --reads /data/NA12878.slice.bam --format blocks -o /data/NA12878.cov.bed
cd /data/local/opt/spark-2.4.0-bin-hadoop2.7/bin
./spark-shell -v --master=yarn --deploy-mode=client --num-executors=60 --executor-memory=4g --driver-memory=12g \
  --conf spark.sql.catalogImplementation=in-memory \
  --conf spark.jars.ivy=/data/local/cache/ivy2/repository \
  --conf spark.hadoop.yarn.timeline-service.enabled=false \
  --repositories http://zsibio.ii.pw.edu.pl/nexus/repository/maven-releases/,http://zsibio.ii.pw.edu.pl/nexus/repository/maven-snapshots/ \
  --packages org.biodatageeks:bdg-sequila_2.11:0.5.5-spark-2.4.2-SNAPSHOT

sc.setLogLevel("WARN")
import org.apache.spark.sql.SequilaSession
import org.biodatageeks.utils.{SequilaRegister, UDFRegister, BDGInternalParams}
val ss = SequilaSession(spark)
SequilaRegister.register(ss)
###### RUN
# map volumes according to your data directory
docker run -it --rm \
  -e USERID=$UID -e GROUPID=$(id -g) \
  -v /Users/aga/workplace/data/slice/:/data \
  biodatageeks/bdg-sequila:0.5.5-spark-2.4.2-SNAPSHOT \
  spark-shell --driver-memory=4g \
  --jars /tmp/bdg-toolset/bdg-sequila-assembly-0.5.5-spark-2.4.2-SNAPSHOT.jar \
  --conf spark.sql.warehouse.dir=/home/bdgeek/spark-warehouse