huasanyelao / gist:91a4e83e6fc290e4e384c44fb568e104
Created December 15, 2017 07:17 — forked from cvogt/gist:9193220
Slick: Dynamic query conditions using the **MaybeFilter** (Updated to support nullable columns)
import scala.slick.lifted.CanBeQueryCondition

// optionally filter on a column with a supplied predicate
case class MaybeFilter[X, Y](val query: scala.slick.lifted.Query[X, Y]) {
  // apply the predicate only when `data` is defined; otherwise return the query unchanged
  def filter[T, R: CanBeQueryCondition](data: Option[T])(f: T => X => R) = {
    data.map(v => MaybeFilter(query.filter(f(v)))).getOrElse(this)
  }
}
// example use case
import java.sql.Date
import numpy as np
from matplotlib import pylab as plt
#from mpltools import style # uncomment for prettier plots
#style.use(['ggplot'])
'''
function definitions
'''
# generate all bernoulli rewards ahead of time
def generate_bernoulli_bandit_data(num_samples,K):
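The preview cuts off at the signature; a minimal sketch of a body consistent with the comment above (variable names are assumptions, not necessarily the gist's own):

def generate_bernoulli_bandit_data(num_samples, K):
    # draw one fixed success probability (CTR) per arm, then pre-generate a
    # Bernoulli reward for every (sample, arm) pair
    CTRs = np.random.rand(K)
    true_rewards = np.random.rand(num_samples, K) < CTRs  # boolean reward matrix
    return true_rewards, CTRs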
huasanyelao / zook_grow.md
Created January 21, 2016 08:25 — forked from miketheman/zook_grow.md
Adding nodes to a ZooKeeper ensemble

Adding 2 nodes to an existing 3-node ZooKeeper ensemble without losing the Quorum

Many deployments start out with 3 nodes, and little is documented about how to grow a cluster from 3 members to 5 without losing the existing quorum, so here is an example of how this might be achieved.

In this example, all 5 nodes run on the same Vagrant host for illustration, each with its own configuration (ports and data directories) and without any actual client load.

YMMV. Caveat usufructuarius.

Step 1: Have a healthy 3-node ensemble
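A quick way to verify that step, sketched below: ZooKeeper answers its built-in "ruok" four-letter command with "imok" on a healthy node. The client ports used here are assumptions for the three original members on the single Vagrant host.

import socket

def zk_ok(host, port, timeout=5.0):
    # send ZooKeeper's 'ruok' four-letter command; a healthy node replies 'imok'
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(b"ruok")
        return s.recv(16) == b"imok"

for port in (2181, 2182, 2183):  # assumed client ports of the 3 original members
    print(port, zk_ok("127.0.0.1", port))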

huasanyelao / Demo2.R
Created December 31, 2015 09:21 — forked from zachmayer/Demo2.R
#Setup
rm(list = ls(all = TRUE))
gc(reset=TRUE)
set.seed(1234) #From random.org
#Libraries
library(caret)
library(devtools)
install_github('caretEnsemble', 'zachmayer') #Install zach's caretEnsemble package
library(caretEnsemble)
huasanyelao / rank_metrics.py
Created December 17, 2015 12:53 — forked from bwhite/rank_metrics.py
Ranking Metrics
"""Information Retrieval metrics
Useful Resources:
http://www.cs.utexas.edu/~mooney/ir-course/slides/Evaluation.ppt
http://www.nii.ac.jp/TechReports/05-014E.pdf
http://www.stanford.edu/class/cs276/handouts/EvaluationNew-handout-6-per.pdf
http://hal.archives-ouvertes.fr/docs/00/72/67/60/PDF/07-busa-fekete.pdf
Learning to Rank for Information Retrieval (Tie-Yan Liu)
"""
import numpy as np
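For context, a minimal sketch of one metric of this kind, DCG/nDCG over a list of graded relevance scores; this is the common formulation, not necessarily the gist's exact implementation:

def dcg_at_k(relevances, k):
    # discounted cumulative gain over the top-k results, log2 discount
    r = np.asarray(relevances, dtype=float)[:k]
    if r.size == 0:
        return 0.0
    return float(np.sum(r / np.log2(np.arange(2, r.size + 2))))

def ndcg_at_k(relevances, k):
    # DCG normalized by the DCG of the ideal (descending) ordering
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0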
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=com.uebercomputing.mailrecord.MailRecordRegistrator
spark.kryoserializer.buffer.mb=128
spark.kryoserializer.buffer.max.mb=512
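These properties would normally go in spark-defaults.conf or be passed via --conf; an equivalent programmatic sketch in PySpark (the app name is a hypothetical placeholder):

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("mailrecord-kryo")  # hypothetical app name
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.kryo.registrator", "com.uebercomputing.mailrecord.MailRecordRegistrator")
        .set("spark.kryoserializer.buffer.mb", "128")
        .set("spark.kryoserializer.buffer.max.mb", "512"))
sc = SparkContext(conf=conf)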
huasanyelao / spark_gzip.py
Created December 11, 2015 09:29 — forked from msukmanowsky/spark_gzip.py
Example of how to save Spark RDDs to disk using GZip compression in response to https://twitter.com/rjurney/status/533061960128929793.
from pyspark import SparkContext

def main():
    sc = SparkContext(appName="Test Compression")
    # RDD has to be key, value pairs
    data = sc.parallelize([
        ("key1", "value1"),
        ("key2", "value2"),
        ("key3", "value3"),
package org.apache.spark.examples
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD
import java.util.Random
import scala.collection.mutable
import org.apache.spark.serializer.KryoRegistrator
import com.esotericsoftware.kryo.Kryo
huasanyelao / schema.xml
Last active August 29, 2015 14:25 — forked from leoh/schema.xml
config for uuid field in solr 4.6
<?xml version="1.0" encoding="UTF-8" ?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
package com.github.juanrh.spark_kafka
import org.apache.spark.SparkConf
import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.kafka.KafkaUtils
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder