Last active January 4, 2016 01:39
Proof of concept translation of some Scalding examples (https://github.com/twitter/scalding/wiki/Fields-based-API-Reference) to shapeless records (https://github.com/milessabin/shapeless/wiki/Feature-overview:-shapeless-2.0.0#extensible-records)
import shapeless._
import record._
import syntax.singleton._

object ScaldingPoC extends App {
  // map, flatMap
  val birds =
    List(
      "name" ->> "Swallow (European, unladen)" :: "speed" ->> 23 :: "weightLb" ->> 0.2 :: "heightFt" ->> 0.65 :: HNil,
      "name" ->> "African (European, unladen)" :: "speed" ->> 24 :: "weightLb" ->> 0.21 :: "heightFt" ->> 0.6 :: HNil
    )

  val fasterBirds = birds.map(b => b + ("doubleSpeed" ->> b("speed")*2))
  fasterBirds foreach println

  val britishBirds = birds.map(b => b + ("weightKg" ->> b("weightLb")*0.454) + ("heightM" ->> b("heightFt")*0.305))
  britishBirds foreach println

  val items =
    List(
      "author" ->> "Benjamin Pierce" :: "title" ->> "Types and Programming Languages" :: "price" ->> 49.35 :: HNil,
      "author" ->> "Roger Hindley" :: "title" ->> "Basic Simple Type Theory" :: "price" ->> 23.14 :: HNil
    )

  val pricierItems = items.map(i => i + ("price" ->> i("price")*1.1))
  pricierItems foreach println

  val books =
    List(
      "text" ->> "Not everyone knows how I killed old Phillip Mathers" :: HNil,
      "text" ->> "No, no, I can't tell you everything" :: HNil
    )

  val lines = books.flatMap(book => for(word <- book("text").split("\\s+")) yield book + ("word" ->> word))
  lines foreach println
}
We will need to look at serialization here because, as Dean notes, we definitely don't want to serialize the keys with each row. We'd have to look at how Kryo does (or can be made to) serialize the records.
@johnynek The keys don't exist at all at runtime.
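To illustrate the point about keys, here is a minimal sketch, assuming shapeless 2.0.0 on the classpath. The object name `RecordRuntimeDemo` and the sample fields are made up for illustration. It shows that the runtime value of a record is just the HList of its field values, while the keys are recovered from the static type.

```scala
import shapeless._
import record._
import syntax.singleton._

// Hypothetical demo object, not part of the gist above.
object RecordRuntimeDemo {
  // A two-field record; each key lives only in the static type.
  val bird = ("name" ->> "Swallow") :: ("speed" ->> 23) :: HNil

  // A record is a subtype of the HList of its value types,
  // so it can be viewed as a plain HList without any conversion.
  val plain: String :: Int :: HNil = bird

  def main(args: Array[String]): Unit = {
    // Printing the record shows only the values, since no key is stored.
    println(bird)
    // The keys are materialized from the type by an implicit, not read from the data.
    println(bird.keys)
  }
}
```

Because the keys are resolved at compile time, anything that serializes the runtime value only ever sees the values.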
Here's an example showing Kryo serialization of record types: https://gist.github.com/bsidhom/9798005
The record type takes no more space than its underlying HList, which isn't too bad when registered. I haven't found a way to register classes more concisely unfortunately, but it may be possible to remove some boilerplate given an example instance (via getClass) or with the help of macros.
Cool. I need to take a look at the latest implementation.