@wliu
wliu / spam_detector.md
Last active April 11, 2018 21:45
Spam Detector

Welcome to UnitedMasters. This challenge helps us assess engineering expertise and creative thinking, while giving you a better understanding of the music domain. We also think it is just a fun exercise for anyone who loves to write code. Feel free to ask questions or request clarification on anything.

Dataset

The dataset for this challenge, dataset.tar, is an archive containing three gzipped JSON Lines files:

sc_tracks.json.gz: contains ~6000 Soundcloud track objects from @corpus, our internal datastore. Each track object mirrors the [Soundcloud Track API](https://developers.soundcloud.com/docs/api/reference#tracks), with UnitedMasters-specific fields denoted by a leading "_" character in the field name.