ML Task #2
Problem Statement:
Using clustering for some unsupervised learning!
Clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters).
No dataset is specified for this task. The choice of dataset is up to you, but one that showcases the power of clustering will be appreciated.
Here's an example dataset you might use for clustering: Link
Normal Mode:
Your task is to use k-means clustering to find clusters in your data.
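One possible starting point is sketched below: a minimal k-means (Lloyd's algorithm) that uses only the Python standard library, so it stays within the limitation on out-of-the-box clustering libraries. The function names, the random-sample initialisation, the stopping rule and the toy data are all assumptions made for illustration, not requirements of the task.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100, seed=0):
    """Lloyd's algorithm on a list of tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initialise the centroids with k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Toy data, made up for illustration: two loose groups of 2-D points.
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
              (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
toy_centroids, toy_labels = kmeans(toy_points, k=2)
print(toy_centroids, toy_labels)
```

k-means only finds a local optimum, and the result depends on the initial centroids. Running it with several seeds and keeping the solution with the smallest total within-cluster squared distance is a common remedy.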
Hacker Mode:
Use the more general expectation-maximization (EM) clustering algorithm to maximize the overall probability, or likelihood, of the data given the final clusters.
The central idea: instead of assigning observations to clusters so as to maximize the differences in means for continuous variables, the EM algorithm computes probabilities of cluster membership based on one or more probability distributions.
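To make the idea of probabilistic membership concrete, here is a minimal sketch of EM for a mixture of one-dimensional Gaussians, again using only the standard library. The choice of Gaussian components, the quantile-based initialisation, the variance floor `min_var`, the fixed iteration count and the toy data are assumptions for illustration; a full solution would typically handle multivariate data and a convergence test.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, k, n_iters=50, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances, responsibilities, log_likelihoods).
    """
    n = len(data)
    ordered = sorted(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    # Assumption: start the means at evenly spaced quantiles of the data,
    # with equal weights and the overall variance for every component.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(overall_var, min_var)] * k
    weights = [1.0 / k] * k
    log_likelihoods = []
    for _ in range(n_iters):
        # E-step: resp[i][j] is the probability that point i came from component j.
        resp, log_lik = [], 0.0
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([p / total for p in joint])
            log_lik += math.log(total)
        log_likelihoods.append(log_lik)
        # M-step: re-estimate each component from its soft memberships.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, min_var)  # floor avoids a collapsed component
    # resp comes from the last E-step, i.e. just before the final M-step.
    return weights, means, variances, resp, log_likelihoods


# Toy data, made up for illustration: two groups of 1-D observations.
toy_data = [0.0, 0.2, 0.4, 9.0, 9.3, 9.6]
weights, means, variances, resp, log_liks = em_gmm_1d(toy_data, k=2)
print(means, resp[0])
```

Each row of `resp` sums to 1, so a point can belong partly to several clusters. A hard clustering, if needed, can be read off by taking the component with the largest responsibility for each point. The recorded log-likelihood should not decrease from one iteration to the next, which makes it a useful sanity check.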
Submission:
You are required to program in Python for the above task.
Normal Mode is required. Hacker Mode is highly encouraged :)
Deadline: 14th July 2018
Required Skills:
1. Python
Limitations:
Libraries that offer clustering functions out of the box may not be used.
Happy coding!