@embiem
Last active October 22, 2019 14:58
ML Intro & ML in GameDev Workshop.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "ML Intro & ML in GameDev Workshop.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true,
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/embiem/0434fe421b06ee13f92db9ff7991ca99/ml-intro-ml-in-gamedev-workshop.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YdrOqn0OBSpx",
"colab_type": "text"
},
"source": [
"# ML Intro & ML in GameDev\n",
"\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "RZARGEjBBAVk",
"colab_type": "code",
"colab": {}
},
"source": [
"# Import necessary libraries\n",
"import numpy as np\n",
"import pandas as pd\n",
"import io\n",
"\n",
"%matplotlib inline\n",
"\n",
"# Load the dataset\n",
"from google.colab import files\n",
"uploaded = files.upload()\n",
"\n",
"file_name = next(iter(uploaded.keys()))\n",
"\n",
"data = pd.read_csv(io.BytesIO(uploaded[file_name]))\n",
"playerYs = data['playerY']\n",
"features = data.drop('playerY', axis = 1)\n",
" \n",
"# Success\n",
"print(\"The dataset has {} data points with {} variables.\".format(*data.shape))\n",
"data.head()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "f3SbmFE9E6_z",
"colab_type": "text"
},
"source": [
"## Data Exploration"
]
},
{
"cell_type": "code",
"metadata": {
"id": "QwfgOccqFL60",
"colab_type": "code",
"colab": {}
},
"source": [
"# TODO: Minimum playerYs of the data\n",
"minimum_playerYs = \n",
"\n",
"# TODO: Maximum playerYs of the data\n",
"maximum_playerYs = \n",
"\n",
"# TODO: Mean playerYs of the data\n",
"mean_playerYs = \n",
"\n",
"# TODO: Median playerYs of the data\n",
"median_playerYs = \n",
"\n",
"# TODO: Standard deviation of playerYs of the data\n",
"std_playerYs = \n",
"\n",
"print(\"Min: {:,.4f}\".format(minimum_playerYs))\n",
"print(\"Max: {:,.4f}\".format(maximum_playerYs))\n",
"print(\"Mean: {:,.4f}\".format(mean_playerYs))\n",
"print(\"Median: {:,.4f}\".format(median_playerYs))\n",
"print(\"Standard deviation: {:,.4f}\".format(std_playerYs))"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "XT2J-8OQ7VG5",
"colab_type": "text"
},
"source": [
"### Measures of Center\n",
"\n",
"**Mean**: sum of the values divided by the number of values\n",
"\n",
"**Median**: sort the data and pick the value which lies in the middle, or for a even count the average of the two values in the middle. It has a robust tendency, which means it won’t be affected by outliers as much as the mean."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "HKOxxPMz94E5",
"colab_type": "text"
},
"source": [
"### Measures of Spread\n",
"\n",
"**Standard Deviation**: measure of the amount of variation. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range.\n",
"\n",
"Calculated by taking the square root of average squared deviation.\n",
"\n",
"*Standard deviation is an excellent way to identify outliers*. Data points that lie more than one standard deviation from the mean can be considered unusual.\n",
"\n",
"![alt text](https://upload.wikimedia.org/wikipedia/commons/8/8c/Standard_deviation_diagram.svg)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FXEbkvgY948F",
"colab_type": "text"
},
"source": [
"### Pairplot Graph\n",
"\n",
"Plot pairwise relationships in a dataset."
]
},
{
"cell_type": "code",
"metadata": {
"id": "qvB-iNebLNUO",
"colab_type": "code",
"colab": {}
},
"source": [
"import seaborn as sns; sns.set()\n",
"sns.pairplot(data);"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "lYFy1yLzIEtD",
"colab_type": "text"
},
"source": [
"## Shuffle and Split Data"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Kl3BK35dINHm",
"colab_type": "code",
"colab": {}
},
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"# TODO Shuffle and split the data into training and testing subsets\n",
"X_train, X_test, y_train, y_test = "
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "HvIgH_3_QeGk",
"colab_type": "text"
},
"source": [
"### Download the shuffled & split data\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "8wzyp_MmQkVe",
"colab_type": "code",
"colab": {}
},
"source": [
"X_train.to_csv(\"train-features.csv\", index=False)\n",
"y_train.to_csv(\"train-target.csv\", index=False, header=\"playerY\")\n",
"X_test.to_csv(\"test-features.csv\", index=False)\n",
"y_test.to_csv(\"test-target.csv\", index=False, header=\"playerY\")\n",
"\n",
"files.download(\"train-features.csv\")\n",
"files.download(\"train-target.csv\")\n",
"files.download(\"test-features.csv\")\n",
"files.download(\"test-target.csv\")"
],
"execution_count": 0,
"outputs": []
}
]
}
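The notes on measures of center and spread in the notebook can be illustrated with a small sketch. It is not part of the workshop notebook: the `player_ys` values are made up, the two-standard-deviation threshold is an assumption, and it uses the standard-library `statistics` module instead of NumPy so that it runs without any installs.

```python
import statistics

# Hypothetical stand-in for the playerY column (not the workshop data).
# The last value is deliberately extreme to act as an outlier.
player_ys = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.0, 0.9, 9.0]

mean = statistics.mean(player_ys)      # pulled upwards by the extreme value
median = statistics.median(player_ys)  # stays with the bulk of the data
std = statistics.pstdev(player_ys)     # population std: sqrt of the average squared deviation

# Flag values that lie more than `threshold` standard deviations from the mean.
threshold = 2.0  # assumed cut-off; 3.0 is another common choice
outliers = [y for y in player_ys if abs(y - mean) / std > threshold]

print("Mean: {:.4f}".format(mean))
print("Median: {:.4f}".format(median))
print("Standard deviation: {:.4f}".format(std))
print("Outliers:", outliers)
```

The single extreme value drags the mean well above the median, which is what makes the median the more robust measure here, and it is the only value the threshold flags.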