@karamanbk
Created June 3, 2019 05:54
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from __future__ import division\n",
"from datetime import datetime, timedelta, date\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"from sklearn.cluster import KMeans\n",
"from sklearn.metrics import classification_report, confusion_matrix\n",
"%matplotlib inline\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import warnings\n",
"warnings.filterwarnings(\"ignore\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import plotly.plotly as py\n",
"import plotly.offline as pyoff\n",
"import plotly.graph_objs as go"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.svm import SVC\n",
"from sklearn.multioutput import MultiOutputClassifier\n",
"from sklearn.ensemble import GradientBoostingClassifier\n",
"from sklearn.tree import DecisionTreeClassifier\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"from sklearn.naive_bayes import GaussianNB\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.linear_model import LogisticRegression\n",
"import xgboost as xgb\n",
"from sklearn.model_selection import KFold, cross_val_score, train_test_split"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" <script type=\"text/javascript\">\n",
" window.PlotlyConfig = {MathJaxConfig: 'local'};\n",
" if (window.MathJax) {MathJax.Hub.Config({SVG: {font: \"STIX-Web\"}});}\n",
" if (typeof require !== 'undefined') {\n",
" require.undef(\"plotly\");\n",
" define('plotly', function(require, exports, module) {\n",
" /**\n",
"* plotly.js v1.47.3\n",
"* Copyright 2012-2019, Plotly, Inc.\n",
"* All rights reserved.\n",
"* Licensed under the MIT license\n",
"*/\n",
@tanviranik

Awesome

@punsisi2018861

This is very detailed and very informative.

@nasimdaneshtalab

nasimdaneshtalab commented Dec 16, 2020

Thank you for the great code and explanation. It would be really nice if we could explore more ways to increase the model accuracy, as my models are unfortunately not that accurate.

@govindamagrawal

Just one question: how can 'Customer ID', which is actually categorical data, be part of the training dataset? A value like '12747.0' carries no meaning as a feature, since the ID could just as well be '435666666666', 'ABCD', or '536TGK5'.
Now, if you remove 'Customer ID' from training, how will you evaluate on the test dataset and predict which customers will buy in the next week or so?
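
One possible answer to the question above, as a minimal sketch rather than the gist author's method: keep the ID aside as a lookup key, train on the remaining columns only, and re-attach the ID to the predictions by row position. The column names (`CustomerID`, `Recency`, `Frequency`, `NextPurchaseSoon`) and the toy values are assumptions made up for illustration, and `LogisticRegression` is just a stand-in model.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "CustomerID": [12747.0, 12748.0, 12749.0, 12820.0, 12821.0, 12822.0],
    "Recency": [2, 0, 3, 40, 55, 70],
    "Frequency": [103, 4596, 199, 11, 6, 3],
    "NextPurchaseSoon": [1, 1, 1, 0, 0, 0],
})

# Keep the ID as a lookup key, but do not feed it to the model.
ids = df["CustomerID"]
X = df.drop(columns=["CustomerID", "NextPurchaseSoon"])
y = df["NextPurchaseSoon"]

model = LogisticRegression(max_iter=1000).fit(X, y)

# Predictions come back in the same row order as X, so the IDs line up.
# (Predicting on the training rows here only demonstrates the alignment.)
predictions = pd.DataFrame({
    "CustomerID": ids.values,
    "Predicted": model.predict(X),
})
print(predictions)
```

The same pattern applies to a held-out set: split the rows first, drop the ID column from both halves before fitting and predicting, and join the test IDs back onto the output afterwards.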

@ctran2

ctran2 commented Jun 26, 2021

Thank you for the great code and explanation. It would be really nice if we could explore more ways to increase the model accuracy, as my models are unfortunately not that accurate.

There's probably an overfitting problem, as the accuracy on the test set is much lower than on the training set.
Accuracy of XGB classifier on training set: 0.92
Accuracy of XGB classifier on test set: 0.62
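
A rough way to check for a train/test gap like the one reported above is to compare training accuracy with cross-validated and held-out accuracy, then see whether a more constrained model narrows it. The sketch below is an illustration under assumptions, not the notebook's code: it uses synthetic data from `make_classification` and scikit-learn's `GradientBoostingClassifier` in place of the XGBoost classifier, and the hyperparameter values are arbitrary starting points.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data; the real features from the notebook are not used here.
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)


def report(name, model):
    """Print and return train, 5-fold CV, and held-out test accuracy."""
    cv_acc = cross_val_score(model, X_train, y_train, cv=5).mean()
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    print(f"{name}: train={train_acc:.2f} cv={cv_acc:.2f} test={test_acc:.2f}")
    return train_acc, cv_acc, test_acc


# Deep trees and many boosting rounds: prone to memorising the training set.
flexible = GradientBoostingClassifier(
    n_estimators=300, max_depth=6, random_state=0)

# Shallower trees, fewer rounds, smaller learning rate, row subsampling.
constrained = GradientBoostingClassifier(
    n_estimators=100, max_depth=2, learning_rate=0.05, subsample=0.8,
    random_state=0)

flexible_scores = report("flexible", flexible)
constrained_scores = report("constrained", constrained)
```

A large gap between the training score and the cross-validated score points to overfitting. Whether the constrained settings actually help depends on the data, so the numbers should be compared rather than assumed.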
