Bernard Antwi Adabankah benadaba

mick001 / logistic_regression.R
Last active June 14, 2024 07:59
Logistic regression tutorial code. Full article available at http://datascienceplus.com/perform-logistic-regression-in-r/
# Load the raw training data, treating empty strings as missing values (NA)
training.data.raw <- read.csv('train.csv', header = TRUE, na.strings = c(""))
# Output the number of missing values for each column
sapply(training.data.raw,function(x) sum(is.na(x)))
# Count the number of distinct values in each column
sapply(training.data.raw, function(x) length(unique(x)))
# A visual way to check for missing data
# -*- coding: utf-8 -*-
""" Small script that shows how to do one-hot encoding
of categorical columns in a pandas DataFrame.
See:
http://scikit-learn.org/dev/modules/generated/sklearn.preprocessing.OneHotEncoder.html#sklearn.preprocessing.OneHotEncoder
http://scikit-learn.org/dev/modules/generated/sklearn.feature_extraction.DictVectorizer.html
"""
import pandas
import random