Last active
May 17, 2017 17:44
{ | |
"cells": [ | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# JCPenney Store Closings\n",
"\n",
"\n",
"## Workflow\n",
"\n",
"Investigate JCPenney store closings$^1$ by:\n",
"\n",
"* Tagging locations as urban vs. rural (using population density from the Data Observatory)\n",
"* Drawing 10-minute walk or drive isochrones, depending on whether the location is urban or rural\n",
"* Visualizing the data with cartoframes\n",
"* Augmenting the isochrones with Data Observatory measures\n",
"* Visualizing the data in Builder and adding widgets for specific measures and store properties\n",
"\n",
"Final dashboard: https://team.carto.com/u/eschbacher/builder/0592fcae-3026-11e7-b861-0e3ebc282e83/embed\n",
"\n",
"1. The closing status is real, but the actual close dates are chosen randomly from the last five years."
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installing dependencies\n",
"\n",
"Install [cartoframes](https://github.com/cartodb/cartoframes), which is currently in beta. I recommend installing it in a virtual environment to keep things clean and sandboxed.\n",
"\n",
"## Getting the data\n",
"\n",
"Download the JCPenney store location data from here:\n",
"* <http://mehak-carto.carto.com/api/v2/sql?q=select%20*%20from%20jc_penny_stores&format=csv&filename=jc_penny_stores>" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Workflow for obtaining data\n",
"\n",
"Pull the JCPenney store locations from my CARTO account into a cartoframes DataFrame."
] | |
}, | |
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import cartoframes\n",
"import json\n",
"import warnings\n",
"warnings.filterwarnings(\"ignore\")\n",
"\n",
"USERNAME = '' # <-- Put your CARTO username here\n",
"APIKEY = '' # <-- Put your CARTO API key here\n",
"\n",
"# use cartoframes.credentials.set_creds() to save credentials for future use\n",
"cc = cartoframes.CartoContext(api_key=APIKEY,\n",
"                              base_url='https://{}.carto.com/'.format(USERNAME))\n",
"table_name = 'jc_penny_stores'\n",
"\n",
"# load JCPenney locations into a DataFrame\n",
"df = cc.read(table_name)\n",
"df.head()"
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## JCPenney Store Closings\n",
"\n", | |
"* Purple = stores closing\n", | |
"* Orange = stores staying open" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"from cartoframes import Layer\n", | |
"from cartoframes.styling import vivid\n", | |
"\n", | |
"cc.map(layers=Layer(table_name,\n", | |
" color={'column': 'status', 'scheme': vivid(10, 'category')}),\n", | |
" interactive=False)" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Augment with the Data Observatory (DO) to get an 'urban-ness' metric (population density)"
] | |
}, | |
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# get population density (total population normalized by area) for each store location\n",
"# More info about this Data Observatory measure here:\n", | |
"# https://cartodb.github.io/bigmetadata/united_states/age_gender.html#total-population\n", | |
"df = cc.data_augment(table_name, [{'numer_id': 'us.census.acs.B01003001',\n", | |
" 'normalization': 'area',\n", | |
" 'numer_timespan': '2011 - 2015'}])\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Get a sense of the range of data" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"df.describe()" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create isochrones based on travel inferences\n",
"\n",
"Create a derivative table whose geometries are isochrones of walk or drive times from the store locations. If the population density is above 5,000 people per square kilometer, assume it's a walkable area. Otherwise, assume cars are the primary mode of transit.\n",
"\n", | |
"**Note:** This functionality is a planned cartoframes method." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"%%time\n", | |
"df = cc.query('''\n", | |
" SELECT \n", | |
" CASE WHEN total_pop_area_2011_2015 > 5000\n", | |
" THEN (cdb_isochrone(the_geom, 'walk', Array[600])).the_geom\n", | |
" ELSE (cdb_isochrone(the_geom, 'car', Array[600])).the_geom\n", | |
" END as the_geom,\n", | |
" {keep_columns}\n", | |
" FROM\n", | |
" {table_name}\n", | |
" '''.format(table_name=table_name,\n", | |
" keep_columns=', '.join(set(df.columns) - {'the_geom', 'the_geom_webmercator'})))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"iso_table_name = (table_name + '_isochrones')" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is already an issue in the repo to introduce batch API queries to avoid timeouts:\n",
"https://github.com/CartoDB/cartoframes/issues/85\n",
"\n",
"Bonus points for finding bugs and opening issues!"
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"cc.write(df, iso_table_name)" | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the isochrone query fails because of a lack of credits (i.e., reaching your quota), replace the `(cdb_isochrone(the_geom, 'walk', Array[600])).the_geom` expression with `ST_Buffer(the_geom::geography, 800)::geometry` for an approximate 10-minute walk ('as the crow flies' distance), and the `(cdb_isochrone(the_geom, 'car', Array[600])).the_geom` expression with `ST_Buffer(the_geom::geography, 12000)::geometry` for an approximate 10-minute drive (assuming an average of 45 mph, or roughly 12 km in 10 minutes)."
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false, | |
"scrolled": false | |
}, | |
"outputs": [], | |
"source": [ | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"from cartoframes import BaseMap\n", | |
"cc.map(layers=[BaseMap('light'),\n", | |
" Layer(iso_table_name),\n", | |
" Layer(table_name)],\n", | |
" zoom=12, lng=-73.9668, lat=40.7306,\n", | |
" interactive=False)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"# show choropleth of isochrones by pop density\n", | |
"from cartoframes.styling import vivid\n", | |
"cc.map(layers=[Layer(iso_table_name,\n", | |
" color='total_pop_area_2011_2015'),\n", | |
" Layer(table_name, size=6, color={'column': 'status', 'scheme': vivid(2)})],\n", | |
" zoom=8, lng=-74.7729, lat=39.9771,\n", | |
" interactive=False)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"# Data Observatory measures: median income, male age 30-34 (both ACS)\n", | |
"# Male age 30-34: https://cartodb.github.io/bigmetadata/united_states/age_gender.html#male-age-30-to-34\n", | |
"# Median Income: https://cartodb.github.io/bigmetadata/united_states/income.html#median-household-income-in-the-past-12-months\n", | |
"\n", | |
"# Note: this may take a minute or two because all the measures are being calculated based on the custom geographies\n", | |
"# that are passed in using spatially interpolated calculations (area-weighted measures)\n", | |
"\n", | |
"data_obs_measures = [{'numer_id': 'us.census.acs.B01001012'},\n", | |
" {'numer_id': 'us.census.acs.B19013001'}]\n", | |
"df = cc.data_augment(table_name + '_isochrones', data_obs_measures)\n", | |
"df.head()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"As you might have already heard, the Data Observatory just launched to help provide CartoDB users with a universe of data. One of the reasons we built the Data Observatory is because getting the third-party data you need is oftentimes the hardest part of analyzing your own data. Data wrangling shouldn't be such a big roadblock to mapping and analyzing your world.\n", | |
"\n", | |
"https://carto.com/blog/create-location-data-easily" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Visualize isochrones based on Data Observatory measure" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false, | |
"scrolled": false | |
}, | |
"outputs": [], | |
"source": [ | |
"cc.map(layers=Layer(iso_table_name,\n", | |
" color='median_income_prenormalized_2011_2015'),\n", | |
" zoom=8, lng=-74.3115, lat=40.1621,\n", | |
" interactive=False)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Builder Dashboard\n", | |
"\n", | |
"https://team.carto.com/u/eschbacher/builder/0592fcae-3026-11e7-b861-0e3ebc282e83/embed" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false, | |
"scrolled": false | |
}, | |
"outputs": [], | |
"source": [ | |
"from IPython.display import HTML\n", | |
"HTML('<iframe width=\"100%\" height=\"520\" frameborder=\"0\" src=\"https://team.carto.com/u/eschbacher/builder/0592fcae-3026-11e7-b861-0e3ebc282e83/embed\" allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>')" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.6.1" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |