@jhconning
Created October 22, 2018 05:19
Hunter/Masters/Jeremy Sze/MA Thesis - Copy/jeremy_python.ipynb
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "## Plotting Jeremy's LPIS data\n\n"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import pandas as pd\nimport matplotlib.pyplot as plt\nimport geopandas as gpd\nimport folium \nfrom folium.plugins import MarkerCluster, HeatMap\nimport osmnx as ox\nimport fiona\n\nfrom shapely.geometry import Point\nimport matplotlib.pyplot as plt\n\npd.options.display.float_format = '{:,.2f}'.format",
"execution_count": 96,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's read in the amazing Stata dataset Jeremy prepared into a pandas dataframe. "
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "df = pd.read_stata(\"./working_data/analytical_file_panel.dta\")",
"execution_count": 2,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Now create a geometry from those `latitude` and `longitude` measures to turn this into a geodataframe. \n\nNote however that these are not in fact true `longitude` and `latitude` measures but rather measurements in feet since the geometry used here is [EPSG:2263](http://spatialreference.org/ref/epsg/2263/) also referred to as NAD83. \n\nTo allow true latitude and longitude columns later let's rename these something else:"
},
{
"metadata": {
"scrolled": false,
"trusted": true
},
"cell_type": "code",
"source": "df.rename(index=str, columns={'longitude': 'ft_lon', 'latitude': 'ft_lat'},inplace=True)",
"execution_count": 12,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "df['Coordinates'] = list(zip(df.ft_lon, df.ft_lat))\ndf['Coordinates'] = df['Coordinates'].apply(Point)\ngdf = gpd.GeoDataFrame(df, geometry='Coordinates')",
"execution_count": 14,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "We also need to make it clear that the Coordinate Reference System (CRS) for these measures is EPSG 2263. With that we can later correctly convert from that to EPSG 4326 to use true latitude and longitude. This is needed only because that's what we need for a Folium (leaflet) map."
},
{
"metadata": {
"scrolled": false,
"trusted": true
},
"cell_type": "code",
"source": "gdf.crs = fiona.crs.from_epsg(2263)\ngdf = gdf.to_crs(epsg='4326')",
"execution_count": 15,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Now, finally we can create columns for `latitude` and `longitude` coordinates."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "gdf['longitude'] = gdf.Coordinates.x\ngdf['latitude'] = gdf.Coordinates.y",
"execution_count": 16,
"outputs": []
},
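{
"metadata": {},
"cell_type": "markdown",
"source": "As a hedged aside: the steps above (zip the columns into points, declare the CRS, reproject) can be sketched more compactly. This assumes a recent geopandas in which `gpd.points_from_xy` exists and `GeoDataFrame` accepts a `crs` argument; it is not what this notebook originally ran. The `xy_toy` frame is illustrative only, and its single row reuses the `ft_lon`/`ft_lat` values of the first observation."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import pandas as pd\nimport geopandas as gpd\n\n# Toy frame: one intersection with projected coordinates in feet (EPSG:2263)\nxy_toy = pd.DataFrame({'ft_lon': [1025301.15], 'ft_lat': [270701.93]})\n\n# Build the point geometry and declare its CRS in one step\nxy_gdf = gpd.GeoDataFrame(\n    xy_toy,\n    geometry=gpd.points_from_xy(xy_toy.ft_lon, xy_toy.ft_lat),\n    crs='EPSG:2263',\n)\n\n# Reproject to EPSG:4326 so x/y become true longitude/latitude\nxy_gdf = xy_gdf.to_crs(epsg=4326)\nxy_gdf['longitude'] = xy_gdf.geometry.x\nxy_gdf['latitude'] = xy_gdf.geometry.y\nxy_gdf",
"execution_count": null,
"outputs": []
},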
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's see the list of variables and representative values."
},
{
"metadata": {
"scrolled": true,
"trusted": true
},
"cell_type": "code",
"source": "pd.set_option('display.max_rows', 120)\ngdf.head(1).T",
"execution_count": 17,
"outputs": [
{
"data": {
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>0</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>intersection_id</th>\n <td>1</td>\n </tr>\n <tr>\n <th>month</th>\n <td>7</td>\n </tr>\n <tr>\n <th>year</th>\n <td>2012</td>\n </tr>\n <tr>\n <th>monthly</th>\n <td>2012-07-01 00:00:00</td>\n </tr>\n <tr>\n <th>ft_lon</th>\n <td>1,025,301.15</td>\n </tr>\n <tr>\n <th>ft_lat</th>\n <td>270,701.93</td>\n </tr>\n <tr>\n <th>bname</th>\n <td>Bronx</td>\n </tr>\n <tr>\n <th>bronx</th>\n <td>1</td>\n </tr>\n <tr>\n <th>brooklyn</th>\n <td>0</td>\n </tr>\n <tr>\n <th>manhattan</th>\n <td>0</td>\n </tr>\n <tr>\n <th>queens</th>\n <td>0</td>\n </tr>\n <tr>\n <th>statenisland</th>\n <td>0</td>\n </tr>\n <tr>\n <th>inters_to_sch_dist</th>\n <td>2778</td>\n </tr>\n <tr>\n <th>inters_to_hosp_dist</th>\n <td>6439</td>\n </tr>\n <tr>\n <th>lionstreet_id</th>\n <td>29364</td>\n </tr>\n <tr>\n <th>trafdir</th>\n <td>Two-Way: Traffic flows in both directions</td>\n </tr>\n <tr>\n <th>streetwidth_min</th>\n <td>24.00</td>\n </tr>\n <tr>\n <th>streetwidth_max</th>\n <td>30.00</td>\n </tr>\n <tr>\n <th>streetwidth_irr</th>\n <td>NaN</td>\n </tr>\n <tr>\n <th>posted_speed</th>\n <td>25.00</td>\n </tr>\n <tr>\n <th>number_travel_lanes</th>\n <td>1.00</td>\n </tr>\n <tr>\n <th>mi_number_travel_lanes</th>\n <td>0</td>\n </tr>\n <tr>\n <th>number_park_lanes</th>\n <td>0</td>\n </tr>\n <tr>\n <th>mi_number_park_lanes</th>\n <td>1</td>\n </tr>\n <tr>\n <th>number_total_lanes</th>\n <td>1.00</td>\n </tr>\n <tr>\n <th>bike_route_id</th>\n <td>250,515.00</td>\n </tr>\n <tr>\n <th>bike_route_to_coli_dist</th>\n <td>3,303.43</td>\n </tr>\n <tr>\n <th>bike_route_install_dt</th>\n <td>NaT</td>\n </tr>\n <tr>\n 
<th>lanecount</th>\n <td>NaN</td>\n </tr>\n <tr>\n <th>bike_route_modif_dt</th>\n <td>NaT</td>\n </tr>\n <tr>\n <th>onoffst</th>\n <td></td>\n </tr>\n <tr>\n <th>bike_route_install_mt</th>\n <td>NaT</td>\n </tr>\n <tr>\n <th>bike_route_modif_mt</th>\n <td>NaT</td>\n </tr>\n <tr>\n <th>truck_route_id</th>\n <td>5383</td>\n </tr>\n <tr>\n <th>truck_route_to_coli_dist</th>\n <td>1,293.15</td>\n </tr>\n <tr>\n <th>bikelane_type</th>\n <td>NaN</td>\n </tr>\n <tr>\n <th>routetype</th>\n <td>Local</td>\n </tr>\n <tr>\n <th>monthly_avg_drybulbtemp</th>\n <td>78.68</td>\n </tr>\n <tr>\n <th>monthly_avg_precip</th>\n <td>0.14</td>\n </tr>\n <tr>\n <th>monthly_avg_snowfall</th>\n <td>0.00</td>\n </tr>\n <tr>\n <th>monthly_tot_drybulbtemp</th>\n <td>2,439.00</td>\n </tr>\n <tr>\n <th>monthly_tot_precip</th>\n <td>4.21</td>\n </tr>\n <tr>\n <th>monthly_tot_snowfall</th>\n <td>0.00</td>\n </tr>\n <tr>\n <th>left_turn_id</th>\n <td>104</td>\n </tr>\n <tr>\n <th>intersection_to_left_turn_dist</th>\n <td>4,050.51</td>\n </tr>\n <tr>\n <th>left_turn_install_dt</th>\n <td>2017-12-09 00:00:00</td>\n </tr>\n <tr>\n <th>left_turn_install_month</th>\n <td>2017-12-01 00:00:00</td>\n </tr>\n <tr>\n <th>left_turn_treatment</th>\n <td>Daylighting, Box markings, Pegatracks, Delinea...</td>\n </tr>\n <tr>\n <th>left_turn_min</th>\n <td>0.19</td>\n </tr>\n <tr>\n <th>flag_left_turn_ever</th>\n <td>0</td>\n </tr>\n <tr>\n <th>flag_left_turn</th>\n <td>0</td>\n </tr>\n <tr>\n <th>street_improv_id</th>\n <td>173</td>\n </tr>\n <tr>\n <th>intersec_to_street_improv_dist</th>\n <td>8,243.45</td>\n </tr>\n <tr>\n <th>street_improv_treatment</th>\n <td></td>\n </tr>\n <tr>\n <th>street_improv_install_dt</th>\n <td>2017-12-31 00:00:00</td>\n </tr>\n <tr>\n <th>street_improv_install_month</th>\n <td>2017-12-01 00:00:00</td>\n </tr>\n <tr>\n <th>street_improv_min</th>\n <td>395.85</td>\n </tr>\n <tr>\n <th>flag_street_improv_ever</th>\n <td>0</td>\n </tr>\n <tr>\n <th>flag_street_improv</th>\n 
<td>0</td>\n </tr>\n <tr>\n <th>LPIS_id</th>\n <td>1767</td>\n </tr>\n <tr>\n <th>intersection_to_LPIS_dist</th>\n <td>2,251.85</td>\n </tr>\n <tr>\n <th>LPIS_install_date</th>\n <td>2016-02-06 00:00:00</td>\n </tr>\n <tr>\n <th>LPIS_install_month</th>\n <td>2016-02-01 00:00:00</td>\n </tr>\n <tr>\n <th>LPIS_install_year</th>\n <td>2,016.00</td>\n </tr>\n <tr>\n <th>LPIS_min</th>\n <td>19.07</td>\n </tr>\n <tr>\n <th>flag_LPIS_ever</th>\n <td>0</td>\n </tr>\n <tr>\n <th>flag_LPIS</th>\n <td>0</td>\n </tr>\n <tr>\n <th>bike_route_ever</th>\n <td>0</td>\n </tr>\n <tr>\n <th>bike_route_tv</th>\n <td>0</td>\n </tr>\n <tr>\n <th>truck_route</th>\n <td>0</td>\n </tr>\n <tr>\n <th>collision_count</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_collision_count</th>\n <td>0</td>\n </tr>\n <tr>\n <th>day_collision_count</th>\n <td>0</td>\n </tr>\n <tr>\n <th>personsinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>personskilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>pedestriansinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>pedestrianskilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>cyclistinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>cyclistkilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>motoristinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>motoristkilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_personsinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_personskilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_pedestriansinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_pedestrianskilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_cyclistinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_cyclistkilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_motoristinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>latenight_motoristkilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>day_personsinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>day_personskilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>day_pedestriansinjured</th>\n <td>0</td>\n </tr>\n <tr>\n 
<th>day_pedestrianskilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>day_cyclistinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>day_cyclistkilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>day_motoristinjured</th>\n <td>0</td>\n </tr>\n <tr>\n <th>day_motoristkilled</th>\n <td>0</td>\n </tr>\n <tr>\n <th>flag_collision</th>\n <td>0.00</td>\n </tr>\n <tr>\n <th>latenight_flag_collision</th>\n <td>0.00</td>\n </tr>\n <tr>\n <th>day_flag_collision</th>\n <td>0.00</td>\n </tr>\n <tr>\n <th>group</th>\n <td>NaN</td>\n </tr>\n <tr>\n <th>group_filled</th>\n <td>NaN</td>\n </tr>\n <tr>\n <th>Coordinates</th>\n <td>POINT (-73.85148400285885 40.90959199994354)</td>\n </tr>\n <tr>\n <th>longitude</th>\n <td>-73.85</td>\n </tr>\n <tr>\n <th>latitude</th>\n <td>40.91</td>\n </tr>\n </tbody>\n</table>\n</div>",
"text/plain": " 0\nintersection_id 1\nmonth 7\nyear 2012\nmonthly 2012-07-01 00:00:00\nft_lon 1,025,301.15\nft_lat 270,701.93\nbname Bronx\nbronx 1\nbrooklyn 0\nmanhattan 0\nqueens 0\nstatenisland 0\ninters_to_sch_dist 2778\ninters_to_hosp_dist 6439\nlionstreet_id 29364\ntrafdir Two-Way: Traffic flows in both directions\nstreetwidth_min 24.00\nstreetwidth_max 30.00\nstreetwidth_irr NaN\nposted_speed 25.00\nnumber_travel_lanes 1.00\nmi_number_travel_lanes 0\nnumber_park_lanes 0\nmi_number_park_lanes 1\nnumber_total_lanes 1.00\nbike_route_id 250,515.00\nbike_route_to_coli_dist 3,303.43\nbike_route_install_dt NaT\nlanecount NaN\nbike_route_modif_dt NaT\nonoffst \nbike_route_install_mt NaT\nbike_route_modif_mt NaT\ntruck_route_id 5383\ntruck_route_to_coli_dist 1,293.15\nbikelane_type NaN\nroutetype Local\nmonthly_avg_drybulbtemp 78.68\nmonthly_avg_precip 0.14\nmonthly_avg_snowfall 0.00\nmonthly_tot_drybulbtemp 2,439.00\nmonthly_tot_precip 4.21\nmonthly_tot_snowfall 0.00\nleft_turn_id 104\nintersection_to_left_turn_dist 4,050.51\nleft_turn_install_dt 2017-12-09 00:00:00\nleft_turn_install_month 2017-12-01 00:00:00\nleft_turn_treatment Daylighting, Box markings, Pegatracks, Delinea...\nleft_turn_min 0.19\nflag_left_turn_ever 0\nflag_left_turn 0\nstreet_improv_id 173\nintersec_to_street_improv_dist 8,243.45\nstreet_improv_treatment \nstreet_improv_install_dt 2017-12-31 00:00:00\nstreet_improv_install_month 2017-12-01 00:00:00\nstreet_improv_min 395.85\nflag_street_improv_ever 0\nflag_street_improv 0\nLPIS_id 1767\nintersection_to_LPIS_dist 2,251.85\nLPIS_install_date 2016-02-06 00:00:00\nLPIS_install_month 2016-02-01 00:00:00\nLPIS_install_year 2,016.00\nLPIS_min 19.07\nflag_LPIS_ever 0\nflag_LPIS 0\nbike_route_ever 0\nbike_route_tv 0\ntruck_route 0\ncollision_count 0\nlatenight_collision_count 0\nday_collision_count 0\npersonsinjured 0\npersonskilled 0\npedestriansinjured 0\npedestrianskilled 0\ncyclistinjured 0\ncyclistkilled 0\nmotoristinjured 0\nmotoristkilled 
0\nlatenight_personsinjured 0\nlatenight_personskilled 0\nlatenight_pedestriansinjured 0\nlatenight_pedestrianskilled 0\nlatenight_cyclistinjured 0\nlatenight_cyclistkilled 0\nlatenight_motoristinjured 0\nlatenight_motoristkilled 0\nday_personsinjured 0\nday_personskilled 0\nday_pedestriansinjured 0\nday_pedestrianskilled 0\nday_cyclistinjured 0\nday_cyclistkilled 0\nday_motoristinjured 0\nday_motoristkilled 0\nflag_collision 0.00\nlatenight_flag_collision 0.00\nday_flag_collision 0.00\ngroup NaN\ngroup_filled NaN\nCoordinates POINT (-73.85148400285885 40.90959199994354)\nlongitude -73.85\nlatitude 40.91"
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### Exploring Jackson Heights New York\n\nI live in this neighborhood, so let's explore the situation inside [this box](https://www.openstreetmap.org/export#map=16/40.7535/-73.8844) conveniently mapped for export at OpenStreetMap (and they even give the lat/lon bounds of the box which we'll use below)\n\nLet's create a dataframe with just focus the intersections in this neighborhood."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "jhmask = ( (gdf.longitude > -73.8987) & (gdf.longitude < -73.8701) & (gdf.latitude > 40.7470) & (gdf.latitude < 40.7600) )\njh = gdf[jhmask]\njh.shape",
"execution_count": 20,
"outputs": [
{
"data": {
"text/plain": "(20400, 105)"
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
]
},
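{
"metadata": {},
"cell_type": "markdown",
"source": "A possible alternative to the boolean mask, sketched on a toy geodataframe: geopandas offers the coordinate-based indexer `.cx`, which selects geometries that intersect a bounding box. The `pts` frame and its two points are made up for illustration. Note that `.cx` keeps points lying exactly on the box edge, whereas the strict inequalities above drop them."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import geopandas as gpd\nfrom shapely.geometry import Point\n\n# Two made-up points: one inside the Jackson Heights box, one in the Bronx\npts = gpd.GeoDataFrame(\n    {'name': ['inside', 'outside']},\n    geometry=[Point(-73.8844, 40.7535), Point(-73.8515, 40.9096)],\n    crs='EPSG:4326',\n)\n\n# .cx[xmin:xmax, ymin:ymax] slices by longitude (x) and latitude (y)\npts.cx[-73.8987:-73.8701, 40.7470:40.7600]",
"execution_count": null,
"outputs": []
},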
{
"metadata": {},
"cell_type": "markdown",
"source": "It will be useful to also have a data frame that 'collapses' (using Stata language) the data by `intersection_id` to leave us one observation per intersection, for easy mapping etc. This is a groupby in pandas.\n\nNote how in pandas we can first create a dictionary mapping each column to the aggregation we want. In some cases (e.g. `intersection_id`) we just need the first or last observation, in others like `personsinjured` we want a sum. Then with that dictionary we can create the new dataframe in one single groupby and aggregate operation.\n\n"
},
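{
"metadata": {},
"cell_type": "markdown",
"source": "To make the pattern concrete, here is a minimal sketch on a made-up panel (the `panel_toy` frame and `toy_agg` dictionary are hypothetical, not part of Jeremy's data): two intersections observed twice each, collapsed to one row per intersection."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import pandas as pd\n\n# Hypothetical mini-panel: two intersections, two months each\npanel_toy = pd.DataFrame({\n    'intersection_id': [1, 1, 2, 2],\n    'flag_LPIS_ever': [0, 0, 1, 1],\n    'personsinjured': [1, 2, 0, 3],\n})\n\n# Map each column to its aggregation: keep the last flag, sum the injuries\ntoy_agg = {'flag_LPIS_ever': 'last', 'personsinjured': 'sum'}\n\n# One row per intersection_id\npanel_toy.groupby('intersection_id').agg(toy_agg)",
"execution_count": null,
"outputs": []
},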
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "lastcols = ['intersection_id', 'flag_LPIS_ever', 'latitude','longitude'] \nsumcols = ['collision_count', 'personsinjured', 'personskilled', 'pedestriansinjured'] \nallcols = lastcols + sumcols\nmethods = len(lastcols)*['last'] + len(allcols)*['sum']\naggdict = dict(zip(allcols, methods))\naggdict",
"execution_count": 28,
"outputs": [
{
"data": {
"text/plain": "{'intersection_id': 'last',\n 'flag_LPIS_ever': 'last',\n 'latitude': 'last',\n 'longitude': 'last',\n 'collision_count': 'sum',\n 'personsinjured': 'sum',\n 'personskilled': 'sum',\n 'pedestriansinjured': 'sum'}"
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "jh0 = jh.groupby('intersection_id')[allcols].agg(aggdict)\njh0.shape",
"execution_count": 32,
"outputs": [
{
"data": {
"text/plain": "(272, 8)"
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "jh0.head()",
"execution_count": 35,
"outputs": [
{
"data": {
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>intersection_id</th>\n <th>flag_LPIS_ever</th>\n <th>latitude</th>\n <th>longitude</th>\n <th>collision_count</th>\n <th>personsinjured</th>\n <th>personskilled</th>\n <th>pedestriansinjured</th>\n </tr>\n <tr>\n <th>intersection_id</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>54928</th>\n <td>54928</td>\n <td>0</td>\n <td>40.75</td>\n <td>-73.90</td>\n <td>0.00</td>\n <td>0</td>\n <td>0</td>\n <td>0</td>\n </tr>\n <tr>\n <th>54929</th>\n <td>54929</td>\n <td>1</td>\n <td>40.75</td>\n <td>-73.90</td>\n <td>156.00</td>\n <td>49</td>\n <td>0</td>\n <td>3</td>\n </tr>\n <tr>\n <th>54930</th>\n <td>54930</td>\n <td>0</td>\n <td>40.75</td>\n <td>-73.90</td>\n <td>67.00</td>\n <td>18</td>\n <td>1</td>\n <td>4</td>\n </tr>\n <tr>\n <th>54931</th>\n <td>54931</td>\n <td>0</td>\n <td>40.75</td>\n <td>-73.90</td>\n <td>44.00</td>\n <td>14</td>\n <td>0</td>\n <td>1</td>\n </tr>\n <tr>\n <th>54932</th>\n <td>54932</td>\n <td>0</td>\n <td>40.75</td>\n <td>-73.89</td>\n <td>34.00</td>\n <td>14</td>\n <td>0</td>\n <td>3</td>\n </tr>\n </tbody>\n</table>\n</div>",
"text/plain": " intersection_id flag_LPIS_ever latitude longitude \\\nintersection_id \n54928 54928 0 40.75 -73.90 \n54929 54929 1 40.75 -73.90 \n54930 54930 0 40.75 -73.90 \n54931 54931 0 40.75 -73.90 \n54932 54932 0 40.75 -73.89 \n\n collision_count personsinjured personskilled \\\nintersection_id \n54928 0.00 0 0 \n54929 156.00 49 0 \n54930 67.00 18 1 \n54931 44.00 14 0 \n54932 34.00 14 0 \n\n pedestriansinjured \nintersection_id \n54928 0 \n54929 3 \n54930 4 \n54931 1 \n54932 3 "
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "So we have a mix of LPIS and non-LPIS intersections. Clearly the LPIS intersections are ones that were/are more prone to accidents."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "(jh0.groupby('flag_LPIS_ever')[['intersection_id','personsinjured', 'personskilled']]\n .agg({'intersection_id': 'count', 'personsinjured' : 'mean', 'personskilled' : 'mean'}) )",
"execution_count": 50,
"outputs": [
{
"data": {
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>intersection_id</th>\n <th>personsinjured</th>\n <th>personskilled</th>\n </tr>\n <tr>\n <th>flag_LPIS_ever</th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>189</td>\n <td>5.71</td>\n <td>0.04</td>\n </tr>\n <tr>\n <th>1</th>\n <td>83</td>\n <td>11.05</td>\n <td>0.06</td>\n </tr>\n </tbody>\n</table>\n</div>",
"text/plain": " intersection_id personsinjured personskilled\nflag_LPIS_ever \n0 189 5.71 0.04\n1 83 11.05 0.06"
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's look at the distribution of injuries and the intersections most prone to creating injuries (larger circles on the map)."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "plt.title('Total Persons injured over entire period')\njh0.personsinjured.hist(bins=[1,2,3,4,5,10, 15, 20, 30, 50]);",
"execution_count": 46,
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": "<Figure size 432x288 with 1 Axes>"
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "m = folium.Map(location=[40.75, -73.88], zoom_start =15, tiles='OpenStreetMap')\n\nfor idx, row in jh0.iterrows():\n if (row.flag_LPIS_ever == 1):\n folium.Circle(location=[row.latitude, row.longitude],\n popup=f'LPIS {row.intersection_id}',\n radius = 10 + row.personsinjured*3\n ).add_to(m)\n else:\n folium.Circle(location=[row.latitude, row.longitude],\n popup=f'{row.intersection_id}',\n radius = 10 + row.personsinjured*3,\n color = 'grey',\n ).add_to(m)\nm",
"execution_count": 95,
"outputs": [
{
"data": {
"text/html": "<div style=\"width:100%;\"><div style=\"position:relative;width:100%;height:0;padding-bottom:60%;\"><iframe src=\"data:text/html;charset=utf-8;base64,\" style=\"position:absolute;width:100%;height:100%;left:0;top:0;border:none !important;\" allowfullscreen webkitallowfullscreen mozallowfullscreen></iframe></div></div>",
"text/plain": "<folium.folium.Map at 0x21831788fd0>"
},
"execution_count": 95,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Heatmap\nSimilar to above but just the LPIS locations on top of a heatmap."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "mh = folium.Map(location=[40.75, -73.88], zoom_start =15, tiles='OpenStreetMap')\n\nHeatMap(list(zip(jh0.latitude, jh0.longitude, jh0.personsinjured)), radius = 12).add_to(mh) \n\nfor idx, row in jh0.iterrows():\n if (row.flag_LPIS_ever == 1):\n folium.Circle(location=[row.latitude, row.longitude],\n popup=f'LPIS {row.intersection_id}',\n radius = 10 + row.personsinjured*3\n ).add_to(mh)\n\nmh",
"execution_count": 113,
"outputs": [
{
"data": {
"text/html": "<div style=\"width:100%;\"><div style=\"position:relative;width:100%;height:0;padding-bottom:60%;\"><iframe src=\"data:text/html;charset=utf-8;base64,\" style=\"position:absolute;width:100%;height:100%;left:0;top:0;border:none !important;\" allowfullscreen webkitallowfullscreen mozallowfullscreen></iframe></div></div>",
"text/plain": "<folium.folium.Map at 0x21836f63b38>"
},
"execution_count": 113,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "### A slippy map of all LPIS intersections\n\nWe use the Folium MarkerCluster option so that the markers 'cluster' when we zoom out rather than overwhelm the map."
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Collapse the entire panel to a dataframe with just LPIS intersections. We can use the same columns and dictionary as used above."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "gdf0 = gdf.groupby('intersection_id')[allcols].agg(aggdict)",
"execution_count": 52,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "A mask to just select the LPIS intersections"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "LPIS = (gdf0.flag_LPIS_ever == 1)",
"execution_count": 53,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "gdf0[LPIS].shape",
"execution_count": 54,
"outputs": [
{
"data": {
"text/plain": "(2126, 8)"
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Markers for the LPIS intersections"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "locations = list(zip(gdf0[LPIS].latitude, gdf0[LPIS].longitude,))\nlocations[:4]",
"execution_count": 55,
"outputs": [
{
"data": {
"text/plain": "[(40.7543329998393, -73.82334300046672),\n (40.81636099985319, -73.9540890000869),\n (40.73378299979906, -73.8148000031035),\n (40.74347199986652, -73.97675800003843)]"
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
]
},
{
"metadata": {
"scrolled": false,
"trusted": true
},
"cell_type": "code",
"source": "map2 = folium.Map(location=[40.7128, -74.006], zoom_start =13, tiles='OpenStreetMap')\nmarker_cluster = MarkerCluster().add_to(map2)",
"execution_count": 51,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "for idx, row in gdf0[LPIS].iterrows():\n folium.Marker(location=[row.latitude, row.longitude],\n popup=f'ID={row.intersection_id}, inj ={row.personsinjured}'\n ).add_to(marker_cluster)\n ",
"execution_count": 58,
"outputs": []
},
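{
"metadata": {},
"cell_type": "markdown",
"source": "For many markers, one option worth considering is `FastMarkerCluster` from `folium.plugins`, which takes the list of `(lat, lon)` pairs directly instead of adding markers one by one in a Python loop. This is a sketch, not what the notebook ran; the trade-off is that per-marker popups like those above need a JavaScript callback. The `sample_locations` list just reuses the four coordinates printed earlier, and `map3` is a hypothetical name."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import folium\nfrom folium.plugins import FastMarkerCluster\n\n# Four LPIS coordinates (lat, lon) copied from the locations[:4] output above\nsample_locations = [\n    (40.7543329998393, -73.82334300046672),\n    (40.81636099985319, -73.9540890000869),\n    (40.73378299979906, -73.8148000031035),\n    (40.74347199986652, -73.97675800003843),\n]\n\nmap3 = folium.Map(location=[40.7128, -74.006], zoom_start=11, tiles='OpenStreetMap')\n\n# Clustering is done in the browser from the raw coordinate list\nFastMarkerCluster(data=sample_locations).add_to(map3)\nmap3",
"execution_count": null,
"outputs": []
},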
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "map2.save('lpis.html')",
"execution_count": 59,
"outputs": []
},
{
"metadata": {
"scrolled": false,
"trusted": true
},
"cell_type": "code",
"source": "map2",
"execution_count": 62,
"outputs": [
{
"data": {