Mapping the Toronto Neighbourhoods with Geopandas and Folium

The full code can be found in the gitub repo: toronto_neighbourhoods

Introduction

A long time ago, this post appeared on my everyday analytics blog in 2016. Since then, the code in R no longer works, as Google Maps now requires an API key, and there have been some significant changes to both ggmap and the maptools / rgdal packages.

As such, I decided to revisit this. As it turns out, performing the same types of tasks now is actually much easier with geopandas. Interactive mapping with Folium, which creates Javascript-based Leaflet maps is a little trickier, and there are some good resources I relied on here to which I’ll also point back.

First, we will import the libraries we need:

import pandas as pd
import matplotlib.pyplot as plt
import geopandas 
import folium

Luckily, I saved the original shapefiles and data in the github repo, as they no longer appear to be available at their original sources. Though I’m sure the same data is out there in many other places, I’ll use what’s saved there to make this a 1-1 reproduction of what was done in R as much as possible.

Static Choropleths using Geopandas

First, we’ll reproduce a simple plot of the neighbourhood boundaries for Toronto. We read the data into geopandas, just a we would into regular pandas if it were a csv:

gdf = geopandas.read_file("NEIGHBORHOODS_WGS84_2.shp")

gdf.head()
AREA_S_CD AREA_NAME geometry
0 097 Yonge-St.Clair (97) POLYGON ((-79.39119 43.68108, -79.39141 43.680...
1 027 York University Heights (27) POLYGON ((-79.50529 43.75987, -79.50488 43.759...
2 038 Lansing-Westgate (38) POLYGON ((-79.43998 43.76156, -79.44004 43.761...
3 031 Yorkdale-Glen Park (31) POLYGON ((-79.43969 43.70561, -79.44011 43.705...
4 016 Stonegate-Queensway (16) POLYGON ((-79.49262 43.64744, -79.49277 43.647...

The code for creating this visual is incredibly straightforward:

# Read the neighborhood shapefile data and plot
gdf.plot(color="white", edgecolor="black")
plt.show()

png

Next, adding the demographic data. This is read into pandas as it is just in a csv:

# Read the demographic data
demo = pd.read_csv("WB-Demographics.csv")
demo.head()
Neighbourhood Neighbourhood Id Total Area Total Population Pop - Males Pop - Females Pop 0 - 4 years Pop 5 - 9 years Pop 10 - 14 years Pop 15 -19 years ... Language - Chinese Language - Italian Language - Korean Language - Persian (Farsi) Language - Portuguese Language - Russian Language - Spanish Language - Tagalog Language - Tamil Language - Urdu
0 West Humber-Clairville 1 30.09 34100 17095 17000 1865 1950 2155 2550 ... 475 925 95 160 205 15 1100 850 715 715
1 Mount Olive-Silverstone-Jamestown 2 4.60 32790 16015 16765 2575 2535 2555 2620 ... 275 750 60 350 115 50 820 345 1420 1075
2 Thistletown-Beaumond Heights 3 3.40 10140 4920 5225 575 580 670 675 ... 95 705 35 115 105 15 570 130 120 300
3 Rexdale-Kipling 4 2.50 10485 5035 5455 495 520 570 665 ... 95 475 30 95 145 30 700 180 70 215
4 Elms-Old Rexdale 5 2.90 9550 4615 4935 670 720 720 725 ... 90 510 55 285 80 30 670 195 60 140

5 rows × 39 columns

Now will merge the two on the neighbourhood id. In the original data this is the field AREA_S_CD, which was stored as a string, so we first convert to integer and then join the two dataframes:

# Convert neighborhood code to int for joining
gdf['AREA_S_CD'] = gdf['AREA_S_CD'].astype(int)

# Merge on Neighborhood ID
gdf2 = pd.merge(gdf, demo, left_on="AREA_S_CD", right_on="Neighbourhood Id")

gdf2.head()
AREA_S_CD AREA_NAME geometry Neighbourhood Neighbourhood Id Total Area Total Population Pop - Males Pop - Females Pop 0 - 4 years ... Language - Chinese Language - Italian Language - Korean Language - Persian (Farsi) Language - Portuguese Language - Russian Language - Spanish Language - Tagalog Language - Tamil Language - Urdu
0 97 Yonge-St.Clair (97) POLYGON ((-79.39119 43.68108, -79.39141 43.680... Yonge-St.Clair 97 1.2 11655 5235 6420 395 ... 250 120 55 120 55 180 230 95 10 10
1 27 York University Heights (27) POLYGON ((-79.50529 43.75987, -79.50488 43.759... York University Heights 27 13.2 27715 13580 14125 1645 ... 1750 2280 265 370 285 430 1475 930 1175 580
2 38 Lansing-Westgate (38) POLYGON ((-79.43998 43.76156, -79.44004 43.761... Lansing-Westgate 38 5.4 14640 6950 7695 845 ... 995 165 580 630 90 770 320 460 15 45
3 31 Yorkdale-Glen Park (31) POLYGON ((-79.43969 43.70561, -79.44011 43.705... Yorkdale-Glen Park 31 5.9 14685 6750 7940 640 ... 640 2950 50 70 810 85 625 605 120 60
4 16 Stonegate-Queensway (16) POLYGON ((-79.49262 43.64744, -79.49277 43.647... Stonegate-Queensway 16 7.9 24690 11935 12745 1395 ... 330 755 135 65 555 635 455 260 25 35

5 rows × 42 columns

To create a static choropleth in geopandas, the call is, again, very straightforward - you only need add the field which is used for colour (here, Total Population).

gdf2.plot('Total Population', cmap='Reds', figsize=(10, 10), legend=True)
plt.axis('off')
plt.title('Toronto Neighborhoods by Population')
plt.show()

png

In my original post, I used RColorBrewer to create a spectral palette. There’s no need to do that here, as the spectral palette is built into matplotlib.

gdf2.plot('Total Population', cmap='Spectral_r', figsize=(10, 10), legend=True)
plt.axis('off')
plt.title('Toronto Neighborhoods by Population')
plt.show()

png

In retrospect, from a visualization perspective, I think the first visual is actually superior; the spectral palette should be avoided in general due to accessibility issues (i.e. for those who are colorblind) and in this particular case is a diverging palette when we are not comparing values around a meaningful central point. A sequential palette such as in the first visual would be the way to go.

Interactive Choropleth with Tooltips using Folium

In my original post, I used ggmap to create an interactive Google Map in R. Since then, as the github maintainer notes, it’s no longer possible to do so without an API key which requires a credit card and setting up as part of GCP.

Instead here, to reproduce the result we’ll be using Folium to create an interactive map in Leaflet.

First so the variable names play nice with Javascript, we’ll remove the spaces in the two we need:

# Rename variable names to remove spaces
demo = demo.rename(columns={'Neighbourhood Id':'neighbourhood_id', 'Total Population':'total_pop'})

demo.head()
Neighbourhood neighbourhood_id Total Area total_pop Pop - Males Pop - Females Pop 0 - 4 years Pop 5 - 9 years Pop 10 - 14 years Pop 15 -19 years ... Language - Chinese Language - Italian Language - Korean Language - Persian (Farsi) Language - Portuguese Language - Russian Language - Spanish Language - Tagalog Language - Tamil Language - Urdu
0 West Humber-Clairville 1 30.09 34100 17095 17000 1865 1950 2155 2550 ... 475 925 95 160 205 15 1100 850 715 715
1 Mount Olive-Silverstone-Jamestown 2 4.60 32790 16015 16765 2575 2535 2555 2620 ... 275 750 60 350 115 50 820 345 1420 1075
2 Thistletown-Beaumond Heights 3 3.40 10140 4920 5225 575 580 670 675 ... 95 705 35 115 105 15 570 130 120 300
3 Rexdale-Kipling 4 2.50 10485 5035 5455 495 520 570 665 ... 95 475 30 95 145 30 700 180 70 215
4 Elms-Old Rexdale 5 2.90 9550 4615 4935 670 720 720 725 ... 90 510 55 285 80 30 670 195 60 140

5 rows × 39 columns

Next we’ll create the map. Creating a choropleth map with no interactivity is actually fairly straightforward. Adding interactivity is a little more involved. There are some good resources on this here:

I found the latter of the two to be simpler and so went with their approach for adding tooltips. First we need to create a new column in the geopandas dataframe with the strings to appear in the tooltip. I’ll do this using apply and a lambda function to join the Neighbourhood and Total Population columns:

# Prepare tooltip - combination of neighbourhood and population columns in a string
gdf2['tooltip'] = gdf2.apply(lambda x: x['Neighbourhood'] + ' ' + str(x['Total Population']), axis=1)

# Check
gdf2['tooltip'].head()
0             Yonge-St.Clair 11655
1    York University Heights 27715
2           Lansing-Westgate 14640
3         Yorkdale-Glen Park 14685
4        Stonegate-Queensway 24690
Name: tooltip, dtype: object

Then we create the map, add the choropleth layer, and finally update the layer by adding the tooltip:

# Base map
tdot_map = folium.Map(location=[43.7, -79.4], zoom_start=11)

# Choropleth
choropleth = folium.Choropleth(geo_data=gdf2, 
                  data=demo, 
                  columns=['neighbourhood_id', 'total_pop'], 
                  key_on="feature.properties.AREA_S_CD",).add_to(tdot_map)

# Add tooltip 
folium.LayerControl().add_to(tdot_map)
choropleth.geojson.add_child(
    folium.features.GeoJsonTooltip(['tooltip'], labels=False)
)

tdot_map

And that’s it! A beautiful interactive web-based map with just a little python.

I’ve added this content back to the original github repo, check out the code if you are interested.

Updated: