TimeSliderChoropleth#

In this example we’ll make a choropleth with a timeslider.

The class needs at least two arguments to be instantiated.

  1. A string-serialized GeoJSON containing all the features (i.e., the areas)

  2. A dictionary with the following structure:

styledict = {
    '0': {
        '2017-1-1': {'color': 'ffffff', 'opacity': 1},
        '2017-1-2': {'color': 'fffff0', 'opacity': 1},
        ...
        },
    ...,
    'n': {
        '2017-1-1': {'color': 'ffffff', 'opacity': 1},
        '2017-1-2': {'color': 'fffff0', 'opacity': 1},
        ...
        }
}

In the above dictionary, the top-level keys are the feature ids and the keys one level down are the timestamps.

Using both color and opacity lets us visualize two variables on the choropleth at the same time. I typically use color to visualize the main variable (e.g., average height) and opacity to visualize how many measurements were in that group.
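As a minimal sketch of this structure, the following builds a small style dictionary by hand and checks its nesting. The helper `check_styledict` and the sample values are illustrative assumptions and not part of folium; the check that ids are strings simply mirrors the example above.

```python
# Illustrative only: a hand-built style dictionary for two features.
# The two features do not need to share the same timestamps.
example_styledict = {
    "0": {
        "2017-1-1": {"color": "ffffff", "opacity": 1},
        "2017-1-2": {"color": "fffff0", "opacity": 0.5},
    },
    "1": {
        "2017-1-1": {"color": "ffffff", "opacity": 0.2},
    },
}


def check_styledict(styledict):
    """Hypothetical helper: return a list of problems found in the nesting."""
    problems = []
    for feature_id, series in styledict.items():
        if not isinstance(feature_id, str):
            problems.append(f"feature id {feature_id!r} is not a string")
        for timestamp, style in series.items():
            missing = {"color", "opacity"} - set(style)
            if missing:
                problems.append(f"{feature_id}/{timestamp} lacks {sorted(missing)}")
            elif not 0 <= style["opacity"] <= 1:
                problems.append(f"{feature_id}/{timestamp} opacity outside [0, 1]")
    return problems


problems = check_styledict(example_styledict)  # empty list when the shape is fine
```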

Loading the features#

We use geopandas to load a dataset containing the boundaries of all the countries in the world.

[1]:
import geopandas as gpd

assert "naturalearth_lowres" in gpd.datasets.available
datapath = gpd.datasets.get_path("naturalearth_lowres")
gdf = gpd.read_file(datapath)
/tmp/ipykernel_5929/2268878315.py:4: FutureWarning: The geopandas.dataset module is deprecated and will be removed in GeoPandas 1.0. You can get the original 'naturalearth_lowres' data from https://www.naturalearthdata.com/downloads/110m-cultural-vectors/.
  datapath = gpd.datasets.get_path("naturalearth_lowres")
[2]:
%matplotlib inline

ax = gdf.plot(figsize=(10, 10))
[Figure: plot of the country boundaries in gdf]

The GeoDataFrame contains the boundary coordinates, as well as some other data such as estimated population.

[3]:
gdf.head()
[3]:
       pop_est      continent                      name  iso_a3  gdp_md_est  geometry
0     889953.0        Oceania                      Fiji     FJI        5496  MULTIPOLYGON (((180.00000 -16.06713, 180.00000...
1   58005463.0         Africa                  Tanzania     TZA       63177  POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...
2     603253.0         Africa                 W. Sahara     ESH         907  POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
3   37589262.0  North America                    Canada     CAN     1736425  MULTIPOLYGON (((-122.84000 49.00000, -122.9742...
4  328239523.0  North America  United States of America     USA    21433226  MULTIPOLYGON (((-122.84000 49.00000, -120.0000...

Creating the style dictionary#

Now we generate time series data for each country.

Data for different areas might be sampled at different times, and TimeSliderChoropleth can deal with that. This means that there is no need to resample the data, as long as the number of datapoints isn’t too large for the browser to deal with.

To simulate data that is sampled at different times, we generate n_periods rows of random data for each country and then pick n_sample of those rows without replacement.
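The idea of picking rows without replacement can be sketched with the standard library alone. This is only an illustration of the sampling step, using row positions instead of a DataFrame; the seed and the variable names `kept` and `dropped` are assumptions made for the sketch.

```python
import random

n_periods, n_sample = 48, 40

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Positions of the rows that are kept; random.sample never repeats an element.
kept = sorted(rng.sample(range(n_periods), n_sample))

# The positions that were not picked are the gaps in this feature's time series.
dropped = sorted(set(range(n_periods)) - set(kept))
```

Each country gets its own draw, so the gaps fall at different times for different features.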

[4]:
import pandas as pd

n_periods, n_sample = 48, 40

assert n_sample < n_periods

datetime_index = pd.date_range("2016-1-1", periods=n_periods, freq="M")
dt_index_epochs = datetime_index.astype("int64") // 10 ** 9
dt_index = dt_index_epochs.astype("U10")

dt_index
/tmp/ipykernel_5929/3348968991.py:7: FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.
  datetime_index = pd.date_range("2016-1-1", periods=n_periods, freq="M")
[4]:
Index(['1454198400', '1456704000', '1459382400', '1461974400', '1464652800',
       '1467244800', '1469923200', '1472601600', '1475193600', '1477872000',
       '1480464000', '1483142400', '1485820800', '1488240000', '1490918400',
       '1493510400', '1496188800', '1498780800', '1501459200', '1504137600',
       '1506729600', '1509408000', '1512000000', '1514678400', '1517356800',
       '1519776000', '1522454400', '1525046400', '1527724800', '1530316800',
       '1532995200', '1535673600', '1538265600', '1540944000', '1543536000',
       '1546214400', '1548892800', '1551312000', '1553990400', '1556582400',
       '1559260800', '1561852800', '1564531200', '1567209600', '1569801600',
       '1572480000', '1575072000', '1577750400'],
      dtype='object')
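The index values above are Unix epoch seconds stored as strings. As a quick sanity check, the sketch below converts the first and last value back to calendar dates with the standard library; the helper name `to_date` is an assumption made for illustration.

```python
from datetime import datetime, timezone

first, last = "1454198400", "1577750400"


def to_date(epoch_string):
    """Convert a string of epoch seconds to an ISO date in UTC."""
    return datetime.fromtimestamp(int(epoch_string), tz=timezone.utc).date().isoformat()


first_date = to_date(first)  # month end of January 2016
last_date = to_date(last)    # month end of December 2019
```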
[5]:
import numpy as np

styledata = {}

for country in gdf.index:
    df = pd.DataFrame(
        {
            "color": np.random.normal(size=n_periods),
            "opacity": np.random.normal(size=n_periods),
        },
        index=dt_index,
    )
    df = df.cumsum()
    # Keep n_sample rows, chosen without replacement, in chronological order.
    df = df.sample(n_sample, replace=False).sort_index()
    styledata[country] = df

Note that the geodata and the randomly sampled data are linked through the feature id, which is the index of the GeoDataFrame.

[6]:
gdf.loc[0]
[6]:
pop_est                                                889953.0
continent                                               Oceania
name                                                       Fiji
iso_a3                                                      FJI
gdp_md_est                                                 5496
geometry      MULTIPOLYGON (((180 -16.067132663642447, 180 -...
Name: 0, dtype: object
[7]:
styledata.get(0).head()
[7]:
               color   opacity
1454198400 -0.073686 -1.167636
1456704000  2.278798 -0.898026
1459382400  1.161227 -0.822858
1461974400  2.563229 -0.073625
1464652800  3.470993 -0.330065

We see that we generated two series of data for each country; one for color and one for opacity. Let’s plot them to see what they look like.

[8]:
ax = df.plot()
[Figure: line plot of the color and opacity series]

Looks random alright. We want to map the column named color to a hex color. To do this we use a linear colormap from branca. To create the colormap, we calculate the maximum and minimum values over all the time series. We also compute the max/min of the opacity column, so that we can map that column into the range [0, 1].
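To make the two mappings concrete, here is a minimal standard-library sketch of min-max scaling and of a linear blend between two colors. It is not branca's implementation: the function names `normalize` and `lerp_hex` and the endpoint colors are assumptions chosen for illustration.

```python
def normalize(values):
    """Min-max scale a list of numbers into [0, 1] (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def lerp_hex(t, start=(255, 255, 255), end=(128, 0, 64)):
    """Blend two RGB colors linearly; t=0 gives start, t=1 gives end."""
    channels = [round(s + (e - s) * t) for s, e in zip(start, end)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


series = [-1.0, 0.0, 3.0]
scaled = normalize(series)              # [0.0, 0.25, 1.0]
colors = [lerp_hex(t) for t in scaled]  # ['#ffffff', '#dfbfcf', '#800040']
```

A real colormap usually interpolates across more than two colors, but the principle of scaling a value into a fixed range and then looking up a color is the same.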

[9]:
max_color, min_color, max_opacity, min_opacity = 0, 0, 0, 0

# Track the global extremes across every country's series.
for country, data in styledata.items():
    max_color = max(max_color, data["color"].max())
    min_color = min(min_color, data["color"].min())
    max_opacity = max(max_opacity, data["opacity"].max())
    min_opacity = min(min_opacity, data["opacity"].min())

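Define and apply the colormaps. Note that norm below rescales each country's opacity series using that series' own minimum and maximum: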

[10]:
from branca.colormap import linear

cmap = linear.PuRd_09.scale(min_color, max_color)


def norm(x):
    return (x - x.min()) / (x.max() - x.min())


for country, data in styledata.items():
    data["color"] = data["color"].apply(cmap)
    data["opacity"] = norm(data["opacity"])
[11]:
styledata.get(0).head()
[11]:
                color   opacity
1454198400  #d0acd3ff  0.135618
1456704000  #ca97c9ff  0.160082
1459382400  #cda1ceff  0.166903
1461974400  #c994c7ff  0.234887
1464652800  #ce8ac2ff  0.211618

Finally we use pd.DataFrame.to_dict() to convert each dataframe into a dictionary, and place each of these in a map from country id to data.

[12]:
styledict = {
    str(country): data.to_dict(orient="index") for country, data in styledata.items()
}
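Because the style data is matched to the geometry by feature id, it can be useful to check that the two sets of ids agree before building the map. The sketch below does this with the standard library on a tiny hand-written FeatureCollection. It assumes that every feature carries its id in a top-level `id` member and that the keys of the style dictionary are strings; the sample data and variable names are made up for illustration.

```python
import json

# A tiny hand-written FeatureCollection standing in for the serialized geodata.
geojson_text = json.dumps(
    {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature", "id": "0", "properties": {}, "geometry": None},
            {"type": "Feature", "id": "1", "properties": {}, "geometry": None},
        ],
    }
)

# Style data for features "0" and "2"; the id "2" is deliberately wrong.
example_styledict = {
    "0": {"1454198400": {"color": "#d0acd3ff", "opacity": 0.14}},
    "2": {"1454198400": {"color": "#ca97c9ff", "opacity": 0.16}},
}

feature_ids = {str(f["id"]) for f in json.loads(geojson_text)["features"]}

unstyled = sorted(feature_ids - set(example_styledict))  # features without style data
orphaned = sorted(set(example_styledict) - feature_ids)  # style data without a feature
```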

Creating the map#

[13]:
import folium
from folium.plugins import TimeSliderChoropleth


m = folium.Map([0, 0], zoom_start=2)

TimeSliderChoropleth(
    gdf.to_json(),
    styledict=styledict,
).add_to(m)

m

Initial timestamp#

By default the time slider starts at the first timestamp. You can select another starting point with the init_timestamp parameter. Note that it expects an index into the list of timestamps, not a timestamp value. In this example we use -1 to select the last timestamp.
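The sketch below illustrates how such an index can be read, using plain Python list indexing. It assumes that the list of timestamps is the union of all timestamp keys in the style dictionary, sorted numerically; that ordering and the helper `resolve_init_timestamp` are assumptions made for illustration, not a description of folium's internals.

```python
def resolve_init_timestamp(timestamps, init_timestamp):
    """Hypothetical helper: map a possibly negative index to a timestamp."""
    if not -len(timestamps) <= init_timestamp < len(timestamps):
        raise IndexError("init_timestamp is out of range")
    return timestamps[init_timestamp]


example_styledict = {
    "0": {"1454198400": {}, "1459382400": {}},
    "1": {"1456704000": {}},
}

# Assumed ordering: all timestamp keys across features, sorted as integers.
timestamps = sorted(
    {ts for series in example_styledict.values() for ts in series}, key=int
)

start_default = resolve_init_timestamp(timestamps, 0)  # earliest timestamp
start_last = resolve_init_timestamp(timestamps, -1)    # latest timestamp
```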

[14]:
m = folium.Map([0, 0], zoom_start=2)

TimeSliderChoropleth(
    gdf.to_json(),
    styledict=styledict,
    init_timestamp=-1,
).add_to(m)

m