Get you excited about storytelling with data
Show some tips and tricks to make your maps and charts pop
Improving your maps
Overcoming Excel
Telling a story with data
Reproducing figures for publication
Improving your maps
Legend breaks
Make a map of the share of employment in industry in the year 2010 across the whole dataset
Discuss with your neighbour:
What do we like?
What is confusing?
* Equal-width classes of 20 percentage points
spmap employment_share_industry using "nutscoord.dta" ///
    if year == 2010, ///
    id(_ID) fcolor(Spectral) legstyle(2) ///
    title("Employment Share Industry - 2010", size(large)) ///
    osize(0.02 ..) ocolor(white ..) ///
    clmethod(custom) clbreaks(0 (0.2) 1) ///
    legend(pos(9) size(medium) rowgap(1.5) ///
        label(6 "80-100 %") label(5 "60-80 %") ///
        label(4 "40-60 %") label(3 "20-40 %") label(2 "0-20 %") ///
        label(1 "No Data")) ///
    ndfcolor(gray) ndocolor(white ..) ndsize(0.02 ..)
* Narrower classes of 7.5 percentage points
spmap employment_share_industry using "nutscoord.dta" ///
    if year == 2010, id(_ID) fcolor(Spectral) legstyle(2) ///
    title("Employment Share Industry - 2010", size(large)) ///
    osize(0.02 ..) ocolor(white ..) ///
    clmethod(custom) clbreaks(0 (0.075) 0.5) ///
    legend(pos(9) size(medium) rowgap(1.5) ///
        label(7 "37-45 %") label(6 "30-37 %") ///
        label(5 "23-30 %") label(4 "15-23 %") ///
        label(3 "8-15 %") label(2 "0-8 %") ///
        label(1 "No Data")) ///
    ndfcolor(gray) ndocolor(white ..) ndsize(0.02 ..)
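Typing legend labels by hand makes it easy for them to drift away from the breaks. As a rough sketch (in Python rather than Stata, and not part of the original slides), the labels can be derived from the same start, step and stop values that go into `clbreaks()`. The helper names `class_breaks` and `percent_labels` are made up for this illustration, and the sketch assumes that the break list stops at the last value not exceeding the upper limit.

```python
from decimal import Decimal, ROUND_HALF_UP


def class_breaks(start, step, stop):
    """Return start, start + step, ... up to the last value <= stop.

    Decimal arithmetic avoids floating-point drift in the break values.
    """
    start, step, stop = Decimal(start), Decimal(step), Decimal(stop)
    breaks = []
    value = start
    while value <= stop:
        breaks.append(value)
        value += step
    return breaks


def percent_labels(breaks):
    """Build one 'lo-hi %' label per class, rounding half up to whole percent."""
    def pct(x):
        return (x * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)

    return [f"{pct(lo)}-{pct(hi)} %" for lo, hi in zip(breaks, breaks[1:])]


wide = class_breaks("0", "0.2", "1")         # five classes of 20 points
narrow = class_breaks("0", "0.075", "0.5")   # six classes of 7.5 points

print(percent_labels(wide))
print(percent_labels(narrow))
```

With half-up rounding the 0.375 break is labelled 38 %. Whichever rounding rule you choose, applying it to every break keeps the legend consistent.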
Uses of color in data visualization
Qualitative palettes (distinguish categories):
Palette name: Okabe-Ito
Palette name: Brewer Set1
Palette name: Brewer Dark2
Sequential palettes (represent ordered values):
Palette name: inferno
Palette name: viridis
Dataset: Solar panels in Sweden
Installed solar capacity in Sweden, 2021
Swedish county | Installed solar capacity (MW) |
---|---|
Västra Götalands län | 266.21 |
Skåne län | 256.25 |
Stockholms län | 182.25 |
Östergötlands län | 106.81 |
Hallands län | 94.31 |
Jönköpings län | 88.53 |
Södermanlands län | 79.71 |
Uppsala län | 79.11 |
Kalmar län | 59.01 |
Västmanlands län | 49.45 |
Source: Energimyndigheten
Use a histogram or a density plot to see where the weight of the distribution is.
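As a minimal sketch of that check, the snippet below bins the ten county values listed in the solar table and prints a text histogram. The bin width of 50 MW is an assumption chosen for illustration, and `histogram_counts` is a hypothetical helper rather than a library function.

```python
# Installed solar capacity (MW) for the ten counties listed in the table.
capacity_mw = [266.21, 256.25, 182.25, 106.81, 94.31,
               88.53, 79.71, 79.11, 59.01, 49.45]


def histogram_counts(values, bin_width, start=0.0):
    """Count values in half-open bins [start + k*width, start + (k+1)*width)."""
    counts = {}
    for v in values:
        k = int((v - start) // bin_width)
        counts[k] = counts.get(k, 0) + 1
    n_bins = max(counts) + 1
    return [counts.get(k, 0) for k in range(n_bins)]


bin_width = 50  # assumed bin width, for illustration only
counts = histogram_counts(capacity_mw, bin_width)

for k, n in enumerate(counts):
    lo, hi = k * bin_width, (k + 1) * bin_width
    print(f"{lo:>3}-{hi:<3} MW | {'#' * n}")
```

Half of the listed counties fall in the 50-100 MW bin, so equal-width colour classes over the full range would put most of them in the same class.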
Ask your neighbour:
what kind of palette is this?
Is it appropriate to use with this data?
Improving your maps
Great Choropleths
The Coming Crisis: Exploring the U.S. Physician Shortage by Daniel Snow
Overcoming Excel
Motivation
Formby et al. (2017), "Microsoft Excel: Is It an Important Job Skill for College Graduates?"
You will likely use Excel in the future 📊
Excel’s default plots and tables can be improved upon 📈
Simple rules can help you make your message clear 💎
Overcoming Excel
Charts
We often encounter datasets containing simple amounts 🤏
Here is some data on a sample of Swedish musical artists 🎵
I put this data into Excel, and asked for a recommended chart 📊
Swedish musical artists
Rank | Artist | Monthly listeners (millions) |
---|---|---|
1 | Avicii | 29.47 |
2 | ABBA | 23.48 |
3 | José González | 4.07 |
4 | Robyn | 3.11 |
5 | Timbuktu | 0.38 |
Data source: Spotify charts, Nov 2022
Discuss with your neighbour:
What do we like?
What is confusing?
Dataset: Solar panels in Sweden
Bar lengths do not accurately represent the data values
Key features of the data are obscured
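One way to keep bar lengths honest is to scale every bar from a zero baseline, so that length is proportional to value. The sketch below does this in plain text using the artist data from the table earlier in this section. `bar_lengths` and the 40-character maximum are assumptions made for this illustration.

```python
# Monthly listeners in millions, from the Swedish musical artists table.
listeners_m = {
    "Avicii": 29.47,
    "ABBA": 23.48,
    "José González": 4.07,
    "Robyn": 3.11,
    "Timbuktu": 0.38,
}


def bar_lengths(values, max_chars=40):
    """Scale bars from a zero baseline so length is proportional to value."""
    largest = max(values.values())
    return {name: round(v / largest * max_chars) for name, v in values.items()}


lengths = bar_lengths(listeners_m)

# Sort by value so the ranking is visible at a glance.
for name, v in sorted(listeners_m.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14} {'#' * lengths[name]} {v:.2f}")
```

Sorting the bars and printing the value next to each one keeps the key features of the data visible rather than obscured.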
Overcoming Excel
Tables
We often encounter datasets containing simple amounts 🤏
Here is some data on a sample of Swedish musical artists 🎵
I put this data into Excel, and asked it to insert a table 🗃️
Swedish musical artists
Rank | Artist | Monthly listeners (millions) |
---|---|---|
1 | Avicii | 29.47 |
2 | ABBA | 23.48 |
3 | José González | 4.07 |
4 | Robyn | 3.11 |
5 | Timbuktu | 0.38 |
Data source: Spotify charts, Nov 2022
Discuss with your neighbour:
What do we like?
What is confusing?
Key rules for table layout
Number | Rule |
---|---|
1 | Do not use vertical lines. |
2 | Do not use heavy horizontal lines between data rows. (Horizontal lines as separator between the title row and the first data row or as frame for the entire table are fine.) |
3 | Text columns should be left aligned. |
4 | Number columns should be right aligned and should use the same number of decimal digits throughout. |
5 | Columns containing single characters are centred. |
6 | The header fields are aligned with their data, i.e., the heading for a text column will be left aligned and the heading for a number column will be right aligned. |
Source: Claus Wilke’s Fundamentals of Data Visualization |
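To make the rules concrete, here is a small sketch that lays out the artist data as plain text while following rules 1 to 4 and 6: no vertical lines, a single rule under the header, text left aligned, numbers right aligned with two decimals, and headers aligned with their columns. The function name `format_table` is hypothetical, and this is one possible layout rather than the only correct one.

```python
rows = [
    (1, "Avicii", 29.47),
    (2, "ABBA", 23.48),
    (3, "José González", 4.07),
    (4, "Robyn", 3.11),
    (5, "Timbuktu", 0.38),
]


def format_table(rows):
    """Return the table as a list of text lines, one per row."""
    header_num = "Monthly listeners (m)"
    name_w = max(len("Artist"), *(len(name) for _, name, _ in rows))
    num_w = len(header_num)

    # Rule 6: headers follow the alignment of their data columns.
    lines = [f"{'Rank':>4}  {'Artist':<{name_w}}  {header_num:>{num_w}}"]
    # Rule 2: a single horizontal rule between header and data is fine.
    lines.append("-" * len(lines[0]))
    for rank, name, listeners in rows:
        # Rule 3: text left aligned. Rule 4: numbers right aligned,
        # with the same number of decimal digits throughout.
        lines.append(f"{rank:>4}  {name:<{name_w}}  {listeners:>{num_w}.2f}")
    return lines


table_lines = format_table(rows)
print("\n".join(table_lines))
```

Because every number is printed with two decimals and right aligned, the decimal points line up and the magnitudes can be compared by eye.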
Storytelling with data
Related time series
Dataset: Fertility and births outside of marriage in Denmark and Greece.
The default choice is to draw two line plots
Hard to keep track of each series
Difficult to compare movements across short periods
Both countries saw a large drop in fertility from the 1960s until the 1980s
In Denmark, after 1970 we see an increase in the share of children born outside of marriage
In contrast, Greek families have relatively few children outside of marriage.
After 1990, Danish fertility increased from 1.3 to 1.8, while Greek fertility remained at ‘lowest-low’ levels, below replacement.
Put the two indicators on the x- and y-axes and show time with text labels
The legend is replaced with a colour-coded title
Colours have meaning (main colour of country flag)
Percentage labels on the y-axis
Storytelling with data
Giving context
Sometimes we may want to show a particular series of data in its correct context.
For instance, the line graph above showed the evolution of the share of births outside of marriage in Denmark and Greece; we might want to know whether these two countries represent the extremes within Europe.
Do Denmark and Greece represent the extremes of the share of children born outside of marriage in Europe?
One way to do this would be to show an average for Europe
This is silly: an average hides the spread between countries and any outliers
Here we highlight the series we are interested in and draw in the remaining series in grey
Shows each of the series
We can see that Denmark leads at the beginning but is later caught up by other nations
Does not hide outliers
Makes clear the trends in your countries of interest
Storytelling with data
Tips for polished figures
Where to get great colours for your plots: