2024-09-28 04:10:24 +00:00
---
theme: custom-theme
---
# Chart Design
## CAPP 30239
---
## Today
- What general **principles of visual design** are relevant to our work?
- What are the **common types of charts** and how do we use them?
- When and how do we break the rules?
---
## Edward Tufte
### The Visual Display of Quantitative Information
data:image/s3,"s3://crabby-images/f3c03/f3c03c0582dfce404b8d5c5f4ee9238b8b5b5860" alt=""
---
## Key Ideas
- Graphical Integrity: Above all else, show the data.
- Maximize the data-ink ratio.
- Minimize chart junk.
- Aim for high chart density; consider *small multiples*.
- Revision & Editing are essential.
---
## Tufte's Principles for **Graphical Integrity**
---
1. The representation of numbers, as physically measured on the surface of the graphic itself, should be directly **proportional** to the numerical quantities represented.
data:image/s3,"s3://crabby-images/d4fd3/d4fd39065157e4bdf9fd93601a52c64d2265f438" alt=""
Mileage increase: 53%
Graph length increase: 783%
"Lie Factor": 14.8x
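
The "Lie Factor" above is the size of the effect shown in the graphic divided by the size of the effect in the data. A minimal sketch using the two percentages from this slide (the function name is illustrative, not from Tufte):

```python
def lie_factor(graphic_change_pct: float, data_change_pct: float) -> float:
    """Effect shown in the graphic divided by the effect present in the data."""
    return graphic_change_pct / data_change_pct


# Percentages taken from the mileage example on this slide.
factor = lie_factor(graphic_change_pct=783, data_change_pct=53)
print(f"Lie factor: {factor:.1f}x")  # Lie factor: 14.8x
```

A value near 1 means the graphic is proportional to the data.
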
---
2. Clear, detailed and thorough **labeling** should be used to defeat graphical distortion and ambiguity.
data:image/s3,"s3://crabby-images/9914d/9914dfa936daaec2345ec17da3aedb36b4eac01c" alt="bg left "
How many children get a spinal injury every year? (out of 74,000,000 children in US)
Note: there are only 22,000 total spinal cord injuries a year in America (and most are 16-30yo).
<!-- .0000003% -->
---
3. Write out explanations of the data on the graphic itself. **Label important events** in the data.
data:image/s3,"s3://crabby-images/73658/7365821a31b1b216cf19b9fdc6cd7c75c333b67f" alt="bg right width:600px "
---
4. Show **data variation, not design variation**.
Deflated & standardized units of money are almost always superior to nominal units.
The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data (roughly a 1:1 channel mapping).
Exception: It is OK (and common) to pair color with shape, or in print color with texture, to address the issues that color presents.
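
To illustrate why deflated units matter, here is a rough sketch that converts nominal amounts to base-year dollars with a price index. The budget figures, index values, and function name are all hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def to_real_dollars(nominal: float, cpi_then: float, cpi_base: float) -> float:
    """Express a nominal amount in base-year dollars using a price index."""
    return nominal * cpi_base / cpi_then


# Hypothetical budget figures and index values, for illustration only.
budgets = {1990: 100.0, 2020: 180.0}
cpi = {1990: 50.0, 2020: 100.0}

real = {
    year: to_real_dollars(amount, cpi[year], cpi[2020])
    for year, amount in budgets.items()
}
print(real)  # {1990: 200.0, 2020: 180.0}
```

In this made-up series the nominal budget rises by 80% while the deflated budget falls by 10%, so the two charts would tell opposite stories.
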
---
## Axes/Scale Mistakes
- Think carefully before starting an axis anywhere other than zero.
- Beware dual axes.
- Consider audience when using log scale.
- Related: No pie charts that don't add up to 100%
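
One way to see the zero-baseline issue is to compare how long two bars are drawn once the axis is truncated. This is a small sketch with hypothetical values; `drawn_ratio` is an illustrative helper, not a library function.

```python
def drawn_ratio(small: float, large: float, baseline: float = 0.0) -> float:
    """Ratio of the two bar lengths as drawn when the axis starts at `baseline`."""
    return (large - baseline) / (small - baseline)


# Hypothetical values: the data differ by 10%.
small, large = 50.0, 55.0
print(drawn_ratio(small, large))               # 1.1
print(drawn_ratio(small, large, baseline=45))  # 2.0
```

With the axis starting at 45, a 10% difference in the data is drawn as a bar twice as long.
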
---
data:image/s3,"s3://crabby-images/3b490/3b490b64b7d26e5344c249869b687eb94c7d7af2" alt="bg fit "
---
data:image/s3,"s3://crabby-images/f8a20/f8a206366f30c27028bc69fd38ac93d970d3ce99" alt="bg fit "
---
data:image/s3,"s3://crabby-images/f3dfc/f3dfc40394c5cb0c27700fd28c317f197019507c" alt="bg vertical fit "
---
data:image/s3,"s3://crabby-images/c52aa/c52aab7211b0163eb5c24a14cb1ad1e8a0f64212" alt="bg fit "
---
## Data-Ink Ratio
- **Data-ink**: Ink (pixels) used to show data.
- Data-ink ratio: data-ink / total-ink
data:image/s3,"s3://crabby-images/c7c53/c7c534b72ccf49ce3a54ae9ccff2039737855e81" alt=""
---
data:image/s3,"s3://crabby-images/a19a6/a19a6a3f19a24b4f5472960d8a3c614a624e8cbe" alt=""
data:image/s3,"s3://crabby-images/3a102/3a102b3504174eb181865e7c22a1ea79ccd553ef" alt="bg right width:600px "
---
## Optimizing Data Density
Number of entries in DataFrame / Area of Graphic.
A classic example of high data density is the sparkline, which can fit on a line of text.
data:image/s3,"s3://crabby-images/5c429/5c4295fa0cd1246e7cc7eda80fea0bffb5c684fa" alt=""
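
As a rough illustration of how little space a sparkline needs, the sketch below renders a series as one row of Unicode block characters. The data are hypothetical, and this text rendering is only a stand-in for a proper graphical sparkline.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight Unicode block characters, shortest to tallest


def sparkline(values):
    """Map each value to a block character scaled between the series min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BLOCKS[0] * len(values)  # flat series: lowest block throughout
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


# Hypothetical monthly values.
print(sparkline([3, 5, 4, 8, 12, 9, 6, 7, 11, 15, 13, 10]))
```
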
---
data:image/s3,"s3://crabby-images/4f3ad/4f3adfd442da4dac89e732b389b2faeb7bc06eaf" alt="bg left height:700px "
## Chart Junk
Anything that isn't relevant to understanding the data.
---
data:image/s3,"s3://crabby-images/48dc9/48dc9bba29ddb673e3f45fc2e7b2ed17f293233d" alt=""
via junkcharts.typepad.com
---
## Common Chart Types
---
## How to Pick?
- Quantitative / Quantitative:
- Quantitative / Temporal:
- Quantitative / Nominal:
- Nominal / Nominal:
---
### Bar Charts & Histograms
- X/Y: Nominal (Binned Numerical - Histogram)
- Y/X: Quantitative
- Area must be relevant on bar charts: no log scales/cut axes!
data:image/s3,"s3://crabby-images/4df8c/4df8c53ead55c647ce73a48d35d43183d508492b" alt=""
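
A histogram is a bar chart over binned numerical values. The binning step can be sketched in plain Python as follows; the ages are hypothetical and `histogram_counts` is an illustrative helper. Charting libraries usually do this step for you.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many values fall in each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical ages, binned by decade.
ages = [23, 27, 31, 35, 38, 41, 44, 52, 58, 59]
print(histogram_counts(ages, bin_width=10))  # {20: 2, 30: 3, 40: 2, 50: 3}
```
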
---
### Line & Area Charts
- X: Temporal / Quantitative
- Y: Quantitative (means / sums)
data:image/s3,"s3://crabby-images/630c2/630c22499e491c06b436ab83dd3846a1de43d2a1" alt="bg right width:600px "
---
### When to use stacked area charts?
data:image/s3,"s3://crabby-images/ce1b9/ce1b990a45ae72bb5a4595562bc07f2d50afa200" alt="bg left width:600px "
The sum of the stacked axis variable **must have meaning**.
---
### Heatmap
data:image/s3,"s3://crabby-images/9eedc/9eedc46ae31bbcc4a6257aa1111c2eab299fd503" alt="bg right width:600px "
- X & Y: Quantitative or Nominal
- Color: Quantitative
- `mark_rect`
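
The quantitative color channel of a heatmap is often an aggregate per (X, Y) cell. Here is a minimal sketch of that aggregation step in plain Python, using hypothetical observations; it only prepares the data and does not draw the chart.

```python
from collections import Counter

# Hypothetical observations: (weekday, hour) pairs.
observations = [
    ("Mon", 9), ("Mon", 9), ("Mon", 17),
    ("Tue", 9), ("Tue", 17), ("Tue", 17),
]

# One count per (x, y) cell; the count is what color would encode.
cell_counts = Counter(observations)
for (day, hour), n in sorted(cell_counts.items()):
    print(day, hour, n)
```
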
---
### Strip Plot
data:image/s3,"s3://crabby-images/f1798/f1798ce3870efd0692a86c7a4d3072979ccbc74d" alt="bg left width:600px "
- Y: Nominal
- X: Temporal or Quantitative
- Color: Optional (any type)
- `mark_tick`
---
### Pie / Donut / Radial Charts
data:image/s3,"s3://crabby-images/f4913/f49136743762193b823ea715efc01092b0a21cc6" alt="bg right fit "
Theta: Quantitative (ratio)
Color: Nominal
Direct comparison of segments is very difficult at n > 2.
Only use when the most important information is the ratio between sizes and there are relatively few categories.
**Must add up to 100%**
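
A quick way to respect the 100% rule is to derive the shares from counts of mutually exclusive categories. A small sketch with hypothetical survey data follows; `pie_shares` is an illustrative helper.

```python
def pie_shares(counts):
    """Convert raw category counts to percentage shares of the whole."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    return {category: 100 * n / total for category, n in counts.items()}


# Hypothetical survey where each respondent picks exactly one option.
shares = pie_shares({"bus": 30, "train": 50, "bike": 20})
print(shares)  # {'bus': 30.0, 'train': 50.0, 'bike': 20.0}
```

If respondents could pick several options, per-option percentages would no longer be parts of one whole, and a bar chart is probably the safer choice.
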
---
data:image/s3,"s3://crabby-images/24c79/24c795daa437435a7002084490307050b96f85ac" alt=""
https://www.storytellingwithdata.com/blog/2020/5/14/what-is-a-pie-chart
---
### Bump / Rank Line Chart
data:image/s3,"s3://crabby-images/e4aa5/e4aa532e1d123f80001cc09543c7ffcd84c93916" alt="width:200px left "
data:image/s3,"s3://crabby-images/21ad1/21ad1450f0621c2ba7e785fdbabda4743c03ac28" alt="width:500px left "
Useful for showing changes in relative positioning.
Requires some data manipulation, either with `transform_window` or by pre-computing ranks (see the Altair gallery examples).
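
The pre-computing route can be sketched in plain Python as below. The scores and the helper name are hypothetical. Ties are broken by input order here, which may or may not be what a given chart needs.

```python
def ranks_by_period(scores):
    """For each period, rank entities by score, with rank 1 as the highest score."""
    ranked = {}
    for period, by_entity in scores.items():
        ordered = sorted(by_entity, key=by_entity.get, reverse=True)
        ranked[period] = {
            entity: position for position, entity in enumerate(ordered, start=1)
        }
    return ranked


# Hypothetical scores for three teams over two seasons.
scores = {
    2023: {"A": 90, "B": 75, "C": 60},
    2024: {"A": 70, "B": 85, "C": 80},
}
print(ranks_by_period(scores))
# {2023: {'A': 1, 'B': 2, 'C': 3}, 2024: {'B': 1, 'C': 2, 'A': 3}}
```

The rank, rather than the raw score, is then what goes on the Y axis.
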
---
### Scatter & Bubble Plots
data:image/s3,"s3://crabby-images/cc825/cc82553a751a3e31b8da2125de0942fa8705f420" alt="bg left width:600px "
- X / Y: Quantitative
Bubble charts use size as a 3rd dimension.
(Note subtle but useful transparency usage as well.)
---
### Small Multiples / Faceting
data:image/s3,"s3://crabby-images/13e66/13e665989e529451c1405f0ed259179eccc89d1f" alt="facet "
data:image/s3,"s3://crabby-images/db37a/db37ab1d1e42f8c1cfcb7c404efc745d1cab6df7" alt="bg right fit "
<!-- source: https://www.juiceanalytics.com/writing/better-know-visualization-small-multiples -->
Useful when there is a nominal variable being compared across two other dimensions.
---
data:image/s3,"s3://crabby-images/7e5cf/7e5cf8162b9d115a9e5173a1d3ee2b1f94c7e109" alt="bg fit "
<!-- source: https://obamawhitehouse.archives.gov/interactive-budget -->
---
### Map Basics
Two most common:
- point maps
- choropleths
data:image/s3,"s3://crabby-images/27f3c/27f3cfb467b62e4af1179e3e32b010e91d4d2481" alt="bg left width:600px "
*Image: Trees in London, data.london.gov.uk*
<!-- source: https://data.london.gov.uk/dataset/local-authority-maintained-trees#:~:text=The%20data%20does%20not%20represent,streets%2C%20private%20gardens%20and%20more. -->
**We will revisit maps later in this course.**
---
## Two choropleths, same data.
data:image/s3,"s3://crabby-images/48190/48190e96139d9ff06c12ed4b1491ebf5a82eaf83" alt="bg right vertical width:600px "
data:image/s3,"s3://crabby-images/00814/0081443c0158e124ea73570f6373b1c65ab1fb73" alt="bg right width:600px "
<!-- source: https://carto.maps.arcgis.com/apps/webappviewer/index.html?id=7475c5788efe4c75a9642f552f61d568 -->
Color scale & unit of measurement are incredibly important.
Consider alternatives if district/population sizes vary significantly.
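
To illustrate why the unit of measurement matters, the sketch below compares raw counts with a per-capita rate for two districts. The district names, counts, and populations are hypothetical.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Events per 100,000 residents."""
    return count * 100_000 / population


# Hypothetical districts: (event count, population).
districts = {"North": (500, 1_000_000), "South": (100, 50_000)}

rates = {
    name: rate_per_100k(count, population)
    for name, (count, population) in districts.items()
}
print(rates)  # {'North': 50.0, 'South': 200.0}
```

Colored by raw count, North looks worst. Colored by rate, South does. Same data, two different maps.
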
---
## When & How to Break the Rules
**When in doubt...**
9 out of 10 visualizations should be some variation of the common types.
This does not need to hamper creativity: in the right context a little flourish can add a lot. But ensure that it does not obfuscate the data.
Focus on Tufte's principles & ask for feedback!
---
### Case Study: Two Innovations
Two visualization types that have had their moment in the past 10-15 years:
- Hex/Grid Maps
- Word Clouds
---
## Grid Map
data:image/s3,"s3://crabby-images/b6e50/b6e504297ba96bb25fea22bf79b0e97e331d164a" alt=""
Introduced in <https://blog.apps.npr.org/2015/05/11/hex-tile-maps.html>
<!-- discuss: is this a good thing? -->
---
## Word Cloud
data:image/s3,"s3://crabby-images/88f28/88f281771b599f9fd08c23a28faf145edfbd1b14" alt=""
---
data:image/s3,"s3://crabby-images/ca597/ca597db317e3ef74fe7b09319682678dbb98e8bd" alt="bg left "
data:image/s3,"s3://crabby-images/f238d/f238da873150b89bf5476fb3bd646598ce3899b9" alt=""
Derived from same data as word cloud.
source: NYTimes via https://www.niemanlab.org/2011/10/word-clouds-considered-harmful/
---
## Narrative-supporting graphics
data:image/s3,"s3://crabby-images/a3793/a37931ad9339a5a68285b24f22b07b0bb227d019" alt="bg left width:500px "
by ulaniulani on flickr
---
### When it's OK to use 3D
You have data that relates to a spatial third dimension.
data:image/s3,"s3://crabby-images/44edf/44edff31d5799d500f04f248473fe739389077da" alt="bg vertical right "
data:image/s3,"s3://crabby-images/c3e7a/c3e7a2273e3827a7a6d54931f93588a6cc5fb207" alt="bg right fit "
(Image: Snowfall, NY Times)
(Image: CERN Large Hadron Collider)
---
## Acknowledgements & References
Thanks to Alex Hale, Andrew McNutt, and Jessica Hullman for sharing their materials.
- https://www2.cs.uh.edu/~ceick/NO/COSC3337-DV2.pdf
- Images from Tufte's Visual Display of Quantitative Information
- Images from Altair <https://altair-viz.github.io/gallery/index.html>