14/15 edits

James Turk 2024-12-03 13:36:40 -06:00
parent b95ebeedf4
commit ff05e37b59
5 changed files with 582 additions and 0 deletions

BIN
14.misc/mosaic.jpg Normal file

Binary file not shown.

Size: 281 KiB

BIN
14.misc/plotly-dash.png Normal file

Binary file not shown.

Size: 192 KiB

227
14.misc/slides.html Normal file

File diff suppressed because one or more lines are too long

236
14.misc/slides.md Normal file

@@ -0,0 +1,236 @@
# Data Visualization for Public Policy
---
## This Week
- Project Questions / Deployment
- Code Quality & Style
- Dashboards
- Visual Style Guides
- 100 Visualizations
- Animation & Interaction
---
## Code Quality
When you are writing a data pipeline or application, code quality is of high importance.
- Readability & Documentation => easier maintenance & bugs prevented.
- Small speed-ups from better algorithm/data structure choices can make big differences when that task executes millions of times.
- Test coverage makes refactoring easier, prevents regressions.
---
### Unique Considerations for Data Viz
- Typically little to no ongoing reuse/maintenance.
- Visualization itself unlikely to be performance bottleneck compared to data manipulation.
- Focus is on **immediate visual output**, testing de-emphasized.
- Often written by solo developer, even in larger organizations.
Code quality still matters, but your main goal should be code that you can trust is correct. Testing, documentation, and the "right way" are less essential.
---
## Dashboards
![bg fit left](plotly-dash.png)
Long-lived data visualizations that typically run against a central repository of data.
You can use the same techniques & tools, or custom dashboard-focused tools like Tableau or Dash.
Key difference: You will likely need some degree of dynamic refresh (instead of loading a static CSV/JSON file, load data from a DB/API). This comes with caching and other performance considerations.
---
## Dashboard Pseudocode
```
every interval {
  data = update()
  visualize(data)
}
```
Can make use of animation to provide context:
- scrolling time series
- animated dials to show directional changes
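
The loop above could be sketched in JavaScript roughly as follows. `makeDashboard`, `update`, and `visualize` are hypothetical names, and `update` is assumed to be synchronous for brevity (a real dashboard would more likely await a DB/API call). One design choice in this sketch: if an update fails, the last good data stays on screen.

```js
// Hypothetical sketch: `update` returns fresh data (or throws),
// `visualize` redraws the chart with whatever data it is given.
function makeDashboard(update, visualize) {
  let lastGood = null

  function refresh() {
    try {
      lastGood = update()
    } catch (err) {
      // update failed: fall through and redraw the previous data
    }
    if (lastGood !== null) visualize(lastGood)
    return lastGood
  }

  // Draw once right away, then again every `everyMS` milliseconds.
  // Returns a function that stops the refresh loop.
  function start(everyMS) {
    refresh()
    const intervalId = setInterval(refresh, everyMS)
    return () => clearInterval(intervalId)
  }

  return { refresh, start }
}
```

Usage would look like `const stop = makeDashboard(update, visualize).start(60000)`, with `stop()` called when the page is torn down.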
---
## Are Dashboards Bad?
Dashboards saw a surge in popularity a decade or so ago, and there are now plenty of bad dashboards out there.
Golden rule of dashboards: **answer a question & make them actionable**.
Too often people just throw all their data on a dashboard.
*OK, I can see that 6 errors occurred in the last 24 hours...*
- Is that a lot? **Show trends where appropriate!**
- What can I do? **Provide links/action items!**
Without this focus, dashboards become decorations.
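
A minimal sketch of adding trend context to a single number, assuming a non-empty `history` of counts from comparable earlier periods. `trendContext` and the sample numbers are hypothetical.

```js
// Hypothetical helper: compare the current count with recent history.
function trendContext(current, history) {
  const mean = history.reduce((sum, v) => sum + v, 0) / history.length
  const change = current - mean
  const direction = change > 0 ? "up" : change < 0 ? "down" : "flat"
  return { mean, change, direction }
}

// Made-up data: 6 errors today vs. the four previous days.
const ctx = trendContext(6, [2, 1, 3, 2])
console.log(`6 errors in the last 24 hours (${ctx.direction} from a typical ${ctx.mean})`)
```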
---
## Style Guide
It can be helpful to create or build from a style guide. Even for your own work.
Examples:
- [Sunlight Foundation](https://www.amycesal.com/portfolio#/data-visualization-style-guidelines/)
- [CFPB](https://www.amycesal.com/portfolio#/cfpb-design-manual-data-visualization/)
---
## Key Elements
### Typography
Select 2 complementary fonts:
- Prefer a very legible sans-serif font for data/axes labels.
- Any legible font for chart titles/narrative/etc.
### Color Selection
Best to have:
- Nominal data: Distinct, contrasting hues
- Quantitative data: Linear or divergent gradients
- Consider color-blindness and accessibility
---
## Style Guide: Chart Selection
- Match chart type to **data characteristics** and **audience**.
- Consider:
- Data dimensionality
- Comparison needs
- Narrative goals
---
## Creativity: 1 Dataset 100 Visualizations
<https://100.datavizproject.com>
---
## Applications of Animation
- Demonstrate change over time: Data being added to chart as time "plays."
- Highlight relationships: Hover/highlight/select modifies display of other data on page.
- Focus attention: Show subsets of data at a time.
- Show uncertainty: "wiggle", shifting trend line (next page)
More Examples:
- <https://informationisbeautiful.net>
- <https://www.visualcinnamon.com/portfolio/>
---
![bg fit](https://clauswilke.com/dataviz/visualizing_uncertainty_files/figure-html/mpg-uncertain-HOP-animated-1.gif)
---
## Applications of Interaction
- Enable user-driven exploration of data.
- "How do these two variables compare?"
- "What happens if this price increases?"
- Allow personalization (e.g. enter your zip code)
- "What is this like in my city?"
- Increased engagement/retention. Lots of evidence showing we learn best by participating.
---
## JS setInterval
```js
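// hypothetical callback and interval, defined so the snippet runs as-is
const func = () => console.log("refresh")
const everyMS = 1000
// will call `func` every `everyMS` milliseconds
let intervalId = setInterval(func, everyMS)
// later: stop calling func
clearInterval(intervalId)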
```
<https://developer.mozilla.org/en-US/docs/Web/API/Window/setInterval>
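
One way `setInterval` could drive a "play" animation that adds data to a chart over time. This is a sketch: `makePlayer` is a hypothetical name, and `draw` is an assumed callback that redraws the chart with the points revealed so far.

```js
// Hypothetical sketch: reveal `data` one point at a time.
function makePlayer(data, draw) {
  let shown = 0
  let intervalId = null

  // Reveal one more point; returns false when no points remain.
  function step() {
    if (shown >= data.length) return false
    shown += 1
    draw(data.slice(0, shown))
    return shown < data.length
  }

  function stop() {
    clearInterval(intervalId)
    intervalId = null
  }

  // Call `step` every `everyMS` ms and stop automatically at the end.
  function play(everyMS) {
    intervalId = setInterval(() => {
      if (!step()) stop()
    }, everyMS)
  }

  return { step, play, stop }
}
```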
---
## Interaction: Making Data Selections
For user-driven data explorations, **selection** is an important concept.
How do you want to let a user select individual records or groups of records?
### Selection Spectrum: Simple to Complex
- Menu/Select Box
- Hover/click on items on page (tooltips, etc.)
- Drag/Region selection
- Pre-written SQL queries with dropdowns/selects. (Common on dashboards.)
- Allow user to write queries themselves in SQL or a custom query language. Common on advanced dashboards.
Altair Selection: <https://altair-viz.github.io/user_guide/interactions.html>
D3 Selection: <https://observablehq.com/collection/@d3/d3-selection>
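
Underneath the UI, a drag/region selection usually reduces to a filter over the records. A minimal sketch, assuming each record has numeric `x`/`y` fields and the dragged rectangle is already in data coordinates (`selectInRegion` and the field names are hypothetical):

```js
// Hypothetical sketch: return the records inside the dragged rectangle.
function selectInRegion(points, region) {
  // The user may drag in any direction, so normalize the corners first.
  const xMin = Math.min(region.x0, region.x1)
  const xMax = Math.max(region.x0, region.x1)
  const yMin = Math.min(region.y0, region.y1)
  const yMax = Math.max(region.y0, region.y1)
  return points.filter(
    (p) => p.x >= xMin && p.x <= xMax && p.y >= yMin && p.y <= yMax
  )
}
```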
---
## Discussion: Major Visualization Challenges
- Missing/Incomplete data
- Huge quantities of data
- Complex, high-dimensional data
- Uncertainty
- Challenges of Scale
---
### Missing/Incomplete Data
- Imputation of missing values.
- Label missing data.
- Regardless of choice, be transparent.
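
A sketch of combining imputation with labeling, assuming missing values are stored as `null` and at least one value is observed. Mean imputation is only one possible choice, and `imputeWithMean` is a hypothetical name.

```js
// Hypothetical sketch: fill missing values (null) with the mean of the
// observed values, and flag each filled value so the chart can label it.
function imputeWithMean(values) {
  const observed = values.filter((v) => v !== null)
  const mean = observed.reduce((sum, v) => sum + v, 0) / observed.length
  return values.map((v) =>
    v === null ? { value: mean, imputed: true } : { value: v, imputed: false }
  )
}
```

The `imputed` flag could then drive a distinct mark (hollow point, dashed segment) so readers can tell which values were filled in.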
---
### Big Data
- Aggregation
- Sampling
- Filtering/Interactives
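
As one example of aggregation, millions of raw values can be reduced to a handful of bin counts before anything is drawn. A minimal sketch with a hypothetical `binCounts` helper and fixed-width bins:

```js
// Hypothetical sketch: count how many values fall in each fixed-width bin.
function binCounts(values, binWidth) {
  const counts = new Map()
  for (const v of values) {
    const binStart = Math.floor(v / binWidth) * binWidth
    counts.set(binStart, (counts.get(binStart) || 0) + 1)
  }
  // Return bins sorted by their starting value.
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([start, count]) => ({ start, count }))
}
```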
---
### Lots of Attributes/Dimensions
- Small multiples approach
- Pairwise charts. (XY, YZ, XZ)
- Advanced: Dimensionality Reduction Algorithms (PCA, t-SNE, etc.)
- Interactive exploration
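
For pairwise charts, the list of panels is every unordered pair of attributes. A small sketch (`attributePairs` is a hypothetical helper):

```js
// Hypothetical sketch: every unordered pair of attributes, one per chart.
function attributePairs(attributes) {
  const pairs = []
  for (let i = 0; i < attributes.length; i++) {
    for (let j = i + 1; j < attributes.length; j++) {
      pairs.push([attributes[i], attributes[j]])
    }
  }
  return pairs
}
```

With n attributes this yields n(n-1)/2 charts, so the approach gets crowded quickly as attributes are added.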
---
### Handling Uncertainty
- Frequency Approach
- Confidence intervals & error bars
- Probabilistic visualizations
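
A sketch of computing the numbers behind an error bar: a mean with an approximate 95% confidence interval. It assumes a roughly normal sampling distribution (z = 1.96) and at least two values, and `meanWithCI` is a hypothetical name. Small samples would usually call for a t-based interval instead.

```js
// Hypothetical sketch: mean with an approximate confidence interval.
function meanWithCI(values, z = 1.96) {
  const n = values.length
  const mean = values.reduce((sum, v) => sum + v, 0) / n
  // Sample variance (divides by n - 1).
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1)
  const stdError = Math.sqrt(variance / n)
  return { mean, lower: mean - z * stdError, upper: mean + z * stdError }
}
```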
---
### Visualizing Scale
- Hierarchical visualizations (treemaps)
- Logarithmic scales when appropriate.
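
A sketch of what a logarithmic scale does: equal ratios in the data become equal distances on screen. `logScale` is a hypothetical helper (libraries such as D3 ship their own, e.g. `d3.scaleLog`), and the domain is assumed to be strictly positive.

```js
// Hypothetical sketch: maps [dMin, dMax] (both > 0) onto [rMin, rMax].
function logScale([dMin, dMax], [rMin, rMax]) {
  const logMin = Math.log10(dMin)
  const logMax = Math.log10(dMax)
  return (value) =>
    rMin + ((Math.log10(value) - logMin) / (logMax - logMin)) * (rMax - rMin)
}

// Values from 1 to 10,000 mapped onto a 400px-wide axis:
// each factor of 10 takes up the same 100px.
const x = logScale([1, 10000], [0, 400])
```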

119
15.conclusion/slides.md Normal file

@@ -0,0 +1,119 @@
# Data Visualization for Public Policy
![bg fit right](mosaic.jpg)
---
## Recap: Why Do We Create Visualizations?
- To **better understand** large, complex datasets.
- To **influence others** through compelling, evidence-based storytelling.
---
## Influence: The Power of Visual Communication
Effective data visualizations can:
- Draw attention to critical problems or potential solutions.
- Argue for specific policy interventions.
- Connect an audience with large and potentially abstract data concepts.
---
## Key Ideas Exercise
What are *your* golden rules of data visualization?
---
## (Some) Key Rules for Effective Data Visualization
### 1. Audience-Centered Design
- Take time to consider and understand your audience's background, expertise, and information needs.
- The "best" data visualization is one that the audience understands & remembers.
---
### 2. **Prioritize Truthful Representation**
- Correct chart types & encodings.
- Never sacrifice **data integrity** in the name of a "better" chart.
- Avoid misleading choices: truncated axes, dual axes, etc.
- Consider the role of **uncertainty** in representing your data.
---
<!-- Mackinlay's Effectiveness Hierarchy-->
![bg fit](../01.gog-altair/effectiveness.png)
---
### 3. **Maximize Clarity and Comprehension**
- Simplify complex information where possible. It is OK to refer a user to a table or other source for deeper analysis.
- Remove unnecessary visual elements -- "chart junk"
- Guide the viewer's attention to key insights with **labeling**.
---
## Tufte's Key Ideas Revisited
![bg fit right](../03.charts/tufte.png)
- Graphical Integrity: Above all else, show the data.
- Maximize the data-ink ratio.
- Minimize chart junk.
- Aim for high chart density, consider *small multiples*.
- Revision & Editing are essential.
---
### 4. **Optimize for Accessibility**
- Use color-blind friendly palettes.
- Ensure readability for viewers with different visual capabilities. (Contrast, font size, etc.)
- Provide *alternative text descriptions* in web presentations.
- `<img src="..." alt="A graphic representing the length of rivers..." />`
- Accessibility tools: contrast/color/WCAG checkers.
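
Contrast checkers are built on the WCAG contrast ratio, which can be sketched in a few lines. This version assumes colors are given as `#rrggbb` strings, and the function names are hypothetical. WCAG level AA asks for at least 4.5:1 for normal-size text.

```js
// Sketch of the WCAG 2.x relative luminance of a "#rrggbb" color.
function relativeLuminance(hex) {
  const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16) / 255)
  // Linearize each sRGB channel before weighting it.
  const [r, g, b] = channels.map((c) =>
    c <= 0.03928 ? c / 12.92 : ((c + 0.055) / 1.055) ** 2.4
  )
  return 0.2126 * r + 0.7152 * g + 0.0722 * b
}

// Contrast ratio ranges from 1 (identical colors) to 21 (black on white).
function contrastRatio(hexA, hexB) {
  const a = relativeLuminance(hexA)
  const b = relativeLuminance(hexB)
  const lighter = Math.max(a, b)
  const darker = Math.min(a, b)
  return (lighter + 0.05) / (darker + 0.05)
}
```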
---
### 5. **Build a Compelling Narrative**
- Create a clear, coherent story and use graphics to support it.
- Each chart should have a clear "why" -- don't make users wonder why you're showing them something.
- Use visual elements & conventions to guide the viewer through your key arguments in order.
- Connect data to broader context and implications.
---
### 6. **Embrace Iterative Improvement**
- Seek feedback from diverse perspectives, especially those represented in your audience.
- Be willing to revise and refine. If one person had an issue, others will too.
---
### 7. **Consider Ethical Implications**
- Represent marginalized groups respectfully: color choices, language.
- Remember that pixels often represent people. Dismissing outliers, etc. should not be done without consideration.
- Be transparent about data sources and limitations.
- Use visualization as a tool for **understanding and persuasion, not manipulation**.
---
## Conclusion
Effective data visualization is both an art and a science.
Understand your data and what you want people to understand.
Center your audience.
Prioritize clarity & truth.
Be creative & have fun!