Visualizing Uncertainty

CAPP 30239


What causes uncertainty?

  1. measurement error - Every instrument has limited accuracy. In a survey, the "instrument" could be a poorly worded question.
  2. model uncertainty - Models make assumptions and simplifications; different assumptions lead to different outcomes.
  3. sampling variability - A sample differs from the population it is drawn from.
  4. missing data - How missing data is accounted for & represented affects the result.

The result is a range or distribution, when what we want is a single number to map to one of our channels (Hue, X, Y, etc.).
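
A minimal sketch of how sampling variability turns a single number into a range. The survey responses are made up, and the percentile bootstrap is an illustrative choice, not a method these notes prescribe.

```python
import random
import statistics

# Hypothetical sample: 40 survey responses on a 1-10 scale (made-up data).
rng = random.Random(30239)
sample = [rng.randint(1, 10) for _ in range(40)]


def bootstrap_interval(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    resample_rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.mean(resample_rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return means[lower_index], means[upper_index]


point_estimate = statistics.mean(sample)
low, high = bootstrap_interval(sample)
print(f"mean = {point_estimate:.2f}, 95% interval = [{low:.2f}, {high:.2f}]")
```

The point estimate is the one number a channel can encode directly; the interval is the extra information the rest of this deck tries to show.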


Challenges of Uncertainty

Uncertainty is often left out, in part because it is hard to understand and even harder to visualize.

Omission, however, misleads audiences, especially when values are reported with many significant figures.

Global Population Uncertainty: ±160 million people (2%)
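
A rough sketch of the arithmetic behind that figure: ±160 million at 2% implies an estimate near 8 billion, so digits below the hundred-millions place carry no information. The population value below is a placeholder, and `round_to_uncertainty` is a hypothetical helper written for this example.

```python
import math


def round_to_uncertainty(value, uncertainty):
    """Round `value` to the decimal place of the leading digit of `uncertainty`."""
    exponent = math.floor(math.log10(abs(uncertainty)))
    step = 10 ** exponent
    return round(value / step) * step


estimate = 8_123_456_789   # placeholder value, not a real population count
uncertainty = 160_000_000  # the ±160 million from the slide

relative = uncertainty / estimate
rounded = round_to_uncertainty(estimate, uncertainty)

print(f"as often reported: {estimate:,}")
print(
    f"honest precision:  {rounded / 1e9:.1f} billion "
    f"± {uncertainty / 1e9:.2f} billion ({relative:.0%})"
)
```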


Challenges of Uncertainty

Uncertainty estimates are simplified, often out of necessity.

30% chance of rain: "A 30% chance that at least 0.01 inches of rain will fall somewhere within a given area over a 12-hour period."

Do I bring an umbrella?


Challenges of Uncertainty

A complex visualization can overwhelm the audience and obscure the rest of the message.

From a data-ink ratio perspective, it is understandable that error bars get omitted when they do not seem relevant to the narrative.


Including Uncertainty

If omitting uncertainty misleads, it violates our prime directive of graphical integrity.

The job, then, is to find ways to show uncertainty that are audience-appropriate & don't obfuscate the meaning.

The difficulty will be in resolving this tension.


Common Techniques

  • Uncertainty as Probability
  • Error Bars
  • Confidence Bands

Uncertainty As Probability

Random waffle chart: works for cases with discrete outcomes.


Uncertainty As Probability

In practice, we often care about more than a boolean outcome.


Uncertainty of Point Estimates

These work when we're focused on uncertainty around a particular outcome.

Sometimes we instead need to show uncertainty around discrete measurements or projections.


Error Bars


Error Bands

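import altair as alt
from vega_datasets import data

# Assumption: `source` is the cars dataset from vega_datasets,
# which has the Year and Miles_per_Gallon columns used below.
source = data.cars()

# Mean miles per gallon for each year.
line = alt.Chart(source).mark_line().encode(
    x='Year',
    y='mean(Miles_per_Gallon)'
)

# Shaded band: extent='ci' shows a bootstrapped 95% confidence interval of the mean.
band = alt.Chart(source).mark_errorband(extent='ci').encode(
    x='Year',
    y=alt.Y('Miles_per_Gallon').title('Miles/Gallon'),
)

band + line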

Issues with Error Bars & Confidence Bands

  1. There is no pre-defined meaning of these intervals: they may show a standard deviation, a standard error, a confidence interval, or something else. If error bars or bands are included, the legend or caption must state what they represent.
  2. Error bars are common in scientific & academic literature; other audiences cannot be assumed to understand them.
  3. Restricted to position encodings (1D/2D dots). If the variable being expressed is mapped to color, area, etc., then alternative presentations are needed.
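
To illustrate the first issue, a small sketch computing three common interval definitions from the same made-up measurements. The data, and the normal-approximation multiplier of 1.96 for the 95% interval, are assumptions for this example.

```python
import math
import random
import statistics

# Made-up data: 25 noisy measurements around a hypothetical true value of 50.
rng = random.Random(42)
measurements = [rng.gauss(50, 8) for _ in range(25)]

mean = statistics.mean(measurements)
sd = statistics.stdev(measurements)       # spread of individual observations
se = sd / math.sqrt(len(measurements))    # uncertainty of the mean itself
ci95 = 1.96 * se                          # normal approximation (assumed)

intervals = {
    "±1 standard deviation": sd,
    "±1 standard error": se,
    "95% confidence interval": ci95,
}

for label, half_width in intervals.items():
    print(f"{label:<26} {mean - half_width:6.2f} to {mean + half_width:6.2f}")
```

All three are legitimate "error bars" for the same data, yet their widths differ several-fold, which is why an unlabeled bar cannot be interpreted.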


Variations on Error Bars & Intervals


Regression Uncertainty


Other Approaches


Showing Multiple Futures


Hurricane Uncertainty


On Maps


"Sketchiness"


Animating Uncertainty


References & Acknowledgements