News and Notes from PolicyViz - Issue #22
In the previous newsletter, I shared a draft blog post with you about using dual axis charts, which I then posted to the PolicyViz site last week. In this newsletter, I'm sharing a new draft post on visualizing outliers. The post has been sitting in a folder on my desktop for months and I'm glad to share it with you now.
I've also been sharing some off-the-shelf stuff with my Winno community, so if you'd like to sign up--there's a free tier now!--see the links below.
In the next newsletter, I'll share details on a new book giveaway!
Thanks again,
Jon
DRAFT BLOG POST: Break-the-Frame: An Approach to Visualizing Outliers
A conversation that often pops up in visualizing data is what to do about outliers. How should we handle data values that are so much larger (or smaller) than the rest of the values in our data? In this post, I’m going to share a technique I’ve seen used recently that I think is a good approach to emphasizing and making outliers clear for your reader.
When I’m asked this question in workshops and classes, my usual response is to try two graphs, a sort of “zoom in, zoom out” approach. As an example, take this bar chart of per capita health consumption expenditures across a sample of countries (data from the Kaiser Family Foundation’s Health System Tracker). The United States spends nearly $12,000 per person, about 60% more than the next-highest country, Switzerland, and about 2.5 times as much as Japan, and is a clear outlier.
It’s difficult to make comparisons across the other values because the outlier extends the horizontal axis. Using two graphs is one possible solution: the one above, which shows all 12 countries, and another that shows all but the United States.
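If you want to try the zoom-in, zoom-out idea in code, one rough way is to build the data for both graphs up front. The Python sketch below is only an illustration, not how the charts in this post were made. The country labels and numbers are made-up placeholders (not the Kaiser Family Foundation data), the function name `zoom_views` is hypothetical, and the rule for what counts as an outlier (more than twice the median) is an assumption chosen for the example.

```python
from statistics import median


def zoom_views(data, multiple=2.0):
    """Return (zoomed_out, zoomed_in) dicts for a two-graph layout.

    Assumption: a value counts as an outlier when it is more than
    `multiple` times the median. This rule is illustrative only.
    """
    cutoff = multiple * median(data.values())
    zoomed_out = dict(data)  # every value, outliers included
    zoomed_in = {name: v for name, v in data.items() if v <= cutoff}
    return zoomed_out, zoomed_in


# Placeholder values only, NOT real spending data.
spending = {
    "Country A": 12000,
    "Country B": 5000,
    "Country C": 4500,
    "Country D": 4000,
}

zoomed_out, zoomed_in = zoom_views(spending)

# Each graph gets its own horizontal axis, sized to its own largest value.
print("Graph 1 axis max:", max(zoomed_out.values()))
print("Graph 2 axis max:", max(zoomed_in.values()))
```

The second graph drops the outlier, so its axis is scaled to the remaining values and the differences among them are easier to see.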
Other standard approaches, always worth considering, are to add text or use different colors to make outliers visible to the reader. The top graph here from CNN does a nice job of adding a label to the July 5th spike, and the graph below from Chris Ingraham uses a purple color for Minnesota to help it stand out from the rest of the country.
A Different Approach
I came across this bar chart of incarceration rates on the Prison Policy website. The incarceration rates for the United States as a whole and the state of Iowa are clear outliers relative to the other 11 countries included in the graph. Instead of using two graphs, they extended the graph outside the frame.
Now, usually, I prefer not to include this kind of unnecessary frame around my graphs. Excel, for example, includes a border around the graph, which I usually delete. But here, the frame becomes a useful part of the graph because the outlier bars can “break” through it and extend beyond its edge.
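For anyone curious how a break-the-frame chart might be put together, here is a minimal sketch in plain Python that writes a small SVG bar chart. It is not how the Prison Policy graph was built. The function name `break_the_frame_svg`, the layout numbers, and the place names and values are all hypothetical placeholders. The idea is that the frame covers only the range from 0 to `axis_max`, while every bar is drawn at its true length, so outliers run past the right edge of the frame instead of being cut.

```python
def break_the_frame_svg(data, axis_max, frame_width=300, bar_height=18, gap=8):
    """Build an SVG horizontal bar chart whose frame spans 0..axis_max.

    Bars with values above axis_max keep their true length, so they
    extend past the right edge of the frame rather than being truncated.
    """
    scale = frame_width / axis_max        # pixels per data unit
    longest = max(data.values()) * scale  # pixel length of the longest bar
    label_w, pad = 90, 10                 # hypothetical layout constants
    frame_height = len(data) * (bar_height + gap) + gap
    total_width = label_w + max(frame_width, longest) + 60
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{total_width:.0f}" height="{frame_height + 2 * pad}">',
        # The frame covers only the 0..axis_max range.
        f'<rect x="{label_w}" y="{pad}" width="{frame_width}" '
        f'height="{frame_height}" fill="none" stroke="black"/>',
    ]
    for i, (name, value) in enumerate(data.items()):
        y = pad + gap + i * (bar_height + gap)
        width = value * scale  # true length, even when past the frame
        parts.append(
            f'<text x="{label_w - 5}" y="{y + 13}" text-anchor="end" '
            f'font-size="11">{name}</text>'
        )
        parts.append(
            f'<rect x="{label_w}" y="{y}" width="{width:.1f}" '
            f'height="{bar_height}" fill="steelblue"/>'
        )
        if value > axis_max:
            # Label each outlier with its value at the end of its bar.
            parts.append(
                f'<text x="{label_w + width + 4:.1f}" y="{y + 13}" '
                f'font-size="11">{value}</text>'
            )
    parts.append("</svg>")
    return "\n".join(parts)


# Placeholder values only, NOT real incarceration rates.
rates = {"Place A": 700, "Place B": 650, "Place C": 140,
         "Place D": 110, "Place E": 100}
svg = break_the_frame_svg(rates, axis_max=300)
```

Saving `svg` to a file ending in `.svg` and opening it in a browser shows the two longest bars crossing the frame. The choice of `axis_max` is still a judgment call, but no bar is shortened, so the lengths remain proportional to the values.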
Speaking of breaking, one approach I really want to recommend you avoid is the “breaking the bar” technique. With this technique, you add a symbol to the bar to denote that it is “broken” and actually extends further than is shown in the graph. Here’s an example of applying this technique to the incarceration rates graph just shown.
This approach distorts the data, and where you cut the bars and how far you extend the horizontal axis are arbitrary decisions. Such decisions should really be avoided in data visualization.
Joshua Stevens, who leads the data visualization and cartography efforts at NASA, also used the “break-the-frame” approach in this tongue-in-cheek tweet:
And Auke Hoekstra, Program Director at Neon Research, a multidisciplinary research program focusing on climate issues and based in the Netherlands, also used the approach in his graph that shows how the International Energy Agency has consistently underestimated the growth in gigawatt (GW) production from solar panels. There’s no outline/frame in this graph, but the fact that the original has a gray background and his drawn line extends outside that space has the same effect.
Wrap Up
I’m adding this “break-the-frame” approach to my data visualization toolbox as an effective way to show (and emphasize) outliers in my data. In cases where I still want to make it easier for my reader to see variation in other values, I might use a second graph, but I think this technique is a great way to emphasize these large values. It’s certainly better than the “break-the-bar” approach, which distorts and misrepresents the data.
Episode #224: Pieta Blakely and Eli Holder
Pieta Blakely, PhD helps mission-based organizations measure their impact so that they can do what they do well. Eli Holder is a dataviz designer, researcher, and founder of 3iap, a data visualization design firm. In this week's episode of the show, we talk about Pieta's and Eli's recent work on racial equity and deficit thinking in data visualization.
Eli will present his
Episode #223: Cole Nussbaumer Knaflic
Quick re-up from last week's special episode of the podcast with Cole Nussbaumer Knaflic. Cole is SWD CEO and author of the brand new book storytelling with you: plan, create, and deliver a stellar presentation and best-selling books storytelling with data: let’s practice! and storytelling with data: a data visualization guide for business professionals, which has been translated into a dozen languages, used as a textbook by more than 100 universities and serves as the course book for tens of thousands of SWD workshop participants. For more than a decade, Cole and her team have delivered interactive learning sessions sought after by data-minded individuals, companies, and philanthropic organizations all over the world. They also help people create graphs that make sense and weave them into compelling stories through the popular SWD community, blog, podcast and videos.
What I'm Reading & Watching
Books
Station Eleven (the show on HBO Max was amazing, so I wanted to read the book too)
Tidy Modeling with R by Max Kuhn and Julia Silge (both coming up on the podcast)
White Rage: The Unspoken Truth of Our Racial Divide by Carol Anderson
Articles
TV/Movies
Baseball playoffs, hockey starts
Abbott Elementary
Welcome to Wrexham
Note: As an Amazon Associate I earn from qualifying purchases.
Join my Winno Community!
If you want to get some short, actionable dataviz advice, check out my new Winno community. I send about 2-3 text messages each week with some little pointers about dataviz. There is now a free tier! You get fewer texts and giveaways, but it's a good way to test it out. If you like what you see, sign up for only $5/month. Your subscription helps support this newsletter and the podcast. I hope to see you there!