News and Notes from PolicyViz - Issue #2
I was a little vague in my initial tweet, so let me tell you what I was working on. I'm currently writing an article about data visualization for an academic journal. The editors want it to be a high-level discussion of practical tips on data visualization.
I thought data related to COVID would be a good candidate for this project--there's lots of it available, it's generally easy to get, and the basic stories are well known. So, I made some state-level maps, a heatmap or two, a couple of scatterplots, and some histograms.
I also wanted to make a graph that shows a part-to-whole relationship. My instinct was to create a treemap or Voronoi diagram of the number of people vaccinated in each state. The problem is that the number of people who are vaccinated isn't necessarily correlated with the share of people who are vaccinated, which is really what is important. For example, as of late August, 13.8 million people in Texas were vaccinated, which corresponds to 47% of that state's population. In Rhode Island, about 685,000 people were vaccinated, which is about 65% of the population.
See the problem? If I showed you a pie chart of the number of vaccinated people, Texas would look like it's doing great. If I showed you a pie chart of the share of vaccinated people, the chart doesn't really make sense because 47% + 65% = 112%, which is greater than 100%.
Can I scale these shares to make them part-to-whole? I could divide the vaccination rate in each state by the sum of the two percentages. In other words, for Texas I'd get 47%/112% = 42.0% and for Rhode Island I'd get 65%/112% = 58.0%. Now, Rhode Island looks better compared with Texas because the share of its vaccinated population is greater.
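For anyone who wants to try the scaling on their own data, here is a minimal Python sketch of the idea. The function name `scale_to_whole` is my own for illustration, and the two rates are the rounded figures quoted above, so treat it as a rough check rather than a finished method.

```python
# Vaccination rates quoted above (percent of each state's population).
rates = {"Texas": 47.0, "Rhode Island": 65.0}


def scale_to_whole(values):
    """Divide each value by the total so the results sum to 100.

    Hypothetical helper: the name and signature are assumptions
    made for this illustration.
    """
    total = sum(values.values())
    return {name: 100.0 * value / total for name, value in values.items()}


scaled = scale_to_whole(rates)
for state, share in scaled.items():
    print(f"{state}: {share:.1f}%")
```

Because every rate is divided by the same total, the scaled shares add up to 100% by construction, which is what makes them usable in a part-to-whole chart. The same function works on a dictionary of all 50 states.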
But what do we call these percentages now? Does "Participant Contribution by Group" work with a note that says "weighted percentages"? Maybe a subtitle or note would need to explain this in more detail.
I could, of course, use a different chart type altogether. Or, as some suggested on Twitter, just plot all four possible groups--vaccinated/not vaccinated in both states. I'm not sure this really works--again, Rhode Island is tiny compared with Texas--plus, if I wanted to show data for all 50 states, this would be impossible.
Where do I land on all of this? I think the weighted percentages hold promise. I'm not sure about the language and how to describe those shares, but they give a clearer, more accurate presentation of the data.
I may or may not write this up as a longer blog post on PolicyViz, but this is the kind of thing you can expect by subscribing to this newsletter. Issues come out every other week right before the podcast, so stay tuned, and read more below!
Thanks,
Jon
What I'm Reading
A few things I'm reading (or should be reading!) and think would be good additions to your reading lists.
Books
Blog Posts, Tweets, Podcasts, and Conferences
Interpreting the Do No Harm guide through a Canadian context for Truth and Reconciliation
Asia’s data scene deserves greater attention. That’s why we are starting a movement
Tweet: Tufte being an a**hole
Enable Keyboard Navigation and Selection of Visualization Marks
Nirvana’s ‘Nevermind’ at 30: The Inside Story of the Album’s ‘Overnight’ Success
Tweet: A question I have on plotting scaled part-to-whole relationships
Research and Papers
What the literature tells us about decolonizing impact measurement
Diversity, Equity, and Inclusion in Health Services and Policy Research
New, fun dataviz-inspired shirts and more! Plus, Graphic Continuum sheets, posters, and more. As a subscriber, you can get an additional 10% off everything in the shop! Just use the coupon code "subscriber" when you check out.
I've been expanding my Instagram use over the past couple of weeks to show graphs, charts, and diagrams from my Data Visualization Catalog. Hopefully, you can use my Instagram feed as a source of inspiration for your own work.