Readings on "Bad Graphs"


Misleading Axes on Graphs

The points made in this article are already well known to me. I particularly appreciated the section on multiple axes on one graph; I often find those graphs confusing, and they frequently suggest causal relationships that may not exist. However, another issue in most of the charts presented is a lack of accountability: the author of a chart rarely includes their name in the work. That simple act might go a long way toward making graph authors reconsider whether their work is truthful or misleading.

The gun deaths chart is probably the most provocative example in the article. My belief is that respecting the cultural norms of the audience is important; flipping an axis against the common convention is only going to mislead, whatever the author's intentions. I would therefore place a higher value on legibility and interpretability. It is not clear to me what value the chart's author was pursuing, other than novelty.

Proportional Ink

This article also covers fairly well-established territory, and I agree with just about everything stated. The most interesting of the examples is the Time Magazine "causes of death" graphic. The author makes a great point that comparisons can be drawn which the graph's author likely did not consider (the massive ink space devoted to toddlers' accidents vs. seniors' accidents). Also interesting is the case the author makes for some legitimate uses of 3D, which I had not previously given much thought to.

Chapter 2 “Look at Data”

I thought this book chapter served as a great overview of many common topics in data visualization. Already familiar with Anscombe's Quartet, I found Jan van Hove's small multiple of "same correlation" scatterplots really interesting. I'm not sure whether this had been attempted before, but what amazed me is that it was produced in 2016. It seems so clear and useful, and it is a great reminder of how visualization can help uncover interesting patterns that may be missed when using only the most common quantitative statistical measures.
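The "same statistics, different data" point is easy to demonstrate directly. A minimal sketch using the data values from Anscombe's 1973 quartet, computing the correlation that all four datasets share despite their very different shapes:

```python
import statistics

# Anscombe's quartet: four datasets with nearly identical summary statistics.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by datasets I-III
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))

for xs, ys in [(x123, y1), (x123, y2), (x123, y3), (x4, y4)]:
    print(round(pearson_r(xs, ys), 2))  # each dataset: ~0.82
```

Plotted, the four datasets look nothing alike (a line, a curve, an outlier-driven line, a vertical stack), which is exactly why summary statistics alone can mislead.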

The Tufte/Holmes "debate" always struck me as ridiculous. To me, it has always been a question of competing values, not of "proper chart construction." The Tufte box plot is a good reminder that minimalism doesn't always communicate more efficiently. The violin plot has always struck me as superior: if the desired value is communicating more information in less space, that is the very essence of efficiency. But the violin plot wasn't technically feasible in the 1980s and earlier (the box plot, of course, being a Tukey invention).

I think the NY Times "Essential to live in a Democracy" plot is a really important example. It illustrates a tremendous inherent problem with many survey scales. In the business world, Likert scales and NPS scales are some of the most poorly represented data. I wonder whether there is a word for this phenomenon: when one person answers 6 out of 10 on a survey and another answers 3 out of 10, it doesn't necessarily mean the person who answered 6 feels twice as strongly as the person who answered 3.
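One term for this is that Likert-style items are ordinal, not interval or ratio, scales: the numeric codes preserve ordering but not magnitude. A minimal sketch with hypothetical responses shows why ratios of the codes carry no real meaning:

```python
# Hypothetical 0-10 survey responses. The numeric codes are ordinal labels,
# not ratio measurements, so ratios between codes are not meaningful.
responses = [3, 6]

# Any strictly increasing relabeling preserves the ordering (the actual
# information an ordinal item carries) but changes every ratio.
relabel = lambda v: v ** 2  # one arbitrary monotone transform

print(responses[1] / responses[0])                    # 2.0 on the original codes
print(relabel(responses[1]) / relabel(responses[0]))  # 4.0 after relabeling
```

Both labelings describe exactly the same ordered responses, yet the apparent "twice as strong" relationship disappears under the relabeling, which is why statements like "6 means twice 3" are unwarranted.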

The section describing the work of Cleveland/McGill and later Heer/Bostock is interesting. I don't remember seeing this level of quantification of graphical decoding before. I would be interested in understanding how this is changing over time. I see more scatterplots than ever in the New York Times and the Wall Street Journal. Is this because readers are now familiar with the chart type, or simply because these publications cater to a more highly educated audience?

Finally, I'd like to comment on the discussion of axes that fail to include zero on line graphs. As the prior article noted, and I agree, there are sometimes very good reasons not to include a zero baseline. The argument that "graphs that don't go to zero are a thought crime" fails to distinguish between types of graphs. I believe it is an error to automatically assume malice or bias in such cases.
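The distinction between chart types comes down to a little arithmetic. A minimal sketch (with hypothetical values) of how a truncated baseline inflates the apparent ratio between two bars, which is why bars, encoding value by length, need a zero baseline, while line charts, encoding value by position, often don't:

```python
def apparent_ratio(a, b, baseline=0):
    """Ratio of drawn bar lengths when the y-axis starts at `baseline`.

    Bars encode value by length, so the visual comparison is between
    (value - baseline) quantities, not between the values themselves.
    """
    return (b - baseline) / (a - baseline)

print(apparent_ratio(100, 102))               # 1.02: honest, zero-based bars
print(apparent_ratio(100, 102, baseline=98))  # 2.0: truncation doubles the apparent gap
```

A 2% difference drawn from a baseline of 98 looks like a 2x difference, which is the proportional-ink violation; the same truncation on a line chart merely zooms in on position and distorts no lengths.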