
---
> If you’re a human with normal vision, you intuitively perceive thick, wide shapes as representing greater quantities (of whatever) than thin, narrow shapes. The ‘box’ segments in each box-and-whisker shape, then, look like they represent *more of something* than the ‘whisker’ segments, leading to the faulty interpretation that they contain more values, or perhaps have more importance. All four box and whisker segments contain the *same* number of values.
- [View Highlight](https://read.readwise.io/read/01j12p1kwzsxzmxnxr0srykmh6)
---
> In a box plot, however, longer box or whisker segments *don’t* represent greater quantities. The four segments in a box plot each represent the *same* quantity, i.e., they each contain the same number of values, regardless of how long or short they are. In fact, shorter segments in box plots actually represent *higher densities* of values.
- [View Highlight](https://read.readwise.io/read/01j12p4jarz7aq38qk27p0tq7z)
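
The "same count, different density" point can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, not the article's code: the function name `quartile_segments` and the `SAMPLE` values are made up, and it assumes whiskers that run all the way to the minimum and maximum (no separate outlier rule).

```python
import statistics

# Made-up sample of 20 distinct values, tightly packed in the middle.
SAMPLE = [1, 4, 9, 15, 20, 21, 22, 23, 24, 25,
          26, 27, 28, 29, 30, 34, 41, 50, 62, 80]


def quartile_segments(values):
    """Return count, length and density for the four box plot segments.

    Assumption: whiskers extend to the min and max of the data.
    """
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    edges = [data[0], q1, q2, q3, data[-1]]
    names = ["lower whisker", "lower box", "upper box", "upper whisker"]
    segments = []
    for i, name in enumerate(names):
        lo, hi = edges[i], edges[i + 1]
        if i == 0:
            # The first segment includes its lower edge (the minimum).
            count = sum(lo <= v <= hi for v in data)
        else:
            count = sum(lo < v <= hi for v in data)
        length = hi - lo
        # Density is values per unit of the measurement axis.
        density = count / length if length > 0 else float("inf")
        segments.append(
            {"name": name, "count": count, "length": length, "density": density}
        )
    return segments


for seg in quartile_segments(SAMPLE):
    print(f'{seg["name"]:>13}: n={seg["count"]}, '
          f'length={seg["length"]:.2f}, density={seg["density"]:.2f}')
```

With this sample every segment holds five values, while the shortest segment (the lower box) has the highest density and the longest (the upper whisker) has the lowest. Data with many tied values can split less evenly than this.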
---
> “But *I* don’t find box plots hard to understand.”
> I hear you. *I* don’t find them hard to understand, either. That’s because you and I have been looking at box plots for years and our brains have learned to “think around” their design flaws. It *doesn’t* mean that they’re a well-designed chart type.
- [View Highlight](https://read.readwise.io/read/01j12p7azf0rqybhjm5epvs6qg)
---
> Have a look at the box plot on the left, then compare it to the jittered strip plot *of the same data* on the right:
> 
> I’m [not the first](https://www.researchgate.net/publication/316652618_Same_Stats_Different_Graphs_Generating_Datasets_with_Varied_Appearance_and_Identical_Statistics_through_Simulated_Annealing) to point out that box plots *always* make distributions look ‘bell shaped,’ i.e., like the values are clustered around the median and gradually trail off away from the median. If a set of values *isn’t* bell shaped (like the “Control group” above), though, box plots *make them look bell shaped anyway*.
- _Tags_: `favorite`
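
A small worked example of the same effect: two data sets with very different shapes can produce an identical five-number summary, and therefore an identical box plot. This is only a sketch with invented numbers (`EVENLY_SPREAD`, `TWO_CLUSTERS`) and a hypothetical helper `five_number_summary`; it is not the data from the article's figure.

```python
import statistics

# Invented data: one set spread evenly, one split into two clusters.
EVENLY_SPREAD = [10, 14, 17, 19, 20, 21, 23, 25, 27, 29,
                 31, 33, 35, 37, 39, 40, 42, 44, 47, 50]
TWO_CLUSTERS = [10, 18, 19, 20, 20, 21, 21, 22, 22, 23,
                37, 38, 38, 39, 39, 40, 40, 41, 42, 50]


def five_number_summary(values):
    """Min, Q1, median, Q3, max: everything a basic box plot draws."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, q2, q3, data[-1])


def count_near(values, center, radius):
    """How many values lie within `radius` of `center`."""
    return sum(abs(v - center) <= radius for v in values)


summary_a = five_number_summary(EVENLY_SPREAD)
summary_b = five_number_summary(TWO_CLUSTERS)
median = summary_a[2]

print(summary_a == summary_b)                 # same box plot
print(count_near(EVENLY_SPREAD, median, 5))   # values near the median
print(count_near(TWO_CLUSTERS, median, 5))    # none: the middle is empty
```

Both sets share the summary (10, 20.75, 30.0, 39.25, 50), yet the clustered set has no values at all within 5 units of its median, which is exactly where a box plot suggests the values pile up.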
---
> The box plot alternative that I use most often is a *strip plot*:
> 
> Strip plots can be grasped by most audiences in *a few seconds* and can be explained with a single sentence, such as, “Each dot is the age of a study participant.”
- [View Highlight](https://read.readwise.io/read/01j12r42e68txxv6m7z1ndb89s)
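
For reference, the data behind a jittered strip plot is simple to build: each value keeps its exact position on the measurement axis and gets a small random offset on the group axis so that overlapping dots separate. The sketch below assumes uniform jitter; the function name `jittered_strip_points`, the `jitter` width, and the sample ages are all illustrative, not taken from the article.

```python
import random


def jittered_strip_points(groups, jitter=0.2, seed=0):
    """Return (group_position, value, label) tuples, one per dot.

    Assumption: group i is centered at position i on the group axis,
    and each dot is shifted by a uniform offset in [-jitter, +jitter].
    """
    rng = random.Random(seed)  # seeded so the layout is reproducible
    points = []
    for index, (label, values) in enumerate(groups.items()):
        for value in values:
            position = index + rng.uniform(-jitter, jitter)
            points.append((position, value, label))
    return points


# Invented example: each dot is the age of a study participant.
ages = {"Control": [21, 34, 35, 35, 62], "Treatment": [40, 41, 41, 58]}
for position, age, label in jittered_strip_points(ages):
    print(f"{label:>9}  position={position:+.2f}  age={age}")
```

The resulting tuples can be handed to any scatter plot routine. Only the group-axis coordinate is randomized, so the values themselves are shown unaltered.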
---
> you could move to what I call a *distribution heatmap*, which can handle any number of values:
> 
> Yes, a distribution heatmap ups the complexity a bit since it introduces the concept of *bins* (a.k.a. *intervals*), but bins are still considerably easier to grasp than quartiles. You also lose the ability to see how many values are in each group and there are a few other limitations, but box plots have all those same limitations (plus several others).
- [View Highlight](https://read.readwise.io/read/01j12r8kktfkygehbwd54kjjej)
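
The binning step can be sketched as follows. This is one plausible reading, not the article's implementation: it assumes equal-width bins and colors each cell by the *share* of its group's values that fall in the bin, which is consistent with the remark that group sizes are no longer visible. The name `distribution_heatmap_rows` and its parameters are hypothetical.

```python
import math


def distribution_heatmap_rows(groups, bin_width, start, stop):
    """Return, per group, the share of its values falling in each bin.

    Assumptions: bins are equal-width, half-open [lo, hi), and cover
    [start, stop). Values outside that range are ignored.
    """
    n_bins = math.ceil((stop - start) / bin_width)
    rows = {}
    for label, values in groups.items():
        counts = [0] * n_bins
        for v in values:
            if start <= v < stop:
                # Clamp guards against floating-point edge cases.
                idx = min(int((v - start) // bin_width), n_bins - 1)
                counts[idx] += 1
        total = sum(counts)
        # Shares, not counts: each row sums to 1 (or is all zeros).
        rows[label] = [c / total if total else 0.0 for c in counts]
    return rows


# Invented example with two bins: [0, 5) and [5, 10).
example = {"A": [1, 2, 2, 3, 9], "B": [5, 5, 6]}
print(distribution_heatmap_rows(example, bin_width=5, start=0, stop=10))
```

Each share maps to a color intensity for one heatmap cell. Because rows are normalized per group, a group of 5 values and a group of 5,000 values with the same shape would look the same.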
---