
geom_boxplot gave wrong whiskers

I am making a boxplot with geom_boxplot in ggplot2, but the whisker lengths look wrong and I don't know why. Here is my data:

library(ggplot2)

value <- c(1.3739117, 0.8709891, 3.4510461, 0.8470309, 1.4838725, 0.6942611, 1.3095816, 3.0444649, 19.2785424, 1.0866242, 0.9376845, 2.2343836, 20.7975509, 20.3102489, 18.0046679, 1.4197519)
data <- data.frame(value)

# Draw the whisker end caps with stat_boxplot(), then the box itself on top
ggplot(data, aes(y = value)) +
   stat_boxplot(geom = "errorbar", width = 0.3) +
   geom_boxplot(width = 0.5)

And I see the plot like this:

[Image: the boxplot produced by the code above]

The third quartile overlaps the upper whisker. I computed the summary statistics manually, and the result is as follows:

Min.   : 0.6943  
1st Qu.: 1.0494  
Median : 1.4518  
Mean   : 6.0715  
3rd Qu.: 7.0895  
Max.   :20.7976
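
For reference, a minimal base R sketch that reproduces the figures above. It assumes the default quantile definition (type 7) that summary() uses; the names `sorted` and `q3_by_hand` are only illustrative. It also shows that, under this definition, the third quartile is not an observed value but is interpolated between two neighbouring observations.

```r
value <- c(1.3739117, 0.8709891, 3.4510461, 0.8470309, 1.4838725, 0.6942611,
           1.3095816, 3.0444649, 19.2785424, 1.0866242, 0.9376845, 2.2343836,
           20.7975509, 20.3102489, 18.0046679, 1.4197519)

# Same six-number summary as listed above (quantile type 7 by default)
print(summary(value))

q <- quantile(value, probs = c(0.25, 0.5, 0.75))
sorted <- sort(value)

# With n = 16, the type-7 third quartile sits at position 1 + 0.75 * 15 = 12.25,
# i.e. a quarter of the way from the 12th to the 13th sorted value
# (3.4510461 and 18.0046679)
print(sorted[12:13])
q3_by_hand <- sorted[12] + 0.25 * (sorted[13] - sorted[12])
print(q3_by_hand)
```

So there is no data point between about 3.45 and 18.00, even though the third quartile falls inside that gap.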

Based on the explanation of geom_boxplot: The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge (where IQR is the inter-quartile range, or distance between the first and third quartiles). The lower whisker extends from the hinge to the smallest value at most 1.5 * IQR of the hinge.

The IQR in my case is: 7.0895-1.0494 = 6.0401

The lower whisker should be: 1.0494 - 1.5*6.0401 = -8.01075

The upper whisker should be: 7.0895 + 1.5*6.0401 = 16.14965
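
A hedged sketch of the quoted rule in base R may help here. It treats the 1.5 * IQR limits as bounds and then looks for the most extreme observations that lie within them. It is a literal reading of the documentation text, not ggplot2's actual code, and the variable names (`lower_fence`, `upper_fence`, `inside`, `beyond`) are assumptions made for illustration.

```r
value <- c(1.3739117, 0.8709891, 3.4510461, 0.8470309, 1.4838725, 0.6942611,
           1.3095816, 3.0444649, 19.2785424, 1.0866242, 0.9376845, 2.2343836,
           20.7975509, 20.3102489, 18.0046679, 1.4197519)

# Hinges taken as the type-7 quartiles, matching the summary above
q1  <- quantile(value, 0.25, names = FALSE)
q3  <- quantile(value, 0.75, names = FALSE)
iqr <- q3 - q1

# Limits beyond which points count as outliers
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

# Whiskers end at actual observations, not at the limits themselves
inside <- value[value >= lower_fence & value <= upper_fence]
beyond <- value[value <  lower_fence | value >  upper_fence]

lower_whisker <- min(inside)  # 0.6942611
upper_whisker <- max(inside)  # 3.4510461

print(c(upper_fence = upper_fence, upper_whisker = upper_whisker, q3 = q3))
print(sort(beyond))           # the four values above the upper limit
```

Under this reading, the largest observation within the upper limit is 3.4510461, which is below the third quartile itself. That would be consistent with the upper whisker having no visible length above the box, while the four values near 18 to 21 fall beyond the limit.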

I understand that the negative lower whisker is meaningless, so it is replaced by the minimum value. But why is the upper whisker not shown? I am confused and could not find an example online that solves this problem. Am I misunderstanding something about the ggplot2 settings? I would really appreciate your help and suggestions!
