Channel: Active questions tagged r - Stack Overflow

Customize legend labels for continuous variables cut into several categories in ggplot2


I am drawing a choropleth map of US counties. I created a random variable, dumb, uniformly distributed between 1 and 100, and cut it into 6 quantile categories stored as dumb_quantiles. I then mapped the dumb_quantiles categories onto the counties of the contiguous US. The code and the resulting map are below:

library(sf)
library(tidyverse)
library(RColorBrewer) #for some nice color palettes

# US map downloaded from https://gadm.org/download_country_v3.html

# National border
us0 <- st_read("<Path>\\gadm36_USA_0.shp")
# State border
us1 <- st_read("<Path>\\gadm36_USA_1.shp")
# County border
us2 <- st_read("<Path>\\gadm36_USA_2.shp")

########################### Remove the Great Lakes #############################
# See my post https://stackoverflow.com/questions/59113457/removing-the-great-lakes-from-us-county-level-maps-in-r
# retrieving the name of lakes and excluding them from the sf 
all.names = us2$NAME_2
patterns = c("Lake", "lake")

lakes.name <- unique(grep(paste(patterns, collapse="|"), 
                     all.names, 
                     value=TRUE, ignore.case = TRUE))
#[1] "Lake and Peninsula" "Lake" "Bear Lake" "Lake Michigan" "Lake Hurron" "Lake St. Clair"
#[7] "Lake Superior" "Lake of the Woods" "Red Lake" "Lake Ontario" "Lake Erie" "Salt Lake"
#[13] "Green Lake"

# Pick the Great Lakes and exclude from us2
lakes.name <- lakes.name[c(4, 5, 7, 10, 11)]
`%notin%` <- Negate(`%in%`)
us2 <- us2[us2$NAME_2 %notin% lakes.name, ]
######################### Remove the Great Lakes (end) #########################

# Create a continuous variable
us2$dumb <- runif(nrow(us2), 1,100)

# Create labels
# define number of classes
no_classes <- 6

# extract quantiles
quantiles <- us2 %>%
             pull(dumb) %>%
             quantile(probs = seq(0, 1, length.out = no_classes + 1)) %>%
             as.vector() # to remove names of quantiles, so idx below is numeric

# here we create custom labels
labels <- imap_chr(quantiles, function(., idx){
  return(paste0(round(quantiles[idx], 0),
                "–",
                round(quantiles[idx + 1] , 0)
                ))
})

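# we need to remove the last label
# because that would be something like "*** - NA"
# (a negative index drops the last element explicitly; 1:length(labels) - 1
# only worked because it evaluates to 0:(n - 1) and index 0 is ignored)
labels <- labels[-length(labels)]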

# Here we actually create a new 
# variable on the dataset with the quantiles
us2 <- us2 %>%
       mutate(dumb_quantiles = cut(dumb,
              breaks = quantiles,
              labels = labels,
              include.lowest = TRUE))

# Color palette
pal <- brewer.pal(length(labels), "RdBu")

# Set default theme
theme_map <- function(...) {
    theme_minimal() +
        theme(
            # remove all axes
            axis.line = element_blank(),
            axis.text.x = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks = element_blank(),
            # remove the grid lines
            panel.grid.major = element_blank(),
            panel.grid.minor = element_blank(),
            legend.justification = c(1, 1),  # top-right of the legend as the
                                             # anchor point
            legend.position = c(1, 0.1) # Place top-right of the legend to
                                        # 0.1 unit above lower-right corner of
                                        # the image
            )
}

# County level
mainland2 <- ggplot(data = us2) +
    geom_sf(aes(fill = dumb_quantiles), size = 0.1, color = "black") +
    coord_sf(crs = st_crs(2163), 
             xlim = c(-2500000, 2500000), 
             ylim = c(-2300000, 730000)) +
    theme_map()

# Final plot across three levels
p <- mainland2 +
    # US state level boundary
    geom_sf(data = us1, fill = NA, size = 0.3, color = "black") +
    coord_sf(crs = st_crs(2163), 
             xlim = c(-2500000, 2500000), 
             ylim = c(-2300000, 730000)) +
    # US national level boundary
    geom_sf(data = us0, fill = NA, size = 0.3, color = "black") +
    coord_sf(crs = st_crs(2163), 
             xlim = c(-2500000, 2500000), 
             ylim = c(-2300000, 730000)) +
    scale_fill_manual(
        values = rev(pal),
        breaks = labels,
        name = "Title here",
        drop = FALSE,
        labels = labels,
        guide = guide_legend(
                direction = "horizontal",
                keyheight = unit(2, units = "mm"),
                keywidth = unit(10 / (length(labels)/2), units = "mm"),
                title.position = "top",
                nrow = 2,
                byrow = TRUE,
                reverse = TRUE # reverse the order of the legend keys
                #label.position = "bottom"
    )) +
    theme_map() 

The resulting map is as follows: [image]

I would like to make two modifications to the map above.

  1. Counties with dumb < 10 should be colored grey, and only counties with dumb >= 10 should be color-coded by their dumb_quantiles category, using the same "RdBu" palette as above.

  2. In the legend, the grey areas should be labeled "<10 deaths", and the other dumb_quantiles categories should be labeled in the same way as above.

An illustration of the desired legend is shown below: [image]

Any idea on how the two modifications could be achieved? Thank you.
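
One possible direction, sketched under assumptions rather than taken from the post: give values below 10 their own factor level, compute the quantile breaks only on values at or above 10, and build a named color vector that can be passed to scale_fill_manual(). The helper names make_classes() and class_colours(), the choice of five colored classes, and the "grey80" fill are all hypothetical. The demo uses a plain numeric vector in place of us2$dumb so that it runs without the shapefiles.

```r
# Hypothetical helper (not from the original post): values below `threshold`
# get their own class; the rest are cut into `n_classes` quantile classes.
make_classes <- function(x, threshold = 10, n_classes = 5,
                         low_label = "<10 deaths") {
  above <- x[!is.na(x) & x >= threshold]
  # quantile breaks computed only on values at or above the threshold
  q <- quantile(above, probs = seq(0, 1, length.out = n_classes + 1),
                names = FALSE)
  q[1] <- threshold  # start the first colored class exactly at the threshold
  # labels such as "10–28", built from consecutive breaks
  q_labels <- paste0(round(head(q, -1)), "–", round(tail(q, -1)))
  cut(x,
      breaks = c(-Inf, q),
      labels = c(low_label, q_labels),
      right = FALSE,          # intervals are [lower, upper)
      include.lowest = TRUE)  # with right = FALSE this closes the top interval
}

# Hypothetical helper: named color vector, grey first, then one palette
# color per quantile class, keyed by factor level.
class_colours <- function(classes, palette, low_colour = "grey80") {
  lv <- levels(classes)
  stopifnot(length(palette) == length(lv) - 1)
  setNames(c(low_colour, palette), lv)
}

# Demo on simulated data standing in for us2$dumb
set.seed(1)
dumb <- runif(200, 1, 100)
dumb_classes <- make_classes(dumb)
print(table(dumb_classes))
```

With the objects from the question, this could be wired in as us2$dumb_classes <- make_classes(us2$dumb), using fill = dumb_classes in geom_sf() and values = class_colours(us2$dumb_classes, rev(brewer.pal(5, "RdBu"))) in scale_fill_manual(). The existing breaks = labels and labels = labels arguments would then need to be dropped or replaced by levels(us2$dumb_classes). Whether the quantiles should instead be computed on all counties, as in the original code, is a design choice the question leaves open.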

