Creating a color palette from an image

There are hundreds of color palettes in the R ecosystem, but sometimes we might want to use colors from a specific image. Here I show how to use the paletter package to create a color palette for the 2021 Eucalypt of the Year: the Western Australian Gimlet.

Summaries
Eukaryota
Plantae
R
Author

Martin Westgate

Published

January 3, 2021

Date

March 2021

Colors in R

Color palettes matter, and the R ecosystem includes hundreds of them. If you want a “complete” list, check out Emil Hvitfeldt’s list of palettes; in practice, though, there are only a few that we use routinely. Our default at ALA Labs is to use viridis for continuous scales because, to quote its CRAN page, it’s color-blind friendly, perceptually uniform, and pretty. The default purple-green-yellow color scheme is lovely, but I’m a big fan of ‘magma’, which has a black-purple-orange-yellow scheme:

library(galah)
library(dplyr)
library(ggplot2)
library(viridis)
# Get field code for states/territories
search_fields("state") # layer: cl22 OR stateProvince
# A tibble: 14 × 3
   id                    description                                      type  
   <chr>                 <chr>                                            <chr> 
 1 cl22                  Australian States and Territories                fields
 2 cl927                 States including coastal waters                  fields
 3 cl938                 Fruit Fly Exclusion Zone - Tri State             fields
 4 cl2013                ASGS Australian States and Territories           fields
 5 cl10900               Australia's Indigenous forest estate (2013) v2.0 fields
 6 cl10922               PSMA State Electoral Boundaries (2018)           fields
 7 cl10925               PSMA States (2016)                               fields
 8 cl110922              PSMA State Electoral Boundary Classes (2018)     fields
 9 cl110925              PSMA States - Abbreviated (2016)                 fields
10 stateInvasive         <NA>                                             fields
11 stateProvince         State/Territory                                  fields
12 raw_stateProvince     State/Territory (unprocessed)                    fields
13 stateConservation     State conservation                               fields
14 raw_stateConservation State conservation (unprocessed)                 fields
# Download record counts by state/territory
records <- galah_call() %>%
  galah_group_by(cl22) %>%
  atlas_counts()

# Store state names as a factor whose level order follows the row order,
# so the bars are drawn in the order returned by atlas_counts()
records$State <- factor(seq_len(nrow(records)), labels = records$cl22) 

# Plot
ggplot(records, aes(x = State, y = log10(count), fill = count)) + 
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_fill_viridis(option = "magma", begin = 0.10, end = 0.95) +
  theme_bw() +
  theme(legend.position = "none")
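
As a small aside, the effect of `begin` and `end` can be checked without drawing a plot. The sketch below assumes the viridisLite package (listed in the session info) and uses its `magma()` function to compare the full scale with the trimmed range used above; the object names are only illustrative.

```r
library(viridisLite)

# Five colours from the full magma scale (near-black through to pale yellow)
full_magma <- magma(5)

# Five colours from the trimmed range used in the plot above, which drops
# the darkest and lightest ends of the scale
trimmed_magma <- magma(5, begin = 0.10, end = 0.95)

full_magma
trimmed_magma
```

Trimming the ends like this is a design choice: the darkest colours can disappear against dark outlines, and the lightest can fade into a white background.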

My default for categorical color schemes is the ‘Dark2’ palette from RColorBrewer, but given the subject matter of our work, it’s worth mentioning the wonderful feathers package by Shandiya Balasubramaniam, which provides palettes based on Australian bird plumage:

# remotes::install_github(repo = "shandiya/feathers")
library(feathers)

rcfd <- galah_call() %>%
  galah_identify("Rose-crowned Fruit-Dove") %>%
  galah_group_by(cl22) %>%
  atlas_counts()
  
rcfd$State <- factor(seq_len(nrow(rcfd)), labels = rcfd$cl22) 

ggplot(rcfd, aes(x = State, y = log10(count), fill = State)) + 
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_fill_manual(values = get_pal("rose_crowned_fruit_dove")) +
  theme_bw() +
  theme(legend.position = "none")
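
For comparison, the ‘Dark2’ palette mentioned above can be retrieved as a plain vector of hex codes. This is a minimal sketch assuming the RColorBrewer package is installed; the resulting vector could be passed to `scale_fill_manual()` in the same way as the feathers palette.

```r
library(RColorBrewer)

# Dark2 is a qualitative palette with a maximum of eight colours
dark2_pal <- brewer.pal(n = 8, name = "Dark2")
dark2_pal
```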

All of this is fine, but what if you have a specific image that you want to take colors from? A logical choice is to pick the colors you want using an image editing program, but if you want to try something automated, there are options in R as well.

Extracting colors

National Eucalypt Day aims to raise awareness about Eucalypts and celebrate their influence on the lives of Australians. In honour of National Eucalypt Day, we wanted to create a plot based on occurrence data held in the Atlas of Living Australia, themed using colours from actual Eucalypts.

We used this image from a tweet by Dean Nicolle:

Image of Eucalyptus salubris by Dean Nicolle

First, get observations of the Eucalypt of the Year 2021 from ALA using the galah package. Specifically, we use atlas_counts() to determine how many records of Eucalyptus salubris are held by the ALA:

n_records <- galah_call() %>%
  galah_identify("Eucalyptus salubris") %>%
  atlas_counts()

Here is what the data look like:

n_records %>% head()
# A tibble: 1 × 1
  count
  <int>
1   892

Then get a color scheme from an image of the species in question using the paletter package (which needs to be installed from GitHub):

# remotes::install_github("AndreaCirilloAC/paletter")
library(paletter)

image_pal <- create_palette(
  image_path = "./data/Dean_Nicolle_Esalubris_image_small.jpeg",
  type_of_variable = "categorical",
  number_of_colors = 15)

Note that we downsized the image before running the paletter code, as large images take much longer to process.
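
To give a rough sense of what automated extraction involves, here is a minimal sketch of one common approach: clustering pixel colours with k-means and using the cluster centres as the palette. This is an illustration under stated assumptions, not a description of paletter's internals; the "image" is simulated, and all object names are hypothetical.

```r
# Simulate a tiny "image": 300 pixels scattered around three base colours
# (synthetic data standing in for real pixel values)
set.seed(42)
base_cols <- rbind(c(0.80, 0.45, 0.20),   # orange-brown
                   c(0.30, 0.50, 0.25),   # green
                   c(0.60, 0.75, 0.90))   # pale blue
pixels <- base_cols[rep(1:3, each = 100), ] +
  matrix(rnorm(900, sd = 0.03), ncol = 3)
pixels <- pmin(pmax(pixels, 0), 1)        # keep values inside [0, 1]
colnames(pixels) <- c("r", "g", "b")

# Cluster pixels in RGB space; each cluster centre becomes one palette colour
clusters <- kmeans(pixels, centers = 3, nstart = 10)
sketch_pal <- rgb(clusters$centers[, "r"],
                  clusters$centers[, "g"],
                  clusters$centers[, "b"])
sketch_pal
```

Because every pixel takes part in the clustering, the cost grows with image size, which is consistent with large images taking much longer to process.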

Creating a plot

Once we have this palette, the obvious question is what kind of plot to draw. We could have done a map, but those can be a bit boring. We decided to try something that represented the number of observations we had of this species at ALA, and included color, but was otherwise just a pretty picture that didn’t need to contain any further information. Rather than have a traditional x and y axis, therefore, we decided to try out the igraph package to plot the points in an interesting way.

First, we create a vector containing as many points as we want to display, and distribute our colors among them as evenly as possible:

# create a vector to index colours
# (n_records is a one-row tibble, so extract the count column first)
rep_times <- floor(n_records$count / length(image_pal))

colour_index <- rep(seq_along(image_pal),
  each = as.integer(rep_times))
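
As a worked example of this arithmetic, the sketch below plugs in the record count shown earlier (892) and assumes the palette contains exactly the 15 colours that were requested. The variable names are illustrative only.

```r
n_total   <- 892   # record count returned by atlas_counts() above
n_colours <- 15    # number_of_colors requested from create_palette()

# Integer division gives the number of points assigned to each colour
points_per_colour <- n_total %/% n_colours
index <- rep(seq_len(n_colours), each = points_per_colour)

points_per_colour          # 59
length(index)              # 885
n_total - length(index)    # 7
```

Because of the rounding down, the plot shows slightly fewer points than there are records: 885 rather than 892.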

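Then we can create a network using igraph, and use it to create a layout for our points. Within each colour, consecutive points are linked into a chain, so points that share a colour end up close together in the layout: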

library(igraph)

graph_list <- lapply(seq_along(image_pal), function(a){
  lookup <- which(colour_index == a)
  return(
    tibble(
    from = lookup[c(1:(length(lookup)-1))],
    to = lookup[c(2:length(lookup))])
    )
  })
graph_df <- as_tibble(do.call(rbind, graph_list)) %>%     # build matrix
  tidyr::drop_na() %>%
  as.matrix(.)
colour_graph <- graph_from_edgelist(graph_df)             # create network graph

# convert to a set of point locations
test_layout <- as.data.frame(layout_nicely(colour_graph)) # convert to df
colnames(test_layout) <- c("x", "y")                      # change colnames
test_layout$colour_index <- factor(colour_index)          # add colour_index col
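
To make the structure of this network easier to see, here is a toy version with 3 colours and 4 points per colour. It is a sketch assuming only the igraph package; `toy_index`, `toy_edges` and `toy_graph` are hypothetical names used for illustration.

```r
library(igraph)

toy_index <- rep(1:3, each = 4)   # 3 colours, 4 points each

# Link consecutive points that share a colour: 1-2, 2-3, 3-4, 5-6, ...
toy_edges <- do.call(rbind, lapply(1:3, function(a) {
  lookup <- which(toy_index == a)
  cbind(from = head(lookup, -1), to = tail(lookup, -1))
}))

toy_graph <- graph_from_edgelist(toy_edges)

vcount(toy_graph)          # 12 vertices, one per point
ecount(toy_graph)          # 9 edges, three per chain
components(toy_graph)$no   # 3 separate chains, one per colour
```

Each colour forms its own disconnected chain, which is why a layout algorithm tends to place same-coloured points in clusters.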

Finally, we draw the plot with ggplot2, removing axes with theme_void():

ggplot(test_layout, aes(x = x, y = y, colour = colour_index)) +
  geom_point(size = 3, alpha = 0.9) +
  scale_color_manual(values = image_pal) +
  coord_fixed() +
  theme_void() +
  theme(legend.position = "none")

That’s it! While I like the effect here, I think the paletter package is best suited to cases where there are large areas of strongly contrasting colors; it’s less ideal for images with subtle color differences. It also doesn’t appear to have been updated lately, which may mean it’s not being supported any more. But I’m happy with this plot, and would definitely consider using it again.

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_Australia.utf8
 ctype    English_Australia.utf8
 tz       Australia/Sydney
 date     2024-02-12
 pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version    date (UTC) lib source
 dplyr       * 1.1.4      2023-11-17 [1] CRAN (R 4.3.2)
 feathers    * 0.0.0.9000 2022-10-11 [1] Github (shandiya/feathers@4be766d)
 galah       * 2.0.1      2024-02-06 [1] CRAN (R 4.3.2)
 ggplot2     * 3.4.4      2023-10-12 [1] CRAN (R 4.3.1)
 htmltools   * 0.5.7      2023-11-03 [1] CRAN (R 4.3.2)
 igraph      * 1.5.1      2023-08-10 [1] CRAN (R 4.3.2)
 paletter    * 0.0.0.9000 2023-01-10 [1] Github (AndreaCirilloAC/paletter@c09605b)
 sessioninfo * 1.2.2      2021-12-06 [1] CRAN (R 4.3.2)
 viridis     * 0.6.4      2023-07-22 [1] CRAN (R 4.3.2)
 viridisLite * 0.4.2      2023-05-02 [1] CRAN (R 4.3.1)

 [1] C:/Users/KEL329/R-packages
 [2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.3.2/library

──────────────────────────────────────────────────────────────────────────────