Combining multiple species distributions on one map with hexagons and points

Visualising multiple species distributions in a single figure can be difficult if there are areas where ranges overlap. In this post we introduce a way to show several species distributions in an area at once using a novel twist on the commonly used hexbin map.

Eukaryota
Animalia
Chordata
Aves
Maps
R
Authors

Callum Waite

Shandiya Balasubramaniam

Published

January 25, 2024

Author

Callum Waite
Shandiya Balasubramaniam

Date

25 January 2024

Visualisations of species distributions can be very simple yet effective ways of conveying biological and ecological information, such as range, habitat, and relative population size.

Representing more than one species distribution in a single figure can be difficult, though, especially where there are areas of overlap. Points and colour-filled polygons will obscure each other even with a degree of transparency, while densities and shaded regions can only show one species at a time.

Here, we demonstrate a method to visualise distributions of multiple species with overlapping ranges on the same map, with only a small loss in resolution. The technique is a novel twist on the commonly used hexbin map: instead of using a colour fill to represent presence/absence or counts within each hexagon, we use multiple coloured points within each hexagon to represent presence/absence of species, allowing users to get a broad overview of how multiple species are distributed across an area.

This method requires a number of steps to build up the elements of the final figure:

The final figure will comprise a combination of the basemap, hex grid, and species points once these elements are created.

Let’s begin by loading the R packages we’ll be using.

library(galah)
library(ggtext)
library(ozmaps)
library(sf)
library(showtext)
library(stringr)
library(tidyverse)

We’ll use the {galah} package to download occurrence records from the Atlas of Living Australia (ALA). To do this, you’ll need to register your email address with the ALA, then pass it to {galah} using galah_config().

galah_config(email = "your-email@email.com")

Download data

Since our goal here is to map distributions of multiple species, we’ve chosen honeyeaters from the genus Melithreptus: this is a distinctive group of 7 small- to medium-sized, short-billed and square-tailed honeyeaters with overlapping distributions across Australia.

We can get taxonomic information about this group using atlas_species()

melithreptus <- galah_call() |>
  galah_identify("Melithreptus") |>
  atlas_species()

melithreptus
# A tibble: 7 × 10
  species_guid            species author kingdom phylum class order family genus
  <chr>                   <chr>   <chr>  <chr>   <chr>  <chr> <chr> <chr>  <chr>
1 https://biodiversity.o… Melith… (Viei… Animal… Chord… Aves  Pass… Melip… Meli…
2 https://biodiversity.o… Melith… (Vigo… Animal… Chord… Aves  Pass… Melip… Meli…
3 https://biodiversity.o… Melith… Gould… Animal… Chord… Aves  Pass… Melip… Meli…
4 https://biodiversity.o… Melith… (Goul… Animal… Chord… Aves  Pass… Melip… Meli…
5 https://biodiversity.o… Melith… (Less… Animal… Chord… Aves  Pass… Melip… Meli…
6 https://biodiversity.o… Melith… (Goul… Animal… Chord… Aves  Pass… Melip… Meli…
7 https://biodiversity.o… Melith… Gould… Animal… Chord… Aves  Pass… Melip… Meli…
# ℹ 1 more variable: vernacular_name <chr>

… and then use this information to download occurrence records for the 7 species. We’ll apply a general set of ALA data quality filters to remove low quality records with galah_apply_profile(), and pass in the list of species we’re interested in with galah_identify(). We’ll also filter records to 20221, choosing only those with spatial coordinates and that fall within one of the IBRA bioregions (as a proxy for Australian records only).

species_occ <- galah_call() |>
  galah_apply_profile(ALA) |>
  galah_identify(melithreptus$species) |>
  galah_filter(year == 2022,
               !is.na(cl1048),  # IBRA bioregions
               !is.na(decimalLatitude),
               !is.na(decimalLongitude)) |>
  galah_select(decimalLatitude,
               decimalLongitude,
               species, 
               scientificName) |> 
  atlas_occurrences()

head(species_occ)
# A tibble: 6 × 4
  decimalLatitude decimalLongitude species                    scientificName    
            <dbl>            <dbl> <chr>                      <chr>             
1           -43.6             147. Melithreptus validirostris Melithreptus (Eid…
2           -43.6             147. Melithreptus validirostris Melithreptus (Eid…
3           -43.5             146. Melithreptus validirostris Melithreptus (Eid…
4           -43.5             147. Melithreptus affinis       Melithreptus (Mel…
5           -43.5             147. Melithreptus affinis       Melithreptus (Mel…
6           -43.5             147. Melithreptus validirostris Melithreptus (Eid…

Since we’re going to be performing a few spatial operations to assign species to hexagons, let’s convert the species_occ dataframe into a simple features (sf) object, with latitude and longitude columns represented as points in a geometry column named occ_geometry.

species_occ_sf <- species_occ |>
  st_as_sf(coords = c("decimalLongitude", "decimalLatitude"), 
           crs = 4326) |> 
  st_set_geometry("occ_geometry")

head(species_occ_sf)
Simple feature collection with 6 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 146.1362 ymin: -43.59761 xmax: 147.1435 ymax: -43.48602
Geodetic CRS:  WGS 84
# A tibble: 6 × 3
  species                    scientificName                    occ_geometry
  <chr>                      <chr>                              <POINT [°]>
1 Melithreptus validirostris Melithreptus (Eidopsarus… (146.8627 -43.59761)
2 Melithreptus validirostris Melithreptus (Eidopsarus… (146.8627 -43.59761)
3 Melithreptus validirostris Melithreptus (Eidopsarus… (146.1362 -43.50299)
4 Melithreptus affinis       Melithreptus (Melithrept… (147.1435 -43.49038)
5 Melithreptus affinis       Melithreptus (Melithrept… (146.9424 -43.48602)
6 Melithreptus validirostris Melithreptus (Eidopsarus… (146.9424 -43.48602)

Generate hex grid

Next, we’ll set up a grid of hexagons across Australia, which we’ll use as bins for plotting summaries of species occurrence.

st_make_grid() can make a grid that covers the bounding box of the supplied shapefile (here the ozmap_country shapefile), with arguments for specifying the size, type, and orientation of polygons in a grid. We’ll transform the projection to match the coordinate reference system we set for the species occurrence records above (4326), and assign a unique identifier to each hexagon in a column named hex_id.

hex_grid <- st_make_grid(ozmap_country,
                         cellsize = 2,
                         what = "polygons",
                         square = FALSE,
                         flat_topped = TRUE) |> 
  st_as_sf() |> 
  st_set_geometry("hex_geometry") |> 
  st_transform(4326) |> 
  rowid_to_column(var = "hex_id")
hex_grid
Simple feature collection with 703 features and 1 field
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 104.396 ymin: -45.63203 xmax: 170.7912 ymax: -7.632027
Geodetic CRS:  WGS 84
First 10 features:
   hex_id                   hex_geometry
1       1 POLYGON ((106.128 -44.63203...
2       2 POLYGON ((109.5921 -44.6320...
3       3 POLYGON ((113.0562 -44.6320...
4       4 POLYGON ((116.5203 -44.6320...
5       5 POLYGON ((119.9844 -44.6320...
6       6 POLYGON ((123.4485 -44.6320...
7       7 POLYGON ((126.9126 -44.6320...
8       8 POLYGON ((130.3767 -44.6320...
9       9 POLYGON ((133.8408 -44.6320...
10     10 POLYGON ((137.3049 -44.6320...

Our grid of hexagons looks like this:

Code
ggplot() +
  geom_sf(data = ozmap_states, 
          colour = "darkgrey", 
          fill = NA, 
          linewidth = 0.3) +
  geom_sf(data = hex_grid, 
          fill = NA, 
          col = "deepskyblue4", 
          linewidth = 0.2) +
  theme_void()

Remove empty hexes

You’ve probably noticed there are a lot of redundant hexagons in the grid we just created. Not every terrestrial hexagon will contain an occurrence record, and we can confidently assume hexagons in the ocean will not contain records of honeyeaters.

We’ll remove these empty hexagons with a spatial join (which behaves similarly to dplyr::left_join() for spatial objects). This returns a dataframe that has all the information from our original occurrence download, where each row is a record of a species in a particular location, but each record’s point location has now been matched to a hexagon from the grid we just created.

hex_with_species <- st_join(x = hex_grid, 
                            y = species_occ_sf,
                            join = st_intersects,
                            left = FALSE)

head(hex_with_species, n = 10)
Simple feature collection with 10 features and 3 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 145.9652 ymin: -44.63203 xmax: 148.2746 ymax: -42.63203
Geodetic CRS:  WGS 84
     hex_id                    species                          scientificName
32       32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.1     32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.2     32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.3     32       Melithreptus affinis     Melithreptus (Melithreptus) affinis
32.4     32       Melithreptus affinis     Melithreptus (Melithreptus) affinis
32.5     32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.6     32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.7     32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.8     32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.9     32       Melithreptus affinis     Melithreptus (Melithreptus) affinis
                       hex_geometry
32   POLYGON ((145.9652 -43.6320...
32.1 POLYGON ((145.9652 -43.6320...
32.2 POLYGON ((145.9652 -43.6320...
32.3 POLYGON ((145.9652 -43.6320...
32.4 POLYGON ((145.9652 -43.6320...
32.5 POLYGON ((145.9652 -43.6320...
32.6 POLYGON ((145.9652 -43.6320...
32.7 POLYGON ((145.9652 -43.6320...
32.8 POLYGON ((145.9652 -43.6320...
32.9 POLYGON ((145.9652 -43.6320...

This means any hexagons we initially created in the grid that don’t intersect with occurrence records have been removed:

Code
ggplot() +
  geom_sf(data = ozmap_states, 
          colour = "darkgrey", 
          fill = NA, 
          linewidth = 0.3) +
  geom_sf(data = count(hex_with_species, hex_id, hex_geometry), 
          fill = NA, 
          col = "deepskyblue4", 
          linewidth = 0.4) +
  theme_void()

Visualising multiple species in a hexagon

As some hexagons will contain occurrence records for more than one species, we need a way to display these overlaps. We’ll do this by setting up 7 positions in each hexagon, 1 for each species, and assign each species a position and colour so they can be visually differentiated.

The figure below summarises the process we’ll follow: for each hexagon remaining in the grid, we’ll generate a smaller hexagon, then get the coordinates of each vertex and centroid of the smaller hexagon. This gives us 7 positions to display up to 7 species in each hexagon.

Set up 7 points

Let’s start by extracting the unique identifiers and spatial coordinates for every hexagon containing an occurrence record2. Each hex_id refers to one of the remaining hexagons in our grid. This is step 1 from the figure above.

unique_hex <- hex_with_species |> 
  count(hex_id, hex_geometry) |> 
  select(-`n`)
unique_hex
Simple feature collection with 157 features and 1 field
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 113.0562 ymin: -44.63203 xmax: 155.2028 ymax: -9.632027
Geodetic CRS:  WGS 84
First 10 features:
   hex_id                   hex_geometry
1      32 POLYGON ((145.9652 -43.6320...
2      50 POLYGON ((144.2331 -42.6320...
3      51 POLYGON ((147.6972 -42.6320...
4      70 POLYGON ((145.9652 -41.6320...
5      88 POLYGON ((144.2331 -40.6320...
6      89 POLYGON ((147.6972 -40.6320...
7     107 POLYGON ((142.5011 -39.6320...
8     108 POLYGON ((145.9652 -39.6320...
9     125 POLYGON ((140.769 -38.63203...
10    126 POLYGON ((144.2331 -38.6320...

Next, we’ll work through steps 2 - 4. Let’s create a smaller hexagon within each original hex using st_buffer(), extract the coordinates of its vertices using st_coordinates(), and assign an integer to each vertex ranging from 1 to 73. We’ve created an anonymous function to pipe these steps together, and used pmap() to apply this function iteratively to every hexagon in the grid.

We use the dist argument in st_buffer() to define the size of the smaller hexagon, but this depends on the cellsize of the original larger hexagon (in the six-hexagon figure, cellsize = 2). Depending on the number of species you’d like to fit within each polygon and the shape of the polygon you’ve chosen, you may need to try out different values of cellsize and dist to find combinations that work best for your visualisation.

vertex_coords <- unique_hex |> 
  mutate(vertices = pmap(
    .l = list(x = hex_geometry),
    .f = function(x) {
      x |>
        st_buffer(dist = -0.4) |>         # STEP 2: set size of smaller hex
        st_coordinates() |>               # STEP 3: get vertex coordinates of smaller hex        
        as_tibble() |>                    # convert matrix to tibble  
        st_as_sf(coords = c("X", "Y")) |> # convert tibble to simple features
        select(-L1, -L2) |>               # remove unnecessary columns
        mutate(vertex_position = 1:7)     # STEP 4: number vertices 
    })) |> 
  unnest(cols = vertices)

head(vertex_coords, n = 10)
Simple feature collection with 10 features and 2 fields
Active geometry column: hex_geometry
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 144.2331 ymin: -44.63203 xmax: 148.2746 ymax: -41.63203
Geodetic CRS:  WGS 84
# A tibble: 10 × 4
   hex_id                 hex_geometry             geometry vertex_position
    <int>                <POLYGON [°]>              <POINT>           <int>
 1     32 ((145.9652 -43.63203, 146.5… (146.4271 -43.63203)               1
 2     32 ((145.9652 -43.63203, 146.5… (146.7735 -43.03203)               2
 3     32 ((145.9652 -43.63203, 146.5… (147.4663 -43.03203)               3
 4     32 ((145.9652 -43.63203, 146.5… (147.8127 -43.63203)               4
 5     32 ((145.9652 -43.63203, 146.5… (147.4663 -44.23203)               5
 6     32 ((145.9652 -43.63203, 146.5… (146.7735 -44.23203)               6
 7     32 ((145.9652 -43.63203, 146.5… (146.4271 -43.63203)               7
 8     50 ((144.2331 -42.63203, 144.8…  (144.695 -42.63203)               1
 9     50 ((144.2331 -42.63203, 144.8… (145.0414 -42.03203)               2
10     50 ((144.2331 -42.63203, 144.8… (145.7342 -42.03203)               3

In the resulting dataframe, the hex_id and hex_geometry columns contain the unique ID and geometry of the original large hexagons from the grid. Meanwhile, the remaining columns contain information for our newly created smaller hexagons: the geometry column contains the spatial coordinates of each hexagon’s respective corner vertex, and vertex_position identifies each vertex point.

We’d like to show information of 7 species in our hexagon, but despite having 7 points in vertex_position, the 7th point contains the same information as the 1st point. This is so that, when drawn by lines, the hexagon is closed. However, because we are only interested drawing points, we can mutate the duplicated row of the 7th vertex to hold the coordinates of the centroid of each hexagon instead. This will gives us seven distinct positions (step 5 in our six-hexagon figure).

vertex_centroid_coords <- vertex_coords |> 
  mutate(geometry = ifelse(vertex_position == 7,      
                           st_centroid(hex_geometry), 
                           geometry)) |> 
  st_drop_geometry() |>                               
  st_as_sf(crs = 4326)

head(vertex_centroid_coords, n = 10)
Simple feature collection with 10 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 144.695 ymin: -44.23203 xmax: 147.8127 ymax: -42.03203
Geodetic CRS:  WGS 84
# A tibble: 10 × 3
   hex_id             geometry vertex_position
    <int>          <POINT [°]>           <int>
 1     32 (146.4271 -43.63203)               1
 2     32 (146.7735 -43.03203)               2
 3     32 (147.4663 -43.03203)               3
 4     32 (147.8127 -43.63203)               4
 5     32 (147.4663 -44.23203)               5
 6     32 (146.7735 -44.23203)               6
 7     32 (147.1199 -43.63027)               7
 8     50  (144.695 -42.63203)               1
 9     50 (145.0414 -42.03203)               2
10     50 (145.7342 -42.03203)               3

Assign species to positions

The melithreptus dataframe (created earlier using atlas_species()) requires a small amount of tidying to be compatible with the rest of our data. The species column contains subgenera, which we can remove with regular expressions (regex)4. We also need to ensure all species have a vernacular name, noting that Melithreptus chloropsis is currently lacking one in the ALA database. We can then assign a number (1-7) denoting each species’ position in a hexagon5.

species_data <- melithreptus |>
  select(species, vernacular_name) |>
  mutate(species = str_replace_all(species, "\\(.*?\\)\\s*", ""),
         vernacular_name = if_else(species == "Melithreptus chloropsis", 
                                   "Gilbert's Honeyeater",
                                   vernacular_name),
         vertex_position = c(1:7))

species_data
# A tibble: 7 × 3
  species                    vernacular_name           vertex_position
  <chr>                      <chr>                               <int>
1 Melithreptus lunatus       White-naped Honeyeater                  1
2 Melithreptus brevirostris  Brown-headed Honeyeater                 2
3 Melithreptus albogularis   White-throated Honeyeater               3
4 Melithreptus gularis       Black-chinned Honeyeater                4
5 Melithreptus affinis       Black-headed Honeyeater                 5
6 Melithreptus validirostris Strong-billed Honeyeater                6
7 Melithreptus chloropsis    Gilbert's Honeyeater                    7

Our final step is to bring these three dataframes (hex_with_species, species_data, vertex_centroid_coords) together with dplyr::left_join().

We begin by joining our distinct hexagon and species combinations (hex_with_species) with our species positions and common names (species_data) using the species column…

species_points_a <- hex_with_species |>
  st_drop_geometry() |>
  select(hex_id, species) |>
  distinct() |> 
  left_join(species_data,
            by = join_by(species))

head(species_points_a, n = 10)
   hex_id                    species          vernacular_name vertex_position
1      32 Melithreptus validirostris Strong-billed Honeyeater               6
2      32       Melithreptus affinis  Black-headed Honeyeater               5
3      50 Melithreptus validirostris Strong-billed Honeyeater               6
4      50       Melithreptus affinis  Black-headed Honeyeater               5
5      51 Melithreptus validirostris Strong-billed Honeyeater               6
6      51       Melithreptus affinis  Black-headed Honeyeater               5
7      70       Melithreptus affinis  Black-headed Honeyeater               5
8      70 Melithreptus validirostris Strong-billed Honeyeater               6
9      88 Melithreptus validirostris Strong-billed Honeyeater               6
10     88       Melithreptus affinis  Black-headed Honeyeater               5

…and follow this with another join to get the point coordinates of each species’ point in each hexagon (vertex_centroid_coords), using the vertex_position and hex_id columns.

species_points <- species_points_a |>
  left_join(vertex_centroid_coords,
            by = join_by(vertex_position, hex_id)) |> 
  st_as_sf(crs = 4326)

head(species_points, n = 10)
Simple feature collection with 10 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 145.0414 ymin: -44.23203 xmax: 149.1983 ymax: -41.23203
Geodetic CRS:  WGS 84
   hex_id                    species          vernacular_name vertex_position
1      32 Melithreptus validirostris Strong-billed Honeyeater               6
2      32       Melithreptus affinis  Black-headed Honeyeater               5
3      50 Melithreptus validirostris Strong-billed Honeyeater               6
4      50       Melithreptus affinis  Black-headed Honeyeater               5
5      51 Melithreptus validirostris Strong-billed Honeyeater               6
6      51       Melithreptus affinis  Black-headed Honeyeater               5
7      70       Melithreptus affinis  Black-headed Honeyeater               5
8      70 Melithreptus validirostris Strong-billed Honeyeater               6
9      88 Melithreptus validirostris Strong-billed Honeyeater               6
10     88       Melithreptus affinis  Black-headed Honeyeater               5
                     geometry
1  POINT (146.7735 -44.23203)
2  POINT (147.4663 -44.23203)
3  POINT (145.0414 -43.23203)
4  POINT (145.7342 -43.23203)
5  POINT (148.5055 -43.23203)
6  POINT (149.1983 -43.23203)
7  POINT (147.4663 -42.23203)
8  POINT (146.7735 -42.23203)
9  POINT (145.0414 -41.23203)
10 POINT (145.7342 -41.23203)

Map

Let’s check how our three spatial layers—basemap, hexagons, and species points—look on a map.

ggplot() +
  geom_sf(data = ozmap_states, fill = NA) +
  geom_sf(data = unique_hex, fill = NA) +
  geom_sf(data = species_points, aes(colour = vernacular_name)) +
  lims(x = c(112, 155), y = c(-46, -8)) +
  theme_void()

This all looks correct! Now to add some final flourishes to make our map more aesthetically pleasing, as well as more accessible with a colourblind friendly palette by Paul Tol.

Code
font_add_google("Montserrat")
showtext_auto(enable = TRUE)

tol_muted <- c("#88CCEE", "#CC6677", "#DDCC77", "#117733", "#332288", "#AA4499", "#44AA99")

ggplot() +
  geom_sf(data = ozmap_states, 
          fill = NA, colour = "#ababab", linewidth = 0.3) +
  geom_sf(data = unique_hex, 
          fill = "#efefef55", colour = "#777777", linewidth = 0.5) +
  geom_sf(data = species_points, aes(colour = vernacular_name), 
          size = 2.3) +
  scale_colour_manual(
    values = tol_muted,
    guide = guide_legend(title = "*Melithreptus* &#0020; species",
                         override.aes = list(size = 4))
  ) +
  lims(x = c(112, 155), y = c(-46, -8)) +
  theme_void() +
  theme(legend.title = element_markdown(family = "Montserrat", size = 24),
        legend.text = element_text(family = "Montserrat", size = 20),
        legend.spacing.x = unit(0, "in"))

Final Thoughts

This visualisation is a novel way to show range overlaps and distributions of multiple species at once. A key strength is the consistency of the repeatable hex unit—the fixed positions and colours of the species points make it easy to follow patterns within or between species.

This is also a very flexible method. It’s easy to customise:

  1. the size, shape (hexagons vs squares) and orientation of the polygons
  2. the colours and orientations of points within the hexagons, and
  3. the spatial scale of the base map

Consider also that you do not necessarily need to use exactly seven different species/taxa—with a bit of creativity, it is possible to fit any number of points from 2-9 into a hexagon (2-7) or square symmetrically…

Expand for session info

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_Australia.utf8
 ctype    English_Australia.utf8
 tz       Australia/Sydney
 date     2024-02-12
 pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 dplyr       * 1.1.4   2023-11-17 [1] CRAN (R 4.3.2)
 forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.3.2)
 galah       * 2.0.1   2024-02-06 [1] CRAN (R 4.3.2)
 ggplot2     * 3.4.4   2023-10-12 [1] CRAN (R 4.3.1)
 ggtext      * 0.1.2   2022-09-16 [1] CRAN (R 4.3.2)
 htmltools   * 0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
 lubridate   * 1.9.3   2023-09-27 [1] CRAN (R 4.3.2)
 ozmaps      * 0.4.5   2021-08-03 [1] CRAN (R 4.3.2)
 patchwork   * 1.1.3   2023-08-14 [1] CRAN (R 4.3.1)
 purrr       * 1.0.2   2023-08-10 [1] CRAN (R 4.3.2)
 readr       * 2.1.4   2023-02-10 [1] CRAN (R 4.3.2)
 sessioninfo * 1.2.2   2021-12-06 [1] CRAN (R 4.3.2)
 sf          * 1.0-14  2023-07-11 [1] CRAN (R 4.3.2)
 showtext    * 0.9-6   2023-05-03 [1] CRAN (R 4.3.2)
 showtextdb  * 3.0     2020-06-04 [1] CRAN (R 4.3.2)
 stringr     * 1.5.1   2023-11-14 [1] CRAN (R 4.3.2)
 sysfonts    * 0.8.8   2022-03-13 [1] CRAN (R 4.3.2)
 tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.3.2)
 tidyr       * 1.3.0   2023-01-24 [1] CRAN (R 4.3.2)
 tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.3.2)

 [1] C:/Users/KEL329/R-packages
 [2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.3.2/library

──────────────────────────────────────────────────────────────────────────────

Footnotes

  1. There are over half a million records from this genus in the ALA, so restricting our download to records from 2022 significantly speeds things up!↩︎

  2. Using distinct() produces an identical result to count() here, but is far slower because checking for distinct values in the geometry column is computationally intensive. If your dataframe has fewer rows, you could also do this: hex_with_species |> select(hex_id, hex_geometry) |> distinct().↩︎

  3. Each hexagon is formed as a closed (rather than open) polygon, whereby the vertices are joined in the following order: 1-2-3-4-5-6-1. So although there are only 6 vertices, we get 7 sets of coordinates, with the first and seventh sets being duplicated to close the polygon.↩︎

  4. Regular expressions, or regex, are used to match specific patterns in strings. Here, we want to remove the inclusion of subgenera, parentheses, and any extra spaces in species names (e.g. "Melithreptus (Melithreptus) affinis" to "Melithreptus affinis"), and we do this using species = str_replace_all(species, "\\(.*?\\)\\s*", ""). We’re looking for a sequence that starts with an opening parenthesis (\\(), is followed by any characters (.*?), and ends with a closing parenthesis (\\)). Any spaces following the closing parenthesis (\\s*) are also matched. Such sequences are replaced with an empty string (""), effectively removing them.↩︎

  5. Here we assign the positions simply with vertex_position = c(1:7), however you can reorder the dataframe or this position vector to have more control over which point in the hexagon each species is assigned. For instance, you might wish to do this to separate similar colours within the hexagon, or to assign the most widely distributed species to the centre point.↩︎