Visualisations of species distributions can be very simple yet effective ways of conveying biological and ecological information, such as range, habitat, and relative population size.
Representing more than one species distribution in a single figure can be difficult, though, especially where there are areas of overlap. Points and colour-filled polygons will obscure each other even with a degree of transparency, while densities and shaded regions can only show one species at a time.
Here, we demonstrate a method to visualise distributions of multiple species with overlapping ranges on the same map, with only a small loss in resolution. The technique is a novel twist on the commonly used hexbin map: instead of using a colour fill to represent presence/absence or counts within each hexagon, we use multiple coloured points within each hexagon to represent presence/absence of species, allowing users to get a broad overview of how multiple species are distributed across an area.
This method requires a number of steps to build up the elements of the final figure:
- Getting occurrence records for species of interest
- Creating a hex grid over the area of occupancy for those species
- Assigning locations and colour to each species within each hex
The final figure will comprise a combination of the basemap, hex grid, and species points once these elements are created.
Let’s begin by loading the R packages we’ll be using.
library(galah)
library(ggtext)
library(ozmaps)
library(sf)
library(showtext)
library(stringr)
library(tidyverse)
We’ll use the {galah} package to download occurrence records from the Atlas of Living Australia (ALA). To do this, you’ll need to register your email address with the ALA, then pass it to {galah} using galah_config().
galah_config(email = "your-email@email.com")
Download data
Since our goal here is to map distributions of multiple species, we’ve chosen honeyeaters from the genus Melithreptus: this is a distinctive group of 7 small- to medium-sized, short-billed and square-tailed honeyeaters with overlapping distributions across Australia.
We can get taxonomic information about this group using atlas_species()…
melithreptus <- galah_call() |>
  galah_identify("Melithreptus") |>
  atlas_species()
melithreptus
# A tibble: 7 × 10
species_guid species author kingdom phylum class order family genus
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 https://biodiversity.o… Melith… (Viei… Animal… Chord… Aves Pass… Melip… Meli…
2 https://biodiversity.o… Melith… (Vigo… Animal… Chord… Aves Pass… Melip… Meli…
3 https://biodiversity.o… Melith… Gould… Animal… Chord… Aves Pass… Melip… Meli…
4 https://biodiversity.o… Melith… (Goul… Animal… Chord… Aves Pass… Melip… Meli…
5 https://biodiversity.o… Melith… (Less… Animal… Chord… Aves Pass… Melip… Meli…
6 https://biodiversity.o… Melith… (Goul… Animal… Chord… Aves Pass… Melip… Meli…
7 https://biodiversity.o… Melith… Gould… Animal… Chord… Aves Pass… Melip… Meli…
# ℹ 1 more variable: vernacular_name <chr>
… and then use this information to download occurrence records for the 7 species. We’ll apply a general set of ALA data quality filters to remove low-quality records with galah_apply_profile(), and pass in the list of species we’re interested in with galah_identify(). We’ll also filter records to 2022 [1], choosing only those with spatial coordinates and that fall within one of the IBRA bioregions (as a proxy for Australian records only).
species_occ <- galah_call() |>
  galah_apply_profile(ALA) |>
  galah_identify(melithreptus$species) |>
  galah_filter(year == 2022,
               !is.na(cl1048), # IBRA bioregions
               !is.na(decimalLatitude),
               !is.na(decimalLongitude)) |>
  galah_select(decimalLatitude,
               decimalLongitude,
               species,
               scientificName) |>
  atlas_occurrences()
head(species_occ)
# A tibble: 6 × 4
decimalLatitude decimalLongitude species scientificName
<dbl> <dbl> <chr> <chr>
1 -43.6 147. Melithreptus validirostris Melithreptus (Eid…
2 -43.6 147. Melithreptus validirostris Melithreptus (Eid…
3 -43.5 146. Melithreptus validirostris Melithreptus (Eid…
4 -43.5 147. Melithreptus affinis Melithreptus (Mel…
5 -43.5 147. Melithreptus affinis Melithreptus (Mel…
6 -43.5 147. Melithreptus validirostris Melithreptus (Eid…
Since we’re going to be performing a few spatial operations to assign species to hexagons, let’s convert the species_occ dataframe into a simple features (sf) object, with latitude and longitude columns represented as points in a geometry column named occ_geometry.
species_occ_sf <- species_occ |>
  st_as_sf(coords = c("decimalLongitude", "decimalLatitude"),
           crs = 4326) |>
  st_set_geometry("occ_geometry")
head(species_occ_sf)
Simple feature collection with 6 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 146.1362 ymin: -43.59761 xmax: 147.1435 ymax: -43.48602
Geodetic CRS: WGS 84
# A tibble: 6 × 3
species scientificName occ_geometry
<chr> <chr> <POINT [°]>
1 Melithreptus validirostris Melithreptus (Eidopsarus… (146.8627 -43.59761)
2 Melithreptus validirostris Melithreptus (Eidopsarus… (146.8627 -43.59761)
3 Melithreptus validirostris Melithreptus (Eidopsarus… (146.1362 -43.50299)
4 Melithreptus affinis Melithreptus (Melithrept… (147.1435 -43.49038)
5 Melithreptus affinis Melithreptus (Melithrept… (146.9424 -43.48602)
6 Melithreptus validirostris Melithreptus (Eidopsarus… (146.9424 -43.48602)
Generate hex grid
Next, we’ll set up a grid of hexagons across Australia, which we’ll use as bins for plotting summaries of species occurrence.
st_make_grid() can make a grid that covers the bounding box of the supplied shapefile (here the ozmap_country shapefile), with arguments for specifying the size, type, and orientation of polygons in a grid. We’ll transform the projection to match the coordinate reference system we set for the species occurrence records above (4326), and assign a unique identifier to each hexagon in a column named hex_id.
hex_grid <- st_make_grid(ozmap_country,
                         cellsize = 2,
                         what = "polygons",
                         square = FALSE,
                         flat_topped = TRUE) |>
  st_as_sf() |>
  st_set_geometry("hex_geometry") |>
  st_transform(4326) |>
  rowid_to_column(var = "hex_id")
hex_grid
Simple feature collection with 703 features and 1 field
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 104.396 ymin: -45.63203 xmax: 170.7912 ymax: -7.632027
Geodetic CRS: WGS 84
First 10 features:
hex_id hex_geometry
1 1 POLYGON ((106.128 -44.63203...
2 2 POLYGON ((109.5921 -44.6320...
3 3 POLYGON ((113.0562 -44.6320...
4 4 POLYGON ((116.5203 -44.6320...
5 5 POLYGON ((119.9844 -44.6320...
6 6 POLYGON ((123.4485 -44.6320...
7 7 POLYGON ((126.9126 -44.6320...
8 8 POLYGON ((130.3767 -44.6320...
9 9 POLYGON ((133.8408 -44.6320...
10 10 POLYGON ((137.3049 -44.6320...
Our grid of hexagons looks like this:
ggplot() +
geom_sf(data = ozmap_states,
colour = "darkgrey",
fill = NA,
linewidth = 0.3) +
geom_sf(data = hex_grid,
fill = NA,
col = "deepskyblue4",
linewidth = 0.2) +
theme_void()
Remove empty hexes
You’ve probably noticed there are a lot of redundant hexagons in the grid we just created. Not every terrestrial hexagon will contain an occurrence record, and we can confidently assume hexagons in the ocean will not contain records of honeyeaters.
We’ll remove these empty hexagons with a spatial join (which behaves similarly to dplyr::left_join() for spatial objects). This returns a dataframe that has all the information from our original occurrence download, where each row is a record of a species in a particular location, but each record’s point location has now been matched to a hexagon from the grid we just created.
hex_with_species <- st_join(x = hex_grid,
                            y = species_occ_sf,
                            join = st_intersects,
                            left = FALSE)
head(hex_with_species, n = 10)
Simple feature collection with 10 features and 3 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 145.9652 ymin: -44.63203 xmax: 148.2746 ymax: -42.63203
Geodetic CRS: WGS 84
hex_id species scientificName
32 32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.1 32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.2 32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.3 32 Melithreptus affinis Melithreptus (Melithreptus) affinis
32.4 32 Melithreptus affinis Melithreptus (Melithreptus) affinis
32.5 32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.6 32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.7 32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.8 32 Melithreptus validirostris Melithreptus (Eidopsarus) validirostris
32.9 32 Melithreptus affinis Melithreptus (Melithreptus) affinis
hex_geometry
32 POLYGON ((145.9652 -43.6320...
32.1 POLYGON ((145.9652 -43.6320...
32.2 POLYGON ((145.9652 -43.6320...
32.3 POLYGON ((145.9652 -43.6320...
32.4 POLYGON ((145.9652 -43.6320...
32.5 POLYGON ((145.9652 -43.6320...
32.6 POLYGON ((145.9652 -43.6320...
32.7 POLYGON ((145.9652 -43.6320...
32.8 POLYGON ((145.9652 -43.6320...
32.9 POLYGON ((145.9652 -43.6320...
This means any hexagons we initially created in the grid that don’t intersect with occurrence records have been removed:
ggplot() +
geom_sf(data = ozmap_states,
colour = "darkgrey",
fill = NA,
linewidth = 0.3) +
geom_sf(data = count(hex_with_species, hex_id, hex_geometry),
fill = NA,
col = "deepskyblue4",
linewidth = 0.4) +
theme_void()
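To see how left = FALSE drops empty cells, here is a minimal sketch on toy data. The three unit squares, the point locations, and the names grid, pts, and make_square are all hypothetical stand-ins for the hex grid and occurrence records, and no CRS is set to keep the example short.

```r
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is at (x0, 0)
make_square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1),
                        c(x0, 1), c(x0, 0))))
}

# Three adjacent cells standing in for the hex grid
grid <- st_sf(cell_id = 1:3,
              geometry = st_sfc(make_square(0), make_square(1), make_square(2)))

# Three toy records: two fall in cell 1, one in cell 3, none in cell 2
pts <- st_sf(species = c("sp_a", "sp_b", "sp_a"),
             geometry = st_sfc(st_point(c(0.5, 0.5)),
                               st_point(c(0.2, 0.8)),
                               st_point(c(2.5, 0.5))))

# Inner spatial join: one row per cell/record match, unmatched cells dropped
joined <- st_join(x = grid, y = pts, join = st_intersects, left = FALSE)
joined
```

Cell 2 contains no points, so it does not appear in joined, while cell 1 appears once for each of its two records.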
Visualising multiple species in a hexagon
As some hexagons will contain occurrence records for more than one species, we need a way to display these overlaps. We’ll do this by setting up 7 positions in each hexagon, 1 for each species, and assigning each species a position and colour so they can be visually differentiated.
The figure below summarises the process we’ll follow: for each hexagon remaining in the grid, we’ll generate a smaller hexagon, then get the coordinates of each vertex and centroid of the smaller hexagon. This gives us 7 positions to display up to 7 species in each hexagon.
Set up 7 points
Let’s start by extracting the unique identifiers and spatial coordinates for every hexagon containing an occurrence record [2]. Each hex_id refers to one of the remaining hexagons in our grid. This is step 1 from the figure above.
unique_hex <- hex_with_species |>
  count(hex_id, hex_geometry) |>
  select(-`n`)
unique_hex
Simple feature collection with 157 features and 1 field
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 113.0562 ymin: -44.63203 xmax: 155.2028 ymax: -9.632027
Geodetic CRS: WGS 84
First 10 features:
hex_id hex_geometry
1 32 POLYGON ((145.9652 -43.6320...
2 50 POLYGON ((144.2331 -42.6320...
3 51 POLYGON ((147.6972 -42.6320...
4 70 POLYGON ((145.9652 -41.6320...
5 88 POLYGON ((144.2331 -40.6320...
6 89 POLYGON ((147.6972 -40.6320...
7 107 POLYGON ((142.5011 -39.6320...
8 108 POLYGON ((145.9652 -39.6320...
9 125 POLYGON ((140.769 -38.63203...
10 126 POLYGON ((144.2331 -38.6320...
Next, we’ll work through steps 2 - 4. Let’s create a smaller hexagon within each original hex using st_buffer(), extract the coordinates of its vertices using st_coordinates(), and assign an integer to each vertex ranging from 1 to 7 [3]. We’ve created an anonymous function to pipe these steps together, and used pmap() to apply this function iteratively to every hexagon in the grid.
We use the dist argument in st_buffer() to define the size of the smaller hexagon, but this depends on the cellsize of the original larger hexagon (in the six-hexagon figure, cellsize = 2). Depending on the number of species you’d like to fit within each polygon and the shape of the polygon you’ve chosen, you may need to try out different values of cellsize and dist to find combinations that work best for your visualisation.
vertex_coords <- unique_hex |>
  mutate(vertices = pmap(
    .l = list(x = hex_geometry),
    .f = function(x) {
      x |>
        st_buffer(dist = -0.4) |>            # STEP 2: set size of smaller hex
        st_coordinates() |>                  # STEP 3: get vertex coordinates of smaller hex
        as_tibble() |>                       # convert matrix to tibble
        st_as_sf(coords = c("X", "Y")) |>    # convert tibble to simple features
        select(-L1, -L2) |>                  # remove unnecessary columns
        mutate(vertex_position = 1:7)        # STEP 4: number vertices
    })) |>
  unnest(cols = vertices)
head(vertex_coords, n = 10)
Simple feature collection with 10 features and 2 fields
Active geometry column: hex_geometry
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 144.2331 ymin: -44.63203 xmax: 148.2746 ymax: -41.63203
Geodetic CRS: WGS 84
# A tibble: 10 × 4
hex_id hex_geometry geometry vertex_position
<int> <POLYGON [°]> <POINT> <int>
1 32 ((145.9652 -43.63203, 146.5… (146.4271 -43.63203) 1
2 32 ((145.9652 -43.63203, 146.5… (146.7735 -43.03203) 2
3 32 ((145.9652 -43.63203, 146.5… (147.4663 -43.03203) 3
4 32 ((145.9652 -43.63203, 146.5… (147.8127 -43.63203) 4
5 32 ((145.9652 -43.63203, 146.5… (147.4663 -44.23203) 5
6 32 ((145.9652 -43.63203, 146.5… (146.7735 -44.23203) 6
7 32 ((145.9652 -43.63203, 146.5… (146.4271 -43.63203) 7
8 50 ((144.2331 -42.63203, 144.8… (144.695 -42.63203) 1
9 50 ((144.2331 -42.63203, 144.8… (145.0414 -42.03203) 2
10 50 ((144.2331 -42.63203, 144.8… (145.7342 -42.03203) 3
In the resulting dataframe, the hex_id and hex_geometry columns contain the unique ID and geometry of the original large hexagons from the grid. Meanwhile, the remaining columns contain information for our newly created smaller hexagons: the geometry column contains the spatial coordinates of each hexagon’s respective corner vertex, and vertex_position identifies each vertex point.
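To look at the negative buffer from step 2 in isolation, here is a minimal sketch on a single toy hexagon. The hexagon (circumradius 1, centred on the origin, no CRS) and the value dist = -0.2 are assumptions chosen for illustration, not the values used for the map.

```r
library(sf)

# Hypothetical toy hexagon: 6 vertices on the unit circle plus a closing point
angles <- seq(0, 2 * pi, length.out = 7)
ring <- cbind(cos(angles), sin(angles))
ring[7, ] <- ring[1, ]   # force the ring to close exactly
big_hex <- st_sfc(st_polygon(list(ring)))

# A negative distance shrinks the polygon inwards instead of growing it
small_hex <- st_buffer(big_hex, dist = -0.2)

# Compare the two areas; the buffered hexagon should be the smaller one
st_area(big_hex)
st_area(small_hex)
```

A more negative dist pulls the points closer to the centre of each hexagon, so it is worth adjusting alongside cellsize and the plotted point size.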
We’d like to show information for 7 species in our hexagon, but despite having 7 points in vertex_position, the 7th point contains the same information as the 1st point. This is so that, when drawn with lines, the hexagon is closed. However, because we are only interested in drawing points, we can mutate the duplicated row of the 7th vertex to hold the coordinates of the centroid of each hexagon instead. This gives us seven distinct positions (step 5 in our six-hexagon figure).
vertex_centroid_coords <- vertex_coords |>
  mutate(geometry = ifelse(vertex_position == 7,
                           st_centroid(hex_geometry),
                           geometry)) |>
  st_drop_geometry() |>
  st_as_sf(crs = 4326)
head(vertex_centroid_coords, n = 10)
Simple feature collection with 10 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 144.695 ymin: -44.23203 xmax: 147.8127 ymax: -42.03203
Geodetic CRS: WGS 84
# A tibble: 10 × 3
hex_id geometry vertex_position
<int> <POINT [°]> <int>
1 32 (146.4271 -43.63203) 1
2 32 (146.7735 -43.03203) 2
3 32 (147.4663 -43.03203) 3
4 32 (147.8127 -43.63203) 4
5 32 (147.4663 -44.23203) 5
6 32 (146.7735 -44.23203) 6
7 32 (147.1199 -43.63027) 7
8 50 (144.695 -42.63203) 1
9 50 (145.0414 -42.03203) 2
10 50 (145.7342 -42.03203) 3
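The duplicated seventh coordinate and its replacement by the centroid can be checked on a single polygon. This is a minimal sketch using a hypothetical toy hexagon with no CRS, working on plain coordinate matrices rather than the dataframe above.

```r
library(sf)

# Hypothetical toy hexagon centred on the origin
angles <- seq(0, 2 * pi, length.out = 7)
ring <- cbind(cos(angles), sin(angles))
ring[7, ] <- ring[1, ]   # closing point repeats the first vertex
hex <- st_sfc(st_polygon(list(ring)))

# A closed hexagon returns 7 coordinate rows for its 6 vertices
coords <- st_coordinates(hex)
nrow(coords)

# Keep the 6 distinct vertices and use the centroid as the 7th position
centre <- st_coordinates(st_centroid(hex))
positions <- rbind(coords[1:6, c("X", "Y")], centre)
positions
```

The result is seven distinct point locations per hexagon: six corners and one centre.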
Assign species to positions
The melithreptus dataframe (created earlier using atlas_species()) requires a small amount of tidying to be compatible with the rest of our data. The species column contains subgenera, which we can remove with regular expressions (regex) [4]. We also need to ensure all species have a vernacular name, noting that Melithreptus chloropsis is currently lacking one in the ALA database. We can then assign a number (1-7) denoting each species’ position in a hexagon [5].
species_data <- melithreptus |>
  select(species, vernacular_name) |>
  mutate(species = str_replace_all(species, "\\(.*?\\)\\s*", ""),
         vernacular_name = if_else(species == "Melithreptus chloropsis",
                                   "Gilbert's Honeyeater",
                                   vernacular_name),
         vertex_position = c(1:7))
species_data
# A tibble: 7 × 3
species vernacular_name vertex_position
<chr> <chr> <int>
1 Melithreptus lunatus White-naped Honeyeater 1
2 Melithreptus brevirostris Brown-headed Honeyeater 2
3 Melithreptus albogularis White-throated Honeyeater 3
4 Melithreptus gularis Black-chinned Honeyeater 4
5 Melithreptus affinis Black-headed Honeyeater 5
6 Melithreptus validirostris Strong-billed Honeyeater 6
7 Melithreptus chloropsis Gilbert's Honeyeater 7
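The subgenus-stripping pattern can be tried on its own before applying it to the full dataframe. This is a minimal sketch: the first two strings follow the name format seen in the occurrence output above, and the third is an assumed example of a name that has no subgenus to remove.

```r
library(stringr)

raw_names <- c("Melithreptus (Melithreptus) affinis",
               "Melithreptus (Eidopsarus) validirostris",
               "Melithreptus lunatus")   # assumed input with nothing to strip

# Remove a parenthesised subgenus and any whitespace that follows it
clean_names <- str_replace_all(raw_names, "\\(.*?\\)\\s*", "")
clean_names
```

Names without parentheses pass through unchanged, so the same call is safe to apply to the whole species column.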
Our final step is to bring these three dataframes (hex_with_species, species_data, vertex_centroid_coords) together with dplyr::left_join().
We begin by joining our distinct hexagon and species combinations (hex_with_species) with our species positions and common names (species_data) using the species column…
species_points_a <- hex_with_species |>
  st_drop_geometry() |>
  select(hex_id, species) |>
  distinct() |>
  left_join(species_data,
            by = join_by(species))
head(species_points_a, n = 10)
hex_id species vernacular_name vertex_position
1 32 Melithreptus validirostris Strong-billed Honeyeater 6
2 32 Melithreptus affinis Black-headed Honeyeater 5
3 50 Melithreptus validirostris Strong-billed Honeyeater 6
4 50 Melithreptus affinis Black-headed Honeyeater 5
5 51 Melithreptus validirostris Strong-billed Honeyeater 6
6 51 Melithreptus affinis Black-headed Honeyeater 5
7 70 Melithreptus affinis Black-headed Honeyeater 5
8 70 Melithreptus validirostris Strong-billed Honeyeater 6
9 88 Melithreptus validirostris Strong-billed Honeyeater 6
10 88 Melithreptus affinis Black-headed Honeyeater 5
…and follow this with another join to get the point coordinates of each species’ point in each hexagon (vertex_centroid_coords), using the vertex_position and hex_id columns.
species_points <- species_points_a |>
  left_join(vertex_centroid_coords,
            by = join_by(vertex_position, hex_id)) |>
  st_as_sf(crs = 4326)
head(species_points, n = 10)
Simple feature collection with 10 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 145.0414 ymin: -44.23203 xmax: 149.1983 ymax: -41.23203
Geodetic CRS: WGS 84
hex_id species vernacular_name vertex_position
1 32 Melithreptus validirostris Strong-billed Honeyeater 6
2 32 Melithreptus affinis Black-headed Honeyeater 5
3 50 Melithreptus validirostris Strong-billed Honeyeater 6
4 50 Melithreptus affinis Black-headed Honeyeater 5
5 51 Melithreptus validirostris Strong-billed Honeyeater 6
6 51 Melithreptus affinis Black-headed Honeyeater 5
7 70 Melithreptus affinis Black-headed Honeyeater 5
8 70 Melithreptus validirostris Strong-billed Honeyeater 6
9 88 Melithreptus validirostris Strong-billed Honeyeater 6
10 88 Melithreptus affinis Black-headed Honeyeater 5
geometry
1 POINT (146.7735 -44.23203)
2 POINT (147.4663 -44.23203)
3 POINT (145.0414 -43.23203)
4 POINT (145.7342 -43.23203)
5 POINT (148.5055 -43.23203)
6 POINT (149.1983 -43.23203)
7 POINT (147.4663 -42.23203)
8 POINT (146.7735 -42.23203)
9 POINT (145.0414 -41.23203)
10 POINT (145.7342 -41.23203)
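The logic of the two joins is easier to follow on a toy example with plain tibbles. In this sketch the tables presence, slots, and coords, and all their values, are hypothetical stand-ins for hex_with_species, species_data, and vertex_centroid_coords, with x and y columns used in place of a geometry column.

```r
library(dplyr)

# Which species occur in which hexagon (one row per combination)
presence <- tibble(hex_id = c(1L, 1L, 2L),
                   species = c("sp_a", "sp_b", "sp_a"))

# Fixed position assigned to each species
slots <- tibble(species = c("sp_a", "sp_b"),
                vertex_position = 1:2)

# Coordinates of every position in every hexagon
coords <- tibble(hex_id = rep(1:2, each = 2),
                 vertex_position = rep(1:2, times = 2),
                 x = c(0, 1, 10, 11),
                 y = c(0, 0, 5, 5))

# Join 1 adds the species' position; join 2 looks up that position's coordinates
toy_points <- presence |>
  left_join(slots, by = join_by(species)) |>
  left_join(coords, by = join_by(vertex_position, hex_id))
toy_points
```

Because each species always takes the same position, sp_a lands on position 1 in both hexagons, and only positions belonging to species present in a hexagon are kept.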
Map
Let’s check how our three spatial layers—basemap, hexagons, and species points—look on a map.
ggplot() +
geom_sf(data = ozmap_states, fill = NA) +
geom_sf(data = unique_hex, fill = NA) +
geom_sf(data = species_points, aes(colour = vernacular_name)) +
lims(x = c(112, 155), y = c(-46, -8)) +
theme_void()
This all looks correct! Now let’s add some final flourishes to make our map more aesthetically pleasing, and more accessible by using a colourblind-friendly palette by Paul Tol.
font_add_google("Montserrat")
showtext_auto(enable = TRUE)
tol_muted <- c("#88CCEE", "#CC6677", "#DDCC77", "#117733", "#332288", "#AA4499", "#44AA99")
ggplot() +
  geom_sf(data = ozmap_states,
          fill = NA, colour = "#ababab", linewidth = 0.3) +
  geom_sf(data = unique_hex,
          fill = "#efefef55", colour = "#777777", linewidth = 0.5) +
  geom_sf(data = species_points, aes(colour = vernacular_name),
          size = 2.3) +
  scale_colour_manual(
    values = tol_muted,
    guide = guide_legend(title = "*Melithreptus*  species",
                         override.aes = list(size = 4))
  ) +
  lims(x = c(112, 155), y = c(-46, -8)) +
  theme_void() +
  theme(legend.title = element_markdown(family = "Montserrat", size = 24),
        legend.text = element_text(family = "Montserrat", size = 20),
        legend.spacing.x = unit(0, "in"))
Final Thoughts
This visualisation is a novel way to show range overlaps and distributions of multiple species at once. A key strength is the consistency of the repeatable hex unit—the fixed positions and colours of the species points make it easy to follow patterns within or between species.
This is also a very flexible method. It’s easy to customise:
- the size, shape (hexagons vs squares) and orientation of the polygons
- the colours and orientations of points within the hexagons, and
- the spatial scale of the base map
Consider also that you do not necessarily need to use exactly seven different species/taxa—with a bit of creativity, it is possible to fit any number of points from 2-9 into a hexagon (2-7) or square symmetrically…
Session info
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.3.2 (2023-10-31 ucrt)
os Windows 10 x64 (build 19045)
system x86_64, mingw32
ui RTerm
language (EN)
collate English_Australia.utf8
ctype English_Australia.utf8
tz Australia/Sydney
date 2024-02-12
pandoc 3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.3.2)
forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.2)
galah * 2.0.1 2024-02-06 [1] CRAN (R 4.3.2)
ggplot2 * 3.4.4 2023-10-12 [1] CRAN (R 4.3.1)
ggtext * 0.1.2 2022-09-16 [1] CRAN (R 4.3.2)
htmltools * 0.5.7 2023-11-03 [1] CRAN (R 4.3.2)
lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.3.2)
ozmaps * 0.4.5 2021-08-03 [1] CRAN (R 4.3.2)
patchwork * 1.1.3 2023-08-14 [1] CRAN (R 4.3.1)
purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.2)
readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.2)
sessioninfo * 1.2.2 2021-12-06 [1] CRAN (R 4.3.2)
sf * 1.0-14 2023-07-11 [1] CRAN (R 4.3.2)
showtext * 0.9-6 2023-05-03 [1] CRAN (R 4.3.2)
showtextdb * 3.0 2020-06-04 [1] CRAN (R 4.3.2)
stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.3.2)
sysfonts * 0.8.8 2022-03-13 [1] CRAN (R 4.3.2)
tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.2)
tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.2)
tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.2)
[1] C:/Users/KEL329/R-packages
[2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.3.2/library
──────────────────────────────────────────────────────────────────────────────
Footnotes
[1] There are over half a million records from this genus in the ALA, so restricting our download to records from 2022 significantly speeds things up!
[2] Using distinct() produces an identical result to count() here, but is far slower because checking for distinct values in the geometry column is computationally intensive. If your dataframe has fewer rows, you could also do this: hex_with_species |> select(hex_id, hex_geometry) |> distinct().
[3] Each hexagon is formed as a closed (rather than open) polygon, whereby the vertices are joined in the following order: 1-2-3-4-5-6-1. So although there are only 6 vertices, we get 7 sets of coordinates, with the first and seventh sets being duplicated to close the polygon.
[4] Regular expressions, or regex, are used to match specific patterns in strings. Here, we want to remove the inclusion of subgenera, parentheses, and any extra spaces in species names (e.g. "Melithreptus (Melithreptus) affinis" to "Melithreptus affinis"), and we do this using species = str_replace_all(species, "\\(.*?\\)\\s*", ""). We’re looking for a sequence that starts with an opening parenthesis (\\(), is followed by any characters (.*?), and ends with a closing parenthesis (\\)). Any spaces following the closing parenthesis (\\s*) are also matched. Such sequences are replaced with an empty string (""), effectively removing them.
[5] Here we assign the positions simply with vertex_position = c(1:7); however, you can reorder the dataframe or this position vector to have more control over which point in the hexagon each species is assigned to. For instance, you might wish to do this to separate similar colours within the hexagon, or to assign the most widely distributed species to the centre point.