The ability to predict where a species resides is important in conservation ecology, but when a species has very few existing observations (i.e. a data-deficient species), predicting its distribution can be difficult (or impossible) using standard methods for species distributions.
Convex hulls and alpha hulls are two ways to plot the spatial distribution of data-deficient species, making it possible to calculate metrics that help us predict whether a species is threatened to become extinct (i.e. IUCN metrics).
Recently, Dr. Marsh and colleagues used alpha hulls to estimate the impact of the 2020 mega bushfires on invertebrates in southern Australia. Since invertebrate data is inherently sparse, alpha hulls are really useful when you only have a handful of records to work with.
In this post, we’ll explain the difference between convex hulls and alpha hulls and show you how to make them yourself!
Download data
To illustrate the various spatial polygons you can make, let’s look at an invertebrate species from Dr. Marsh’s study: an endemic damselfly, Austroargiolestes calcaris, commonly known as the Powdered Flatwing
First we will load the R packages we’ll need:
# install.packages("pacman")
::p_load(remotes, galah, tidyverse, alphahull,
pacman sp, sf, ozmaps, patchwork)
Now let’s use galah
to download occurrence records from the Atlas of Living Australia (ALA). Note that you will need to first enter a registered email with the ALA using galah_config
before fetching records.
# Add registered email (register at ala.org.au)
galah_config(email = "your-email@email.com")
# Download Powdered flatwing records
galah_call() |>
galah_identify("Austroargiolestes calcaris") |>
galah_filter(profile="ALA") |>
galah_select(group = "basic") |>
atlas_occurrences() -> dfly
Retrying in 1 seconds.
# See first 10 rows
|> head(10L) dfly
# A tibble: 10 × 8
recordID scientificName taxonConceptID decimalLatitude decimalLongitude
<chr> <chr> <chr> <dbl> <dbl>
1 005e7dd7-884a… Austroargiole… https://biodi… -36.7 147.
2 00c274f7-9fbc… Austroargiole… https://biodi… -37.9 145.
3 00eb34ec-c46c… Austroargiole… https://biodi… -37.9 145.
4 00fabc50-3016… Austroargiole… https://biodi… -34.0 150.
5 012c74aa-0fd8… Austroargiole… https://biodi… -37.5 147.
6 013c14be-3d69… Austroargiole… https://biodi… -37.9 145.
7 0170fe4b-a054… Austroargiole… https://biodi… -36.9 148.
8 017fc3d9-e8c2… Austroargiole… https://biodi… -37.5 146.
9 0213326f-b3da… Austroargiole… https://biodi… -37.7 146.
10 02236ffc-4351… Austroargiole… https://biodi… -37.7 146.
# ℹ 3 more variables: eventDate <dttm>, occurrenceStatus <chr>,
# dataResourceName <chr>
Before we can plot anything, we will also need to remove all duplicated values and any NA values!
# Remove duplicates & NAs
|>
dfly filter(!duplicated(decimalLongitude) & !duplicated(decimalLatitude)) |>
filter(!is.na(decimalLongitude) & !is.na(decimalLatitude) ) -> dfly_clean
Map a convex hull
A convex hull is a way to draw around all the points of a species on a map with as few lines as possible. It’s defined as the smallest polygon that encloses all the points in the data set.
To plot a convex hull on a map, we can use chull()
to compute a convex hull from our cleaned Powdered Flatwing data. chull()
computes a series of points that make up our convex hull.
# Compute convex hull
|>
dfly_clean ::select(decimalLongitude, decimalLatitude) |>
dplyrchull() -> dfly_chull
dfly_chull
[1] 284 245 98 405 383 84 221 264 105 87 552
Next, we join the first point of the hull vector to the last point, creating a closed outline which can be plotted on a map.
# Join first point of hull to the last point
<- c(dfly_chull, dfly_chull[1]) dfly_chull_pts
Now we can get a map of Australia from the {ozmaps}
package and use st_transform()
to make sure it has the correct projection of 4326
.
# Get map of Australia
<- st_transform(ozmaps::ozmap_country, 4326) aus
And finally, we plot our Powdered Flatwing occurrence records and its convex hull on a map!
# Plot occurrences and convex hull
ggplot() +
geom_sf(data = aus,
colour = "black",
fill = "white") +
geom_point(data = dfly_clean,
mapping = aes(decimalLongitude, decimalLatitude),
colour = "black", size = 0.8) +
geom_polygon(data = dfly_clean[dfly_chull_pts, ],
mapping = aes(decimalLongitude, decimalLatitude),
fill = "orange",
colour = "black",
alpha = 0.5) +
coord_sf(xlim=c(142, 152),
ylim=c(-32,-44)) +
labs(title = "Convex hull",
x = "Longtitude (DD)",
y = "Latitude (DD)") + ## DD here stands for decimal degrees
theme_bw()
What is an alpha hull?
Like a convex hull, an alpha hull is also a way to draw the smallest polygon that encloses all the points in a data set. However, alpha hulls differ because they use an alpha parameter to control how tightly the boundary fits around a set of points. This method creates concave, arched edges that fit around occurrence records more tightly. A tighter boundary around our points helps us avoid over-predicting the range of a species.
To illustrate, here are three alpha hulls with increasing values for alpha. Notice as the alpha value increases, the tightness of our boundary decreases.
Code
# Compute alpha shapes and store in list column within a tibble
tibble(
alpha_value = c(1, 2, 5),
ahull_ls = map(.x = c(1, 2, 5),
.f = ~ dfly_clean |>
select(decimalLongitude, decimalLatitude) |>
ahull(alpha = .x))
-> dfly_ahulls
)
# Transform alpha hull to an `sp` object and set map projection to 4326
<- function(sp_obj){
set_map_proj @proj4string <- sp::CRS("EPSG:4326")
sp_obj
sp_obj
}
|>
dfly_ahulls mutate(ahull_sp = map(.x = ahull_ls,
.f = hull2spatial::ahull2poly),
ahull_sp = map(.x = ahull_sp,
.f = set_map_proj)
-> dfly_ahulls
)
# Transform `sp` object into a `sf` object
|>
dfly_ahulls mutate(ahull_sf = map(.x = ahull_sp,
.f = st_as_sf)
-> dfly_ahulls
)
# Transform occurrences into `sf` object for plotting
st_as_sf(dfly_clean,
coords = c("decimalLongitude", "decimalLatitude"),
crs = 4326) -> dfly_sf
## A function to compose map
<- function(ahull_sf, title = NULL){
plot_ahull_fun <- ggplot() +
p geom_sf(data = aus, colour = "black", fill = "white") +
geom_sf(data = dfly_sf, colour = "black", size = 0.5) +
geom_sf(data = ahull_sf, fill = "orange", alpha = 0.5) +
coord_sf(xlim=c(142, 152),ylim=c(-32,-44)) +
ggtitle(paste("a = ", as.character(title))) +
labs(x = "Longtitude (DD)", y = "Latitude (DD)") +
theme_bw(base_size = 12)
p
}
|>
dfly_ahulls mutate(ahull_maps = map2(.x = ahull_sf,
.y = alpha_value,
.f = ~ plot_ahull_fun(.x , .y))
-> dfly_ahulls )
Alpha = 2 is the alpha value we’ve most commonly come across in research, and is the value recommended by the IUCN for various forms of species vulnerability analysis.
So, let’s learn how to make the a = 2 plot above!
Map an alpha hull
To make an alpha hull, we will rely on the hull2spatial
package (developed by Cecina Babich Morrow). This package allows us to convert “ahull
” objects into ggplot-friendly objects (to learn more, check out their blog post about the package).
Install the package from GitHub using:
::install_github("babichmorrowc/hull2spatial")
remoteslibrary(hull2spatial)
To compute our alpha hull, we’ll provide the longitude and latitude coordinates of our data points to the ahull()
function, and set alpha = 2
. ahull()
creates a list
object with far more complexity than our convex hull. A nice way to understand the difference is to look at the first 5 rows of the arcs
component of our list dfly_ahull
, which stores information like the center and radius of each arch in our alpha hull.
# Compute an alpha hull
|>
dfly_clean ::select(decimalLongitude, decimalLatitude) |>
dplyrahull(alpha = 2) -> dfly_ahull
# See first 5 values of `arcs` component of list
$arcs |> head(5L) dfly_ahull
c1 c2 r v.x v.y theta end1 end2
[1,] 143.7540 -36.13036 2 0.7915500 -0.6111044 8.326528e-02 221 264
[2,] 145.2171 -35.22839 2 0.5608382 -0.8279254 4.870812e-01 264 524
[3,] 146.5099 -34.22352 2 0.5731316 -0.8194633 3.713225e-01 524 203
[4,] 148.1218 -33.33548 2 0.6121914 -0.7907096 6.333623e-01 203 105
[5,] 151.0611 -32.16307 2 -0.5081885 -0.8612458 4.673462e-05 105 87
Next we’ll transform our alpha hull and occurrence points into spatial objects for plotting.
The ahull2poly()
function converts our alpha hull to one type of spatial object (an sp
object), but we’ll use st_as_sf()
to convert our result to an sf
object (because it’s easier to plot) and set our map projection to 4326
. We’ll do the same for our damselfly occurrence points.
# Transform `ahull` into spatial object, convert to sf, set coordinates
::ahull2poly(dfly_ahull) |>
hull2spatialst_as_sf() |>
st_set_crs(st_crs(aus)) -> dfly_sf_ahull
# Convert occurrence points to `sf` for plotting
|>
dfly_clean st_as_sf(coords = c("decimalLongitude", "decimalLatitude"),
crs = 4326) -> dfly_sf
Finally, we can create our plot!
# Plot the occurrences and alpha hull
ggplot() +
geom_sf(data = aus, colour = "black", fill = "white") +
geom_sf(data = dfly_sf, colour = "black", size = 0.5) +
geom_sf(data = dfly_sf_ahull, fill = "orange", alpha = 0.5) +
coord_sf(xlim=c(142, 152),ylim=c(-32,-44)) +
ggtitle("Alpha hull") +
labs(x = "Longtitude (DD)", y = "Latitude (DD)") +
theme_bw()
Alpha hull with filtered observations
Citizen science data are often excluded from scientific analyses due to poor data quality e.g. rare species can be misidentified by someone who’s not an expert. Although a strict data criteria will reduce the number of data points, we can still compute and plot alpha hulls for the Powdered Flatwing - this is the beauty of them!
Let’s repeat exactly the same steps as above for generating an alpha hull, but only use a subset of our damselfly observations that comes from specimen data. We can do this by specifying the basisOfRecord
with galah_filter()
.
# Create a vector excluding human observations
<- c("PRESERVED_SPECIMEN", "LIVING_SPECIMEN",
institution_only "MACHINE_OBSERVATION", "MATERIAL_SAMPLE")
galah_call() |>
galah_identify("Austroargiolestes calcaris") |>
galah_filter(basisOfRecord == institution_only,
profile = "ALA") |>
galah_select(group = "basic") |>
atlas_occurrences() -> dfly_specionly
Below is our alpha hull of our specimen-only damselfly data. You’ll notice that there are two separate hulls in this map! This is another benefit of using an alpha hull over a convex hull. The mathematical constraints of a convex hull mean all points must be contained within a single polygon - this can lead to an over-estimation of a species’ range.
A real-life example
Alpha hulls, and their ability to generate multiple hulls when data is sparse, can help us understand how sensitive certain species are to environmental change over time, even when species have few existing observations.
For example, Dr. Takach and their team wanted to investigate how the distribution of mammals in the tropical savanna like Mesembriomys gouldii (the Black-Footed Tree Rat) shrink or expand in response to pressures like habitat loss and changing climate. Using alpha hulls, they found that the ecological niche of this species has shrunk due to a loss of suitable habitat over time.
The published paper didn’t provide a visual of this species’ distribution, so we’ve made a map below with ALA data to show the change in distribution over time:
Code
# Download records
galah_call() |>
galah_identify("Mesembriomys gouldii") |>
galah_filter(profile = "ALA") |>
galah_select(group = "basic") |>
atlas_occurrences() -> tree_rat
Retrying in 1 seconds.
Retrying in 2 seconds.
Code
# Remove duplicates and NAs
|>
tree_rat filter(! duplicated(decimalLongitude) & ! duplicated(decimalLatitude)) |>
filter(! is.na(decimalLongitude) & ! is.na(decimalLatitude) ) -> tree_ratclean
# Convert occurrence points to sf for plotting
|>
tree_ratclean st_as_sf(coords = c("decimalLongitude", "decimalLatitude"),
crs = 4326) -> tree_rat_sf
# Compute alpha hull
|>
tree_ratclean select(decimalLongitude, decimalLatitude) |>
ahull(alpha = 2) -> tree_rat_ahull
# Transform `ahull` to `sf`, set projection
::ahull2poly(tree_rat_ahull) |>
hull2spatialst_as_sf() |>
st_set_crs(st_crs(aus)) -> tree_rat_sf_ahull
# Convert occurrence points to sf for plotting
|>
tree_ratclean st_as_sf(coords = c("decimalLongitude", "decimalLatitude"),
crs = 4326) -> tree_rat_sf
# Get map of Australia & set projection
<- st_transform(ozmaps::ozmap_country, 4326)
aus
#---
# Download tree rat records after 2000
galah_call() |>
galah_identify("Mesembriomys gouldii") |>
galah_filter(profile = "ALA",
>= 2000,) |>
year galah_select(group = "basic") |>
atlas_occurrences() -> Rtree_rat
Retrying in 1 seconds.
Code
# Remove duplicates & exclude NAs
|>
Rtree_rat filter(!duplicated(decimalLongitude) & !duplicated(decimalLatitude)) |>
filter(!is.na(decimalLongitude) & !is.na(decimalLatitude) ) -> Rtree_ratclean
# Compute an alpha hull for our specimen only occurrences
|>
Rtree_ratclean select(decimalLongitude, decimalLatitude) |>
ahull(alpha = 2) -> Rtree_rat_ahull
# Transform `ahull` to sf, set coordinates
::ahull2poly(Rtree_rat_ahull) |>
hull2spatialst_as_sf() |>
st_set_crs(st_crs(aus)) -> Rtree_rat_sf_ahull
# Transform occurrence points to sf for plotting
|>
Rtree_ratclean st_as_sf(coords = c("decimalLongitude", "decimalLatitude"),
crs = 4326) -> Rtree_rat_sf
#---
# Nice title
<- expression(italic("Mesembriomys gouldii "), "alpha hulls")
rat_title
# Plot!
ggplot() +
geom_sf(data = aus, colour = "black", fill = "white") +
geom_sf(data = tree_rat_sf_ahull, aes(fill = "chartreuse3") ,alpha = 0.5, colour = "black", position = "identity") +
geom_sf(data = Rtree_rat_sf_ahull, aes(fill = "blueviolet"), alpha = 0.5, colour = "black", position = "identity") +
scale_fill_identity(guide = "legend",
name = "Record date ranges",
labels = c('2000 Onwards', 'All Records')) +
guides(colour = guide_legend(override.aes = list(alpha = 0.1))) +
coord_sf(xlim=c(125, 145),ylim=c(-20,-10)) +
ggtitle(rat_title) +
labs(x = "Longtitude (DD)", y = "Latitude (DD)") +
theme_bw() +
theme(legend.position = "bottom") -> combinedtree_rat_ahull_p
Final thoughts
While sophisticated tools for modelling species distribution exist, they require a lot of data to make reliable estimates. Convex polygons and alpha hulls are flexible alternatives that can help us understand dynamic changes to distributions of understudied or vulnerable data-deficient species.
Expand for session info
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.3.2 (2023-10-31 ucrt)
os Windows 10 x64 (build 19045)
system x86_64, mingw32
ui RTerm
language (EN)
collate English_Australia.utf8
ctype English_Australia.utf8
tz Australia/Sydney
date 2024-02-12
pandoc 3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
alphahull * 2.5 2022-06-16 [1] CRAN (R 4.3.2)
dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.3.2)
forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.2)
galah * 2.0.1 2024-02-06 [1] CRAN (R 4.3.2)
ggplot2 * 3.4.4 2023-10-12 [1] CRAN (R 4.3.1)
htmltools * 0.5.7 2023-11-03 [1] CRAN (R 4.3.2)
hull2spatial * 0.1.0 2022-10-12 [1] Github (babichmorrowc/hull2spatial@921594f)
lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.3.2)
ozmaps * 0.4.5 2021-08-03 [1] CRAN (R 4.3.2)
pacman * 0.5.1 2019-03-11 [1] CRAN (R 4.3.2)
patchwork * 1.1.3 2023-08-14 [1] CRAN (R 4.3.1)
purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.2)
readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.2)
remotes * 2.4.2.1 2023-07-18 [1] CRAN (R 4.3.2)
sessioninfo * 1.2.2 2021-12-06 [1] CRAN (R 4.3.2)
sf * 1.0-14 2023-07-11 [1] CRAN (R 4.3.2)
sp * 2.1-2 2023-11-26 [1] CRAN (R 4.3.2)
stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.3.2)
tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.2)
tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.2)
tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.2)
xaringanExtra * 0.7.0 2022-07-16 [1] CRAN (R 4.3.2)
[1] C:/Users/KEL329/R-packages
[2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.3.2/library
──────────────────────────────────────────────────────────────────────────────