Each species has a habitat range where it normally lives and can expect to be found over its lifetime. However, individuals of a species rarely stay in the same spot for long periods of time. Just like us, they react to changes in their environment, interactions with other species, and interactions with other individuals.
As a result, it can be useful to see how a distribution of a species changes in space and over time. In marine environments, for example, seemingly small changes in temperature, chemicals and light can result in large changes to a species’ distribution.
Here we will map the distribution of Nudibranchia around Australia each month as an animated map to see how nudibranch distributions change over the year.
This post is inspired by Liam Bailey’s cool (and hilarious) Bigfoot distribution map. You can find his code here.
Download data
Occurrence data
Let’s first download observations of Nudibranchia across Australia.
We’ll load the necessary packages.
library(galah)
library(tidyverse)
library(glue)
library(lubridate)
library(stars) # Raster management
library(ozmaps) # Australian map
library(SSDM) # Linear modelling
library(sdmpredictors) # Environmental variables
library(grDevices) # Colours and fonts
library(maps) # Cities for map
library(tmaptools) # Create plot ratio
library(gifski) # Create GIF
library(knitr) # View GIF
Now we will use the {galah
} package to download observations of Nudibranchia.
You will need to provide a registered email with the ALA to galah_config()
before retrieving records.
# Add registered email (register at ala.org.au)
galah_config(email = "your-email@email.com")
# Download observations
<-
nudibranch_occurrences galah_call() |>
galah_identify("Nudibranchia") |>
galah_filter(country == "Australia") |>
galah_apply_profile(ALA) |> # ALA's set of data cleaning filters
atlas_occurrences()
Environmental variables
Now we will download our environmental variables for our model.
For our Nudibranchia model, we will use 4 common marine environmental variables:
- Sea surface temperature
- Sea surface salinity
- Distance to shore
- Bathymetry
To get them, we’ll use load_layers()
from the {sdmpredictors} package to download our variables as raster layers (geographic layers that have a value per pixel of our variable). We’ll use the rasterstack
argument to combine our layers into one object.
The {sdmpredictors} package has lots of data sets and layers available. Check out their website to learn more.
# Download variables
<- load_layers(layercodes = c("MS_biogeo08_sss_mean_5m",
env "MS_biogeo13_sst_mean_5m",
"MS_biogeo05_dist_shore_5m",
"MS_bathy_5m"),
equalarea = FALSE,
rasterstack = TRUE)
To prepare variable data for our model, we need to crop the geographical boundaries of our data to include only the coast (and surrounding ocean) of Australia. With the help of the {raster} package, we’ll use extent()
to set the outer boundaries and crop()
to remove the land.
# Create extent
<- raster::extent(100, 165, -45, -10)
aus_ext
# Limit environmental variables
<- raster::crop(env, aus_ext)
aus_env
# Check variables
plot(aus_env)
Prepare data
To construct our animated GIF, we can make each “frame” of our animation a species distribution map of each month - that means 12 maps, January to December.
In order to do this, we’ll create a set of custom functions that:
- Filter all observations to only observations of a specific month,
- Run a species distribution model on those observations
- Plot the results onto a map
- Save the maps
By making custom functions for these tasks, we’ll be able to run each function in a loop, letting us do each thing 12 times for each of our 12 months.
At the end, we’ll stitch our 12 maps together and, Voila! We’ll officially be animators (Pixar here we come!).
First we’ll filter out NA
values and duplicates (which might cause our model to error) and extract the month of observation into its own column.
# Clean, filter and convert time series to months
<-
occurrences_clean |>
nudibranch_occurrences filter(!is.na(decimalLatitude) & !is.na(decimalLongitude)) |>
filter(!duplicated(decimalLatitude) & !duplicated(decimalLongitude)) |>
mutate(month = month(eventDate)) |>
select(month, decimalLatitude, decimalLongitude)
head(occurrences_clean)
# A tibble: 6 × 3
month decimalLatitude decimalLongitude
<dbl> <dbl> <dbl>
1 9 -28.1 153.
2 11 -35.0 151.
3 11 -33.8 151.
4 NA -21.8 114.
5 10 -38.4 145.
6 11 -10.5 106.
From here, we’ll make our own function make_months_df()
that filters our overall observations to only those of our chosen month.
# Build function (for each month select the lat and long)
<- function(chosen_month) {
make_months_df <- occurrences_clean %>%
monthly_data filter(month == {{chosen_month}}) %>%
select(decimalLatitude, decimalLongitude)
}
With the help of purrr::map()
we can run a loop over our make_months_df()
function to return our 12 data.frame
s in one list
.
<- c(1:12)
n_months <- purrr::map(n_months, make_months_df)
month_list
1]] # See output of month 1 month_list[[
# A tibble: 2,299 × 2
decimalLatitude decimalLongitude
<dbl> <dbl>
1 -38.3 145.
2 -14.7 145.
3 -30.2 153.
4 -38.4 145.
5 -33.1 152.
6 -38.5 145.
7 -34.0 151.
8 -30.1 153.
9 -33.4 152.
10 -21.9 114.
# ℹ 2,289 more rows
Species Distribution Model
Now that month_list
contains our 12 data.frame
s, we can run some models with them to calculate a distribution surface.
To build our overall Species Distribution Model (SDM) we’ve chosen to use the method used by Liam Bailey in his Bigfoot map. It’s a fairly flexible model that suits our purposes to quickly see where nudibranchs are observed around Australia.
We’ll build another custom function to run these models called run_sdm_model()
for each chosen month.
This SDM method merges the results from several models into one final value using Fisher’s combined probability. It’s by no means the most robust SDM. If you are trying to make a more informative species distribution model, it might be worth considering other methods!
# Species distribution model function
<- function(chosen_month) {
run_sdm_model <- modelling("GLM",
SDM_GLM Occurrences = (chosen_month),
Env = aus_env,
Xcol = "decimalLongitude",
Ycol = "decimalLatitude",
verbose = FALSE
)<- modelling("MARS",
SDM_MARS Occurrences = (chosen_month),
Env = aus_env,
Xcol = "decimalLongitude",
Ycol = "decimalLatitude",
verbose = FALSE
)<- modelling("CTA",
SDM_CTA Occurrences = (chosen_month),
Env = aus_env,
Xcol = "decimalLongitude",
Ycol = "decimalLatitude",
verbose = FALSE
)
# Calculate single value using Fisher's combined probability
<- -2 * (log(SDM_MARS@projection) + log(SDM_GLM@projection) + log(SDM_CTA@projection))
combined <- function(x) {1 - pchisq(q = x, df = 6)}
Chi_sq <- raster::calc(combined, fun = Chi_sq)
combined_pval
# Convert to spatial object
<- stars::st_as_stars(combined_pval)
species_distribution return(species_distribution)
}
Now we can use purrr::map()
to run another loop to return results of our 12 SDMs.
# Run & save models
<- purrr::map(month_list, run_sdm_model) model_list
Map
We now have the results of our 12 SDMs in model_list
. We can use these results to make 12 maps.
To help orient ourselves, let’s download point data of the main cities in Australia from the {maps} package.
<- c("Sydney", "Melbourne", "Brisbane", "Cairns",
city_names "Canberra", "Adelaide", "Melbourne", "Perth", "Darwin")
<- world.cities |>
cities filter(country.etc == "Australia") |>
filter(name %in% city_names)
Let’s also make a nice colour palette:
<- c( "#184E77", "#1E6091", "#168AAD", "#34A0A4", "#52B69A",
blue_yellow "#76C893", "#99D98C", "#B5E48C", "#D9ED92")
<- colorRampPalette(blue_yellow)(50)
colour_palette ::print_pal(colour_palette) feathers
Now we are ready to make maps of our results!
We’ll once again make a custom function make_the_map()
to construct each map. This function not only constructs our maps, but adds each month’s 3-letter abbreviation month_label
to the top of each map for our eventual animation.
For this, let’s make some month labels:
# Get label for each month
<- month(1:12, label = TRUE)
month_label month_label
[1] Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
And now we’ll create our make_the_map()
function:
# Map making function (for monthly SDM build this map)
<- function(model_data, month_label) {
make_the_map
<- {{month_label}}
month
ggplot() +
geom_stars(data = model_data) +
geom_sf(data = ozmaps::ozmap_states,
colour = "#A9A793",
fill = "#C8C6AF") +
coord_sf(crs = "WGS84",
xlim = c(112, 154),
ylim = c(-43, -11)) +
scale_fill_gradientn(colours = c(colour_palette),
limits = c(0, 1),
guide = guide_colourbar(
title = "Occurrence\nprobability",
title.theme = element_text(
family = "Times New Roman",
colour = "#3D4040",
size = 10,
face = "bold"),
label.theme = element_text(
colour = "#3D4040",
size = 8),
ticks = FALSE,
frame.colour = "#3D4040",
title.position = "top",
title.vjust = 2,
label.position = "left"),
breaks = c(0, 0.5, 1),
labels = c("0%", "50%", "100%")) +
# Title map with month
labs(title = glue("{month_label}")) +
theme_void() +
theme(
legend.position = c(1.2, 0.2), # reposition legend
plot.title = element_text(size = 26),
legend.direction = "vertical",
plot.background = element_rect(fill = "#FFFFFF", colour = NA),
plot.margin = unit(c(0.01, 2.5, 0.1, 0.1), "cm")) +
# Add city points
geom_point(data = cities,
aes(x = long, y = lat),
size = 2,
colour = "#782717",
fill = "white",
shape = 21) +
# Add city labels
::geom_text_repel(data = cities,
ggrepelaes(x = long, y = lat, label = name),
colour = "white",
nudge_x = .12, nudge_y = .1,
hjust = "inward",
vjust = "inward",
fontface = "bold",
size = 6.8,
family = "Times New Roman")
}
We can use map2()
to make all of our maps:
# generate maps
<- model_list %>%
all_maps map2(
.x = .,
.y = month_label,
.f = make_the_map
)
Now to save our maps. We’ll assign our plots a letter so they are ordered alphabetically and saved in order in a new folder called maps
.
To figure out the best aspect ratio to save our maps, we’ll use the get_asp_ratio()
function from the {tmaptools} package, and use it calculate the width of our plots. Finally, we can use purrr::walk2()
to loop through ggsave()
and save our maps.
# set names of plots to save
<- as.list(letters[1:12]) # saves in correct order for gif
letter_id <- purrr::map(letter_id, ~glue("maps/map_{.x}.png"))
plotnames
# save plots
<- get_asp_ratio(model_list[[1]]) # aspect ratio
plot_ratio
walk2(plotnames, all_maps, ~ggsave(filename = .x,
plot = .y,
height = 9,
width = plot_ratio*10))
We should now have 12 species distribution maps saved, and we can see them by returning all the files in our maps
folder.
<- list.files(path = "maps/")
map_files map_files
[1] "map_a.png" "map_b.png" "map_c.png" "map_d.png" "map_e.png" "map_f.png"
[7] "map_g.png" "map_h.png" "map_i.png" "map_j.png" "map_k.png" "map_l.png"
Make GIF
For context before we see our animation, let’s first look at the distribution of all nudibranchs across Australia (this uses all the same code as above, but without all the looping)
Code
## Run the 3 models on our data
<- modelling("GLM",
SDM_GLM Occurrences = occurrences_clean,
Env = aus_env,
Xcol = 'decimalLongitude',
Ycol = 'decimalLatitude',
verbose = FALSE)
<- modelling("MARS",
SDM_MARS Occurrences = occurrences_clean,
Env = aus_env,
Xcol = 'decimalLongitude',
Ycol = 'decimalLatitude',
verbose = FALSE)
<- modelling("CTA",
SDM_CTA Occurrences = occurrences_clean,
Env = aus_env,
Xcol = 'decimalLongitude',
Ycol = 'decimalLatitude',
verbose = FALSE)
<- -2 * (log(SDM_MARS@projection) + log(SDM_GLM@projection) + log(SDM_CTA@projection))
combined <- function(x){1 - pchisq(q = x, df = 6)}
Chi_sq <- raster::calc(combined, fun = Chi_sq)
combined_pval <- stars::st_as_stars(combined_pval)
species_distribution
## MAP
ggplot() +
geom_stars(data = species_distribution) + # Plot SDM results
geom_sf(data = ozmaps::ozmap_country, # Add Australian map
colour = "grey",
fill = "#C8C6AF") +
coord_sf(crs = "WGS84", # Set geographical boundaries
xlim = c(112, 154),
ylim = c(-43, -11)) +
scale_fill_gradientn(
colours = c(colour_palette), # Use custom palette
limits = c(0, 1),
guide = guide_colourbar(
title = "Occurrence probability", # title of legend
title.theme = element_text( # style legend title
family = "Times New Roman",
colour = "#B3B6B6",
face = "bold",
size = 12),
label.theme = element_text( # style legend text
colour = "#B3B6B6",
size = 10),
ticks = FALSE,
frame.colour = "#B3B6B6",
title.position = "top"),
breaks = c(0, 0.5, 1),
labels = c("0%", "50%", "100%")
+
) theme_void() +
theme(
legend.position = c(0.2, 0.1),
legend.direction = "horizontal",
legend.key.size = unit(5, "mm"),
plot.background = element_rect(fill = "#F7F7F3", color = "#F16704"),
panel.border = element_rect(color = "#FFFFFF", fill = NA, size = 2)
)
Looks like there are nudibranchs along pretty much the entire coastline of Australia!
To finish our animation, let’s stick our 12 monthly maps together with the {gifski} package:
# Create animation
gifski(glue("maps/{map_files}"), gif_file = "SDM.gif", delay = 0.5,
width = ((plot_ratio*10)*96)*.8, height = (9*96)*.8) # correct ratios
::include_graphics("SDM.gif") knitr
We now have our animated GIF! Our animation shows that nudibranchs can be observed all year long, though there are some months where you are more likely to observe nudibranchs in more places than others.
However, when data are broken down into smaller and smaller groups (which often happens over the course of an entire analysis), we increase the chance of uncertainty in our results.
Uncertainty can grow when we use fewer observations to predict our distributions because with less information, our predictions are more strongly swayed by outliers. In our case, there are more observations of nudibranchs from October-January and fewer from May-August. Although it’s very possible uncertainty had an effect on the patterns we see in our final animation, you can’t tell from seeing our animation on its own!
Final thoughts
We hope you’ve felt the thrill of constructing your own stop-motion animation with {ggplot2} and {gifski}!
If you are interested in making animations of other types of plots, check out the {gganimate} package or the {plotly} package, too!
Expand for session info
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.3.2 (2023-10-31 ucrt)
os Windows 10 x64 (build 19045)
system x86_64, mingw32
ui RTerm
language (EN)
collate English_Australia.utf8
ctype English_Australia.utf8
tz Australia/Sydney
date 2024-02-12
pandoc 3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
abind * 1.4-5 2016-07-21 [1] CRAN (R 4.3.1)
dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.3.2)
forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.2)
galah * 2.0.1 2024-02-06 [1] CRAN (R 4.3.2)
ggplot2 * 3.4.4 2023-10-12 [1] CRAN (R 4.3.1)
gifski * 1.12.0-2 2023-08-12 [1] CRAN (R 4.3.2)
glue * 1.6.2 2022-02-24 [1] CRAN (R 4.3.2)
htmltools * 0.5.7 2023-11-03 [1] CRAN (R 4.3.2)
knitr * 1.45 2023-10-30 [1] CRAN (R 4.3.2)
lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.3.2)
maps * 3.4.1.1 2023-11-03 [1] CRAN (R 4.3.2)
ozmaps * 0.4.5 2021-08-03 [1] CRAN (R 4.3.2)
purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.2)
readr * 2.1.4 2023-02-10 [1] CRAN (R 4.3.2)
sdmpredictors * 0.2.15 2023-08-23 [1] CRAN (R 4.3.2)
sessioninfo * 1.2.2 2021-12-06 [1] CRAN (R 4.3.2)
sf * 1.0-14 2023-07-11 [1] CRAN (R 4.3.2)
SSDM * 0.2.9 2023-10-24 [1] CRAN (R 4.3.2)
stars * 0.6-4 2023-09-11 [1] CRAN (R 4.3.2)
stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.3.2)
tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.2)
tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.3.2)
tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.2)
tmaptools * 3.1-1 2021-01-19 [1] CRAN (R 4.3.2)
[1] C:/Users/KEL329/R-packages
[2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.3.2/library
──────────────────────────────────────────────────────────────────────────────