Author

Stephanie Woolley
Olivia Torresan
Dax Kellie

Date

14 March 2023

Intern Post

Each species has a habitat range where it normally lives and can expect to be found over its lifetime. However, individuals of a species rarely stay in the same spot for long periods of time. Just like us, they react to changes in their environment, interactions with other species, and interactions with other individuals.

As a result, it can be useful to see how a distribution of a species changes in space and over time. In marine environments, for example, seemingly small changes in temperature, chemicals and light can result in large changes to a species’ distribution.

Here we will map the distribution of Nudibranchia around Australia each month as an animated map to see how nudibranch distributions change over the year.

This post is inspired by Liam Bailey’s cool (and hilarious) Bigfoot distribution map. You can find his code here.

Download data

Occurrence data

Let’s first download observations of Nudibranchia across Australia.

Left: Doriprismatica atromarginata (diana88jingfung CC-BY-NC 4.0 (Int)) Middle: Ceratosoma amoenum (Erik Schlogl CC-BY-NC 4.0 (Int)) Right: Pteraeolidia ianthina (Jallitt CC-BY-NC 4.0 (Int))

We’ll load the necessary packages.

library(galah)
library(tidyverse)
library(glue)
library(lubridate)
library(stars)         # Raster management 
library(ozmaps)        # Australian map
library(SSDM)          # Linear modelling
library(sdmpredictors) # Environmental variables 
library(grDevices)     # Colours and fonts
library(maps)          # Cities for map
library(tmaptools)     # Create plot ratio
library(gifski)        # Create GIF
library(knitr)         # View GIF

Now we will use the {galah} package to download observations of Nudibranchia.

You will need to provide a registered email with the ALA to galah_config() before retrieving records.

# Add registered email (register at ala.org.au)
galah_config(email = "your-email@email.com")

# Download observations
nudibranch_occurrences <- 
  galah_call() |>                               
  galah_identify("Nudibranchia") |>   
  galah_filter(country == "Australia") |>
  galah_apply_profile(ALA) |> # ALA's set of data cleaning filters
  atlas_occurrences()

Environmental variables

Now we will download our environmental variables for our model.

For our Nudibranchia model, we will use 4 common marine environmental variables:

Sea surface temperature
Sea surface salinity
Distance to shore
Bathymetry

To get them, we’ll use load_layers() from the {sdmpredictors} package to download our variables as raster layers (geographic layers that have a value per pixel of our variable). We’ll use the rasterstack argument to combine our layers into one object.

Note

The {sdmpredictors} package has lots of data sets and layers available. Check out their website to learn more.

# Download variables
env <- load_layers(layercodes = c("MS_biogeo08_sss_mean_5m", 
                                  "MS_biogeo13_sst_mean_5m", 
                                  "MS_biogeo05_dist_shore_5m", 
                                  "MS_bathy_5m"), 
                   equalarea = FALSE, 
                   rasterstack = TRUE)

To prepare variable data for our model, we need to crop the geographical boundaries of our data to include only the coast (and surrounding ocean) of Australia. With the help of the {raster} package, we’ll use extent() to set the outer boundaries and crop() to remove the land.

# Create extent
aus_ext <- raster::extent(100, 165, -45, -10)

# Limit environmental variables
aus_env <- raster::crop(env, aus_ext) 

# Check variables 
plot(aus_env)

Prepare data

To construct our animated GIF, we can make each “frame” of our animation a species distribution map of each month - that means 12 maps, January to December.

In order to do this, we’ll create a set of custom functions that:

Filter all observations to only observations of a specific month,
Run a species distribution model on those observations
Plot the results onto a map
Save the maps

By making custom functions for these tasks, we’ll be able to run each function in a loop, letting us do each thing 12 times for each of our 12 months.

At the end, we’ll stitch our 12 maps together and, Voila! We’ll officially be animators (Pixar here we come!).

First we’ll filter out NA values and duplicates (which might cause our model to error) and extract the month of observation into its own column.

# Clean, filter and convert time series to months 
occurrences_clean <- 
  nudibranch_occurrences |> 
  filter(!is.na(decimalLatitude) & !is.na(decimalLongitude)) |>
  filter(!duplicated(decimalLatitude) & !duplicated(decimalLongitude)) |>
  mutate(month = month(eventDate)) |>
  select(month, decimalLatitude, decimalLongitude) 

head(occurrences_clean)

# A tibble: 6 × 3
  month decimalLatitude decimalLongitude
  <dbl>           <dbl>            <dbl>
1     9           -28.1             153.
2    11           -35.0             151.
3    11           -33.8             151.
4    NA           -21.8             114.
5    10           -38.4             145.
6    11           -10.5             106.

From here, we’ll make our own function make_months_df() that filters our overall observations to only those of our chosen month.

# Build function (for each month select the lat and long)
make_months_df <- function(chosen_month) {
  monthly_data <- occurrences_clean %>% 
    filter(month == {{chosen_month}}) %>%
    select(decimalLatitude, decimalLongitude)
}

With the help of purrr::map() we can run a loop over our make_months_df() function to return our 12 data.frames in one list.

n_months <- c(1:12)
month_list <- purrr::map(n_months, make_months_df)

month_list[[1]] # See output of month 1

# A tibble: 2,299 × 2
   decimalLatitude decimalLongitude
             <dbl>            <dbl>
 1           -38.3             145.
 2           -14.7             145.
 3           -30.2             153.
 4           -38.4             145.
 5           -33.1             152.
 6           -38.5             145.
 7           -34.0             151.
 8           -30.1             153.
 9           -33.4             152.
10           -21.9             114.
# ℹ 2,289 more rows

Species Distribution Model

Now that month_list contains our 12 data.frames, we can run some models with them to calculate a distribution surface.

To build our overall Species Distribution Model (SDM) we’ve chosen to use the method used by Liam Bailey in his Bigfoot map. It’s a fairly flexible model that suits our purposes to quickly see where nudibranchs are observed around Australia.

We’ll build another custom function to run these models called run_sdm_model() for each chosen month.

Note

This SDM method merges the results from several models into one final value using Fisher’s combined probability. It’s by no means the most robust SDM. If you are trying to make a more informative species distribution model, it might be worth considering other methods!

# Species distribution model function
run_sdm_model <- function(chosen_month) {
  SDM_GLM <- modelling("GLM",
    Occurrences = (chosen_month),
    Env = aus_env,
    Xcol = "decimalLongitude",
    Ycol = "decimalLatitude",
    verbose = FALSE
  )
  SDM_MARS <- modelling("MARS",
    Occurrences = (chosen_month),
    Env = aus_env,
    Xcol = "decimalLongitude",
    Ycol = "decimalLatitude",
    verbose = FALSE
  )
  SDM_CTA <- modelling("CTA",
    Occurrences = (chosen_month),
    Env = aus_env,
    Xcol = "decimalLongitude",
    Ycol = "decimalLatitude",
    verbose = FALSE
  )

  # Calculate single value using Fisher's combined probability
  combined <- -2 * (log(SDM_MARS@projection) + log(SDM_GLM@projection) + log(SDM_CTA@projection))
  Chi_sq <- function(x) {1 - pchisq(q = x, df = 6)}
  combined_pval <- raster::calc(combined, fun = Chi_sq)

  # Convert to spatial object
  species_distribution <- stars::st_as_stars(combined_pval)
  return(species_distribution)
}

Now we can use purrr::map() to run another loop to return results of our 12 SDMs.

# Run & save models
model_list <- purrr::map(month_list, run_sdm_model)

Map

We now have the results of our 12 SDMs in model_list. We can use these results to make 12 maps.

To help orient ourselves, let’s download point data of the main cities in Australia from the {maps} package.

city_names <- c("Sydney", "Melbourne", "Brisbane", "Cairns", 
                "Canberra", "Adelaide", "Melbourne", "Perth", "Darwin")

cities <- world.cities |>
  filter(country.etc == "Australia") |>
  filter(name %in% city_names)

Let’s also make a nice colour palette:

blue_yellow <- c( "#184E77", "#1E6091",  "#168AAD",  "#34A0A4",  "#52B69A", 
                  "#76C893", "#99D98C", "#B5E48C",  "#D9ED92")

colour_palette <- colorRampPalette(blue_yellow)(50)
feathers::print_pal(colour_palette)

Now we are ready to make maps of our results!

We’ll once again make a custom function make_the_map() to construct each map. This function not only constructs our maps, but adds each month’s 3-letter abbreviation month_label to the top of each map for our eventual animation.

For this, let’s make some month labels:

# Get label for each month
month_label <- month(1:12, label = TRUE)
month_label

 [1] Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

And now we’ll create our make_the_map() function:

# Map making function (for monthly SDM build this map)
make_the_map <- function(model_data, month_label) {

  month <- {{month_label}}
  
  ggplot() +
    geom_stars(data = model_data) +
    geom_sf(data = ozmaps::ozmap_states, 
            colour = "#A9A793", 
            fill = "#C8C6AF") +
    coord_sf(crs = "WGS84",
             xlim = c(112, 154), 
             ylim = c(-43, -11)) + 
    scale_fill_gradientn(colours = c(colour_palette),
                         limits = c(0, 1),
                         guide = guide_colourbar(
                           title = "Occurrence\nprobability",
                           title.theme = element_text(
                             family = "Times New Roman",
                             colour = "#3D4040",
                             size = 10,
                             face = "bold"),
                           label.theme = element_text(
                             colour = "#3D4040",
                             size = 8),
                           ticks = FALSE,
                           frame.colour = "#3D4040",
                           title.position = "top",
                           title.vjust = 2,
                           label.position = "left"),
                         breaks = c(0, 0.5, 1),
                         labels = c("0%", "50%", "100%")) +
    # Title map with month
    labs(title = glue("{month_label}")) + 
    theme_void() +
    theme(
      legend.position = c(1.2, 0.2), # reposition legend
      plot.title = element_text(size = 26),
      legend.direction = "vertical",
      plot.background = element_rect(fill = "#FFFFFF", colour = NA),
      plot.margin = unit(c(0.01, 2.5, 0.1, 0.1), "cm")) +
    # Add city points
    geom_point(data = cities, 
               aes(x = long, y = lat), 
               size = 2,
               colour = "#782717",
               fill = "white",
               shape = 21) +
    # Add city labels
    ggrepel::geom_text_repel(data = cities, 
              aes(x = long, y = lat, label = name), 
              colour = "white",
              nudge_x = .12, nudge_y = .1, 
              hjust = "inward",
              vjust = "inward",
              fontface = "bold",
              size = 6.8,
              family = "Times New Roman")
}

We can use map2() to make all of our maps:

# generate maps
all_maps <- model_list %>%
  map2(
    .x = .,
    .y = month_label,
    .f = make_the_map
  )

Now to save our maps. We’ll assign our plots a letter so they are ordered alphabetically and saved in order in a new folder called maps.

To figure out the best aspect ratio to save our maps, we’ll use the get_asp_ratio() function from the {tmaptools} package, and use it calculate the width of our plots. Finally, we can use purrr::walk2() to loop through ggsave() and save our maps.

# set names of plots to save
letter_id <- as.list(letters[1:12]) # saves in correct order for gif
plotnames <- purrr::map(letter_id, ~glue("maps/map_{.x}.png")) 

# save plots
plot_ratio <- get_asp_ratio(model_list[[1]]) # aspect ratio

walk2(plotnames, all_maps, ~ggsave(filename = .x, 
                                   plot = .y, 
                                   height = 9, 
                                   width = plot_ratio*10))

We should now have 12 species distribution maps saved, and we can see them by returning all the files in our maps folder.

map_files <- list.files(path = "maps/")
map_files

 [1] "map_a.png" "map_b.png" "map_c.png" "map_d.png" "map_e.png" "map_f.png"
 [7] "map_g.png" "map_h.png" "map_i.png" "map_j.png" "map_k.png" "map_l.png"

Make GIF

For context before we see our animation, let’s first look at the distribution of all nudibranchs across Australia (this uses all the same code as above, but without all the looping)

Code

## Run the 3 models on our data
SDM_GLM <- modelling("GLM",
                     Occurrences = occurrences_clean,
                     Env = aus_env,
                     Xcol = 'decimalLongitude', 
                     Ycol = 'decimalLatitude', 
                     verbose = FALSE) 
SDM_MARS <- modelling("MARS",
                      Occurrences = occurrences_clean,
                      Env = aus_env,
                      Xcol = 'decimalLongitude', 
                      Ycol = 'decimalLatitude', 
                      verbose = FALSE)
SDM_CTA <- modelling("CTA",
                     Occurrences = occurrences_clean,
                     Env = aus_env,
                     Xcol = 'decimalLongitude', 
                     Ycol = 'decimalLatitude', 
                     verbose = FALSE)

combined <- -2 * (log(SDM_MARS@projection) + log(SDM_GLM@projection) + log(SDM_CTA@projection))
Chi_sq <- function(x){1 - pchisq(q = x, df = 6)}
combined_pval <- raster::calc(combined, fun = Chi_sq) 
species_distribution <- stars::st_as_stars(combined_pval) 


## MAP 

ggplot() +
  geom_stars(data = species_distribution) + # Plot SDM results
  geom_sf(data = ozmaps::ozmap_country, # Add Australian map
          colour = "grey", 
          fill = "#C8C6AF") + 
  coord_sf(crs = "WGS84", # Set geographical boundaries
           xlim = c(112, 154), 
           ylim = c(-43, -11)) + 
  scale_fill_gradientn(
    colours = c(colour_palette), # Use custom palette
    limits = c(0, 1),
    guide = guide_colourbar(
      title = "Occurrence probability", # title of legend
      title.theme = element_text( # style legend title
        family = "Times New Roman", 
        colour = "#B3B6B6",
        face = "bold",
        size = 12),
      label.theme = element_text( # style legend text
        colour = "#B3B6B6", 
        size = 10),
      ticks = FALSE,
      frame.colour = "#B3B6B6",
      title.position = "top"),
    breaks = c(0, 0.5, 1),
    labels = c("0%", "50%", "100%")
  ) + 
  theme_void() +
  theme(
    legend.position = c(0.2, 0.1),
    legend.direction = "horizontal",
    legend.key.size = unit(5, "mm"),
    plot.background = element_rect(fill = "#F7F7F3", color = "#F16704"),
    panel.border = element_rect(color = "#FFFFFF", fill = NA, size = 2)
    )

Looks like there are nudibranchs along pretty much the entire coastline of Australia!

To finish our animation, let’s stick our 12 monthly maps together with the {gifski} package:

# Create animation
gifski(glue("maps/{map_files}"), gif_file = "SDM.gif", delay = 0.5, 
       width = ((plot_ratio*10)*96)*.8, height = (9*96)*.8) # correct ratios

knitr::include_graphics("SDM.gif")

We now have our animated GIF! Our animation shows that nudibranchs can be observed all year long, though there are some months where you are more likely to observe nudibranchs in more places than others.

However, when data are broken down into smaller and smaller groups (which often happens over the course of an entire analysis), we increase the chance of uncertainty in our results.

Uncertainty can grow when we use fewer observations to predict our distributions because with less information, our predictions are more strongly swayed by outliers. In our case, there are more observations of nudibranchs from October-January and fewer from May-August. Although it’s very possible uncertainty had an effect on the patterns we see in our final animation, you can’t tell from seeing our animation on its own!

Final thoughts

We hope you’ve felt the thrill of constructing your own stop-motion animation with {ggplot2} and {gifski}!

If you are interested in making animations of other types of plots, check out the {gganimate} package or the {plotly} package, too!

Expand for session info

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_Australia.utf8
 ctype    English_Australia.utf8
 tz       Australia/Sydney
 date     2024-02-12
 pandoc   3.1.1 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package       * version  date (UTC) lib source
 abind         * 1.4-5    2016-07-21 [1] CRAN (R 4.3.1)
 dplyr         * 1.1.4    2023-11-17 [1] CRAN (R 4.3.2)
 forcats       * 1.0.0    2023-01-29 [1] CRAN (R 4.3.2)
 galah         * 2.0.1    2024-02-06 [1] CRAN (R 4.3.2)
 ggplot2       * 3.4.4    2023-10-12 [1] CRAN (R 4.3.1)
 gifski        * 1.12.0-2 2023-08-12 [1] CRAN (R 4.3.2)
 glue          * 1.6.2    2022-02-24 [1] CRAN (R 4.3.2)
 htmltools     * 0.5.7    2023-11-03 [1] CRAN (R 4.3.2)
 knitr         * 1.45     2023-10-30 [1] CRAN (R 4.3.2)
 lubridate     * 1.9.3    2023-09-27 [1] CRAN (R 4.3.2)
 maps          * 3.4.1.1  2023-11-03 [1] CRAN (R 4.3.2)
 ozmaps        * 0.4.5    2021-08-03 [1] CRAN (R 4.3.2)
 purrr         * 1.0.2    2023-08-10 [1] CRAN (R 4.3.2)
 readr         * 2.1.4    2023-02-10 [1] CRAN (R 4.3.2)
 sdmpredictors * 0.2.15   2023-08-23 [1] CRAN (R 4.3.2)
 sessioninfo   * 1.2.2    2021-12-06 [1] CRAN (R 4.3.2)
 sf            * 1.0-14   2023-07-11 [1] CRAN (R 4.3.2)
 SSDM          * 0.2.9    2023-10-24 [1] CRAN (R 4.3.2)
 stars         * 0.6-4    2023-09-11 [1] CRAN (R 4.3.2)
 stringr       * 1.5.1    2023-11-14 [1] CRAN (R 4.3.2)
 tibble        * 3.2.1    2023-03-20 [1] CRAN (R 4.3.2)
 tidyr         * 1.3.0    2023-01-24 [1] CRAN (R 4.3.2)
 tidyverse     * 2.0.0    2023-02-22 [1] CRAN (R 4.3.2)
 tmaptools     * 3.1-1    2021-01-19 [1] CRAN (R 4.3.2)

 [1] C:/Users/KEL329/R-packages
 [2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.3.2/library

──────────────────────────────────────────────────────────────────────────────