Knowing what species inhabit an area is important for conservation and ecosystem management. In particular, it can help us find how many known species are in a given area, and whether any species are vulnerable or endangered.
In this post, we will present two options, one using the galah package, the other using an external shapefile and list. Using either workflow, we will show you how to download a list of species within a Local Government Area (Shoalhaven, NSW), cross-reference this list with a state conservation status list, and visualise the number of threatened species in the region with waffle and ggplot2.
Let’s first load our packages. To download species lists, you will also need to provide an email address registered with the ALA using galah_config().
library(tidyverse)
library(readxl)
library(sf)
library(rmapshaper)
library(here)
library(pilot) # remotes::install_github("olihawkins/pilot")
library(showtext)
library(galah)
galah_config(email = "your-email-here") # ALA-registered email
Download threatened species in an area
Choose which method you would like to view:
- galah (using fields downloaded from the Atlas of Living Australia)
- Downloaded shapefile + species list
The method you choose depends on whether the region or list you wish to return species for is already in galah, or whether you wish to filter for a more specific area defined by a separate shapefile or list. Keep in mind that using an external list may require additional work matching taxonomic names.
Search for fields
To find what exists in galah to help us narrow our query, we can use search_all() to search for available fields. A field in galah refers to a column or layer stored in a living atlas. Let’s do a text search to find what fields contain information on “Local Government Areas”.
search_all(fields, "Local Government Areas")
# A tibble: 4 × 3
  id       description                                      type  
  <chr>    <chr>                                            <chr> 
1 cl10923  PSMA Local Government Areas (2018)               fields
2 cl110923 PSMA Local Government Areas - Abbreviated (2018) fields
3 cl11170  Local Government Areas 2023                      fields
4 cl959    Local Government Areas                           fields
The field cl11170 contains the most recent available data (from 2023). We can preview the values within this field using show_values().
search_all(fields, "cl11170") |>
  show_values()
• Showing values for 'cl11170'.
# A tibble: 547 × 1
   cl11170           
   <chr>             
 1 Unincorporated ACT
 2 Brisbane          
 3 Greater Geelong   
 4 East Gippsland    
 5 Moreton Bay       
 6 Unincorporated SA 
 7 Yarra Ranges      
 8 Sunshine Coast    
 9 Cairns            
10 Mareeba           
# ℹ 537 more rows
There are lots of Local Government Areas! To check whether Shoalhaven is included, we can do a text search for values that match “shoalhaven”.
search_all(fields, "cl11170") |>
  search_values("shoalhaven")
• Showing values for 'cl11170'.
# A tibble: 1 × 1
  cl11170   
  <chr>     
1 Shoalhaven
Download data
Using the field and value returned above, we can now build our query. We begin our query with galah_call() and filter to only Shoalhaven in the year 2024. Ending our query with atlas_species() will return a list of species.
species_shoal <- galah_call() |>
  filter(cl11170 == "Shoalhaven",
         year == 2024) |>
  atlas_species()
species_shoal
# A tibble: 2,936 × 11
   taxon_concept_id       species_name scientific_name_auth…¹ taxon_rank kingdom
   <chr>                  <chr>        <chr>                  <chr>      <chr>  
 1 https://biodiversity.… Gymnorhina … (Latham, 1801)         species    Animal…
 2 https://biodiversity.… Malurus (Ma… (Ellis, 1782)          species    Animal…
 3 https://biodiversity.… Vanellus (L… (Boddaert, 1783)       species    Animal…
 4 https://biodiversity.… Macropus gi… Shaw, 1790             species    Animal…
 5 https://biodiversity.… Corvus coro… Vigors & Horsfield, 1… species    Animal…
 6 https://biodiversity.… Anthochaera… (Latham, 1801)         species    Animal…
 7 https://biodiversity.… Dacelo (Dac… (Hermann, 1783)        species    Animal…
 8 https://biodiversity.… Potorous tr… (Kerr, 1792)           species    Animal…
 9 https://biodiversity.… Chroicoceph… (Stephens, 1826)       species    Animal…
10 https://biodiversity.… Grallina cy… (Latham, 1801)         species    Animal…
# ℹ 2,926 more rows
# ℹ abbreviated name: ¹scientific_name_authorship
# ℹ 6 more variables: phylum <chr>, class <chr>, order <chr>, family <chr>,
#   genus <chr>, vernacular_name <chr>
atlas_species() returns taxonomic information at the species level (for more info, see the note below). To make sure we return taxonomic information at the lowest level each occurrence was identified, we’ll group_by(taxonConceptID), which is a unique ID attached to each occurrence record’s taxonomic identification (read the note below for more on what this means).
By default atlas_species() only returns taxonomic information at the species level. This means that if some species are identified to subspecies on a specific list like the NSW Conservation Status list, atlas_species() will return the species-level match, rather than the subspecies-level match. For example, the name "Potorous tridactylus" is returned instead of "Potorous tridactylus tridactylus".
Grouping by taxonConceptID like we do below specifies that we wish to match to the identified taxon, rather than only to the species level.
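To make the point in the note concrete, here is a minimal offline sketch using toy data. The two tibbles and their column names (toy_list, toy_records, species_level, identified_as) are hypothetical and only illustrate why a species-level name can fail to match a list entry recorded at subspecies rank; this is not how galah matches names internally.

```r
library(dplyr)

# Toy threatened list (hypothetical): one entry is listed at subspecies rank
toy_list <- tibble(
  scientificName = c("Potorous tridactylus trisulcatus", "Dasyurus maculatus")
)

# The same two taxa, named at species level vs. at the level they were identified
toy_records <- tibble(
  species_level = c("Potorous tridactylus", "Dasyurus maculatus"),
  identified_as = c("Potorous tridactylus trisulcatus", "Dasyurus maculatus")
)

# Compare which naming level finds an exact match on the list
matches <- toy_records |>
  mutate(
    match_species_level = species_level %in% toy_list$scientificName,
    match_identified    = identified_as %in% toy_list$scientificName
  )

matches
```

In this toy case the species-level name misses the subspecies entry, while the name at the identified level matches both rows.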
species_shoal <- galah_call() |>
  filter(cl11170 == "Shoalhaven",
         year == 2024) |>
  group_by(taxonConceptID) |>
  atlas_species()
species_shoal
# A tibble: 4,282 × 11
   taxon_concept_id       species_name scientific_name_auth…¹ taxon_rank kingdom
   <chr>                  <chr>        <chr>                  <chr>      <chr>  
 1 https://biodiversity.… Gymnorhina … (Latham, 1801)         species    Animal…
 2 https://biodiversity.… Malurus (Ma… (Ellis, 1782)          species    Animal…
 3 https://biodiversity.… Macropus gi… Shaw, 1790             species    Animal…
 4 https://biodiversity.… Corvus coro… Vigors & Horsfield, 1… species    Animal…
 5 https://biodiversity.… Trichogloss… Stephens, 1826         genus      Animal…
 6 https://biodiversity.… Vanellus (L… (Boddaert, 1783)       species    Animal…
 7 https://biodiversity.… Anthochaera… (Latham, 1801)         species    Animal…
 8 https://biodiversity.… Dacelo (Dac… (Hermann, 1783)        species    Animal…
 9 https://biodiversity.… Potorous tr… (McCoy, 1865)          subspecies Animal…
10 https://biodiversity.… Chroicoceph… (Stephens, 1826)       species    Animal…
# ℹ 4,272 more rows
# ℹ abbreviated name: ¹scientific_name_authorship
# ℹ 6 more variables: phylum <chr>, class <chr>, order <chr>, family <chr>,
#   genus <chr>, vernacular_name <chr>
It’s also possible to return the number of observations by ending our query with atlas_counts(). In this case, we can group by scientificName (the name at the lowest taxonomic level to which the observation was identified).
galah_call() |>
  filter(cl11170 == "Shoalhaven",
         year == 2024) |>
  group_by(scientificName) |>
  atlas_counts()
# A tibble: 4,282 × 2
   scientificName                      count
   <chr>                               <int>
 1 Gymnorhina tibicen                    934
 2 Malurus (Malurus) cyaneus             926
 3 Macropus giganteus                    888
 4 Corvus coronoides                     862
 5 Trichoglossus                         846
 6 Vanellus (Lobipluvia) miles           845
 7 Anthochaera (Anellobia) chrysoptera   820
 8 Dacelo (Dacelo) novaeguineae          817
 9 Potorous tridactylus trisulcatus      806
10 Chroicocephalus novaehollandiae       789
# ℹ 4,272 more rows
Cross-reference with threatened species lists
Next we’ll compare our Shoalhaven species list species_shoal with a state-wide conservation status list. We can use galah to access lists that are available on the Atlas of Living Australia. Shoalhaven is within the state of New South Wales, so let’s search for “New South Wales” to see what state-specific lists are available.
search_all(lists, "New South Wales")
# A tibble: 2 × 22
  species_list_uid listName         description listType dateCreated lastUpdated
  <chr>            <chr>            <chr>       <chr>    <chr>       <chr>      
1 dr650            New South Wales… "Classific… CONSERV… 2015-04-04… 2025-07-08…
2 dr487            New South Wales… "The NSW G… SENSITI… 2013-06-20… 2025-07-08…
# ℹ 16 more variables: lastUploaded <chr>, lastMatched <chr>, username <chr>,
#   itemCount <int>, region <chr>, isAuthoritative <lgl>, isInvasive <lgl>,
#   isThreatened <lgl>, isBIE <lgl>, isSDS <lgl>, wkt <chr>, category <chr>,
#   generalisation <chr>, authority <chr>, sdsType <chr>, looseSearch <lgl>
Two lists are returned, and both appear relevant. Some additional columns returned by search_all() (listType, isAuthoritative and isThreatened) can help us decide which list best suits our needs. Although both lists are authoritative, only one (dr650) contains threatened species, whereas the other (dr487) contains sensitive species.
search_all(lists, "New South Wales") |>
  select(species_list_uid, listType, isAuthoritative, isThreatened)
# A tibble: 2 × 4
  species_list_uid listType          isAuthoritative isThreatened
  <chr>            <chr>             <lgl>           <lgl>       
1 dr650            CONSERVATION_LIST TRUE            TRUE        
2 dr487            SENSITIVE_LIST    TRUE            FALSE
By specifying the ID dr650 and using show_values(), we can view the complete New South Wales threatened species list.
search_all(lists, "dr650") |> 
  show_values()
• Showing values for 'dr650'.
# A tibble: 1,064 × 6
        id name                  commonName scientificName lsid  dataResourceUid
     <int> <chr>                 <chr>      <chr>          <chr> <chr>          
 1 6791272 Delma impar           Striped L… Delma impar    http… dr650          
 2 6790725 Callocephalon fimbri… Gang-gang… Callocephalon… http… dr650          
 3 6790769 Cacophis harriettae   White-cro… Cacophis harr… http… dr650          
 4 6791482 Litoria booroolongen… Booroolon… Litoria booro… http… dr650          
 5 6790526 Anthochaera phrygia   Regent Ho… Anthochaera (… http… dr650          
 6 6791456 Calidris tenuirostris Great Knot Calidris (Cal… http… dr650          
 7 6790500 Neochmia ruficauda    Star Finch Neochmia (Neo… http… dr650          
 8 6790752 Uvidicolus sphyrurus  Border Th… Uvidicolus sp… http… dr650          
 9 6791291 Amaurornis moluccana  Pale-vent… Amaurornis mo… http… dr650          
10 6791135 Phascogale tapoatafa  Brush-tai… Phascogale ta… http… dr650          
# ℹ 1,054 more rows
As of galah version 2.1.2, we can also use show_values() to add conservation status columns to our species list. By adding the argument all_fields = TRUE, we can add any columns stored in the ALA from the original list. For conservation lists, this includes columns like status, sourceStatus and IUCN_equivalent_status.
nsw_threatened <- search_all(lists, "dr650") |>
  show_values(all_fields = TRUE)
nsw_threatened |>
  # reposition cols
  select(status, sourceStatus, IUCN_equivalent_status, 
         scientificName, everything())
# A tibble: 1,064 × 13
   status        sourceStatus IUCN_equivalent_status scientificName     id name 
   <chr>         <chr>        <chr>                  <chr>           <int> <chr>
 1 Vulnerable    Vulnerable   Vulnerable             Delma impar    6.79e6 Delm…
 2 Endangered    Endangered   Endangered             Callocephalon… 6.79e6 Call…
 3 Vulnerable    Vulnerable   Vulnerable             Cacophis harr… 6.79e6 Caco…
 4 Endangered    Endangered   Endangered             Litoria booro… 6.79e6 Lito…
 5 Critically E… Critically … Critically Endangered  Anthochaera (… 6.79e6 Anth…
 6 Vulnerable    Vulnerable   Vulnerable             Calidris (Cal… 6.79e6 Cali…
 7 Extinct       Extinct      Extinct                Neochmia (Neo… 6.79e6 Neoc…
 8 Vulnerable    Vulnerable   Vulnerable             Uvidicolus sp… 6.79e6 Uvid…
 9 Vulnerable    Vulnerable   Vulnerable             Amaurornis mo… 6.79e6 Amau…
10 Vulnerable    Vulnerable   Vulnerable             Phascogale ta… 6.79e6 Phas…
# ℹ 1,054 more rows
# ℹ 7 more variables: commonName <chr>, lsid <chr>, dataResourceUid <chr>,
#   raw_scientificName <chr>, vernacularName <chr>, rank <chr>, family <chr>
Adding status info can be handy if we want to join this with other information like record counts.
# get record counts for each species on the NSW Conservation Status list
threatened_counts <- galah_call() |>
  galah_filter(species_list_uid == dr650,
               cl11170 == "Shoalhaven",
               year == 2024) |>
  galah_group_by(scientificName) |>
  atlas_counts()
threatened_counts
# A tibble: 94 × 2
   scientificName                                    count
   <chr>                                             <int>
 1 Potorous tridactylus trisulcatus                    806
 2 Haematopus longirostris                             542
 3 Haliaeetus (Pontoaetus) leucogaster                 343
 4 Haematopus fuliginosus                              221
 5 Sternula albifrons                                  208
 6 Numenius (Numenius) madagascariensis                197
 7 Calyptorhynchus (Calyptorhynchus) lathami lathami   124
 8 Dasyurus maculatus                                  111
 9 Callocephalon fimbriatum                             93
10 Esacus magnirostris                                  63
# ℹ 84 more rows
# join counts to status information
threatened_counts_joined <-
  threatened_counts |> 
  left_join(nsw_threatened,
            join_by(scientificName == scientificName)) |>
  # reposition cols
  select(scientificName, count, status, commonName, everything())
threatened_counts_joined
# A tibble: 94 × 14
   scientificName     count status commonName     id name  lsid  dataResourceUid
   <chr>              <int> <chr>  <chr>       <int> <chr> <chr> <chr>          
 1 Potorous tridacty…   806 Vulne… Long-nose… 6.79e6 Poto… http… dr650          
 2 Haematopus longir…   542 Endan… Australia… 6.79e6 Haem… http… dr650          
 3 Haliaeetus (Ponto…   343 Vulne… White-bel… 6.79e6 Hali… http… dr650          
 4 Haematopus fuligi…   221 Vulne… Sooty Oys… 6.79e6 Haem… http… dr650          
 5 Sternula albifrons   208 Endan… Little Te… 6.79e6 Ster… http… dr650          
 6 Numenius (Numeniu…   197 Criti… Eastern C… 6.79e6 Nume… http… dr650          
 7 Calyptorhynchus (…   124 Vulne… South-eas… 6.79e6 Caly… http… dr650          
 8 Dasyurus maculatus   111 Vulne… Bindjulang 6.79e6 Dasy… http… dr650          
 9 Callocephalon fim…    93 Endan… Gang-gang… 6.79e6 Call… http… dr650          
10 Esacus magnirostr…    63 Criti… Beach Sto… 6.79e6 Esac… http… dr650          
# ℹ 84 more rows
# ℹ 6 more variables: raw_scientificName <chr>, vernacularName <chr>,
#   rank <chr>, family <chr>, sourceStatus <chr>, IUCN_equivalent_status <chr>
To find which species on the New South Wales Conservation Status List (dr650) were recorded in Shoalhaven in 2024, we can add species_list_uid == dr650 as a filter to a query ending with atlas_species(). To make sure we return taxonomic information at the lowest level each occurrence was identified, we’ll group_by(taxonConceptID).
threatened <- galah_call() |>
  galah_filter(cl11170 == "Shoalhaven",
               year == 2024,
               species_list_uid == dr650) |>
  group_by(taxonConceptID) |>
  atlas_species()
threatened
# A tibble: 94 × 11
   taxon_concept_id       species_name scientific_name_auth…¹ taxon_rank kingdom
   <chr>                  <chr>        <chr>                  <chr>      <chr>  
 1 https://biodiversity.… Potorous tr… (McCoy, 1865)          subspecies Animal…
 2 https://biodiversity.… Haematopus … Vieillot, 1817         species    Animal…
 3 https://biodiversity.… Haliaeetus … (Gmelin, 1788)         species    Animal…
 4 https://biodiversity.… Haematopus … Gould, 1845            species    Animal…
 5 https://biodiversity.… Sternula al… (Pallas, 1764)         species    Animal…
 6 https://biodiversity.… Numenius (N… (Linnaeus, 1766)       species    Animal…
 7 https://biodiversity.… Calyptorhyn… (Temminck, 1807)       subspecies Animal…
 8 https://biodiversity.… Dasyurus ma… (Kerr, 1792)           species    Animal…
 9 https://biodiversity.… Callocephal… (Grant, 1803)          species    Animal…
10 https://biodiversity.… Esacus magn… Vieillot, 1818         species    Animal…
# ℹ 84 more rows
# ℹ abbreviated name: ¹scientific_name_authorship
# ℹ 6 more variables: phylum <chr>, class <chr>, order <chr>, family <chr>,
#   genus <chr>, vernacular_name <chr>
Note that status information is not included in the query above, but it can be joined in the same way we added status information to threatened_counts.
# select status columns, join status information
threatened_status <- 
  nsw_threatened |>
  select(scientificName, status, sourceStatus, IUCN_equivalent_status) |> 
  right_join(threatened,
            join_by(scientificName == species_name)) |>
  # reposition cols
  select(scientificName, status, sourceStatus, everything())
threatened_status
# A tibble: 94 × 14
   scientificName    status sourceStatus IUCN_equivalent_status taxon_concept_id
   <chr>             <chr>  <chr>        <chr>                  <chr>           
 1 Callocephalon fi… Endan… Endangered   Endangered             https://biodive…
 2 Ninox (Rhabdogla… Vulne… Vulnerable   Vulnerable             https://biodive…
 3 Limosa limosa     Vulne… Vulnerable   Vulnerable             https://biodive…
 4 Numenius (Numeni… Criti… Critically … Critically Endangered  https://biodive…
 5 Hirundapus cauda… Vulne… Vulnerable   Vulnerable             https://biodive…
 6 Haliaeetus (Pont… Vulne… Vulnerable   Vulnerable             https://biodive…
 7 Tyto tenebricosa  Vulne… Vulnerable   Vulnerable             https://biodive…
 8 Chalinolobus dwy… Endan… Endangered   Endangered             https://biodive…
 9 Ixobrychus flavi… Vulne… Vulnerable   Vulnerable             https://biodive…
10 Hoplocephalus bu… Endan… Endangered   Endangered             https://biodive…
# ℹ 84 more rows
# ℹ 9 more variables: scientific_name_authorship <chr>, taxon_rank <chr>,
#   kingdom <chr>, phylum <chr>, class <chr>, order <chr>, family <chr>,
#   genus <chr>, vernacular_name <chr>
Download shapefile
To retrieve the spatial outline of Shoalhaven, let’s download the latest Local Government Areas data from the Australian Bureau of Statistics Digital Boundary files page. Find “Local Government Areas - 2023 - Shapefile” and click “Download ZIP”. Save the zip folder in your current directory and unzip it.
Let’s read the file into R. We will also simplify the shapefile using ms_simplify() from the rmapshaper package because complex shapefiles can sometimes cause problems when sending queries to the ALA.
lga <- sf::st_read(here("LGA_2023_AUST_GDA2020.shp")) |>
  rmapshaper::ms_simplify(keep = 0.01)
lga
Simple feature collection with 544 features and 8 fields
Geometry type: GEOMETRY
Dimension:     XY
Bounding box:  xmin: 105.5335 ymin: -43.6331 xmax: 167.9969 ymax: -9.229273
Geodetic CRS:  GDA2020
First 10 features:
   LGA_CODE23    LGA_NAME23 AUS_CODE21 STE_CODE21      STE_NAME21   AREASQKM
1       10050        Albury        AUS          1 New South Wales   305.6386
2       10180      Armidale        AUS          1 New South Wales  7809.4406
3       10250       Ballina        AUS          1 New South Wales   484.9692
4       10300     Balranald        AUS          1 New South Wales 21690.7493
5       10470      Bathurst        AUS          1 New South Wales  3817.8645
6       10500 Bayside (NSW)        AUS          1 New South Wales    50.6204
7       10550   Bega Valley        AUS          1 New South Wales  6278.5013
8       10600     Bellingen        AUS          1 New South Wales  1600.4338
9       10650      Berrigan        AUS          1 New South Wales  2065.8878
10      10750     Blacktown        AUS          1 New South Wales   238.8471
   AUS_NAME21                                               LOCI_URI21
1   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10050
2   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10180
3   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10250
4   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10300
5   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10470
6   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10500
7   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10550
8   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10600
9   Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10650
10  Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/10750
                         geometry
1  POLYGON ((146.8177 -36.0673...
2  POLYGON ((152.2957 -30.9310...
3  POLYGON ((153.4496 -28.7550...
4  POLYGON ((143.5525 -33.1404...
5  POLYGON ((149.3947 -33.9975...
6  POLYGON ((151.155 -33.92618...
7  POLYGON ((149.9762 -37.5051...
8  POLYGON ((152.8035 -30.1895...
9  POLYGON ((145.4845 -35.5119...
10 POLYGON ((150.8129 -33.8223...
Now let’s transform our shapefile to use the Coordinate Reference System (CRS) EPSG:4326 (the standard used in cartography and GPS, also known as WGS84) so that it matches the projection of our data from the ALA.
lga <- lga |>
  st_transform(crs = 4326)
Next we’ll filter our shapefile to Shoalhaven. The column LGA_NAME23 contains area names, and we can filter our data frame to only rows where LGA_NAME23 is equal to Shoalhaven. We are left with a single polygon shape of Shoalhaven.
shoalhaven_sf <- lga |>
  filter(LGA_NAME23 == "Shoalhaven")
shoalhaven_sf
Simple feature collection with 1 feature and 8 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 149.9774 ymin: -35.64458 xmax: 150.8494 ymax: -34.65044
Geodetic CRS:  WGS 84
  LGA_CODE23 LGA_NAME23 AUS_CODE21 STE_CODE21      STE_NAME21 AREASQKM
1      16950 Shoalhaven        AUS          1 New South Wales 4567.201
  AUS_NAME21                                               LOCI_URI21
1  Australia https://linked.data.gov.au/dataset/asgsed3/LGA2023/16950
                        geometry
1 POLYGON ((150.7813 -34.7921...
Download data
Now that shoalhaven_sf contains our LGA shape, we can build our query. Once again, we’ll begin with galah_call() and filter to only records from 2024. We can specify that we want records within shoalhaven_sf using geolocate(). To make sure we return taxonomic information at the level occurrences were identified to, we’ll group_by(taxonConceptID), which is a unique ID attached to each occurrence record’s taxonomic identification (read the box below for more on what this means). Finally, we can return a species list by ending our query with atlas_species().
By default atlas_species() only returns taxonomic information at the species level. This means that if some species are identified to subspecies on a specific list like the NSW Conservation Status list, atlas_species() will return the species-level match, rather than the subspecies-level match. For example, the name "Potorous tridactylus" is returned instead of "Potorous tridactylus tridactylus".
Grouping by taxonConceptID like we do below specifies that we wish to match to the identified taxon, rather than only to the species level.
species_shoal <- galah_call() |>
  filter(year == 2024) |>
  geolocate(shoalhaven_sf) |>
  group_by(taxonConceptID) |>
  atlas_species()
species_shoal
# A tibble: 4,459 × 11
   taxon_concept_id       species_name scientific_name_auth…¹ taxon_rank kingdom
   <chr>                  <chr>        <chr>                  <chr>      <chr>  
 1 https://biodiversity.… Malurus (Ma… (Ellis, 1782)          species    Animal…
 2 https://biodiversity.… Gymnorhina … (Latham, 1801)         species    Animal…
 3 https://biodiversity.… Macropus gi… Shaw, 1790             species    Animal…
 4 https://biodiversity.… Corvus coro… Vigors & Horsfield, 1… species    Animal…
 5 https://biodiversity.… Vanellus (L… (Boddaert, 1783)       species    Animal…
 6 https://biodiversity.… Trichogloss… Stephens, 1826         genus      Animal…
 7 https://biodiversity.… Potorous tr… (McCoy, 1865)          subspecies Animal…
 8 https://biodiversity.… Anthochaera… (Latham, 1801)         species    Animal…
 9 https://biodiversity.… Dacelo (Dac… (Hermann, 1783)        species    Animal…
10 https://biodiversity.… Chroicoceph… (Stephens, 1826)       species    Animal…
# ℹ 4,449 more rows
# ℹ abbreviated name: ¹scientific_name_authorship
# ℹ 6 more variables: phylum <chr>, class <chr>, order <chr>, family <chr>,
#   genus <chr>, vernacular_name <chr>
It’s also possible to return observation counts by ending our query with atlas_counts(). In this case, we can group by scientificName (the name at the lowest taxonomic level to which the observation was identified).
galah_call() |>
  filter(year == 2024) |>
  geolocate(shoalhaven_sf) |>
  group_by(scientificName) |>
  atlas_counts()
# A tibble: 4,459 × 2
   scientificName                      count
   <chr>                               <int>
 1 Malurus (Malurus) cyaneus             931
 2 Gymnorhina tibicen                    923
 3 Macropus giganteus                    889
 4 Corvus coronoides                     856
 5 Vanellus (Lobipluvia) miles           847
 6 Trichoglossus                         845
 7 Potorous tridactylus trisulcatus      818
 8 Anthochaera (Anellobia) chrysoptera   815
 9 Dacelo (Dacelo) novaeguineae          801
10 Chroicocephalus novaehollandiae       795
# ℹ 4,449 more rows
Use external list
We can use our own conservation status lists from an external source to compare to our Shoalhaven species list. As an example, we are using the New South Wales Conservation Status List downloaded from the NSW BioNet Atlas website.
We downloaded this list on 2025/07/08. To download a complete NSW threatened species list, we selected the following options:
- Which species or group? All entities
- Legal status? Select records that fall under one or more categories ➝ Threatened NSW
- What area? Entire area
- Period of records? All records
- Status? Valid records only
Save the downloaded .xls file in your working directory. We’ll read in our .xls file, which we have renamed to nsw_threatened.xls.
nsw_threatened_list <- readxl::read_excel(here("path", "to", "nsw_threatened.xls"), 
                                      skip = 3) # skip first 3 rows
It’s possible you might receive the following error.
Error:
  filepath: [path-to-file]
  libxls error: Unable to open file
This error comes from a formatting issue that prevents read_excel() from reading the file, which appears to be related to the way BioNet saves its files. To fix this issue, open the list file on your computer, then re-save the file as a .xlsx document (File ➝ Save As ➝ select file format .xlsx ➝ Save). Then you can use read_excel() to read the new file in.
nsw_threatened_list <- readxl::read_excel(here("path", "to", "nsw_threatened.xlsx"), 
                                      skip = 3) # skip first 3 rows
Cross-reference with threatened species lists
First we’ll clean the column names to make them easier to use in R using the amazing function janitor::clean_names(). We also need to remove the ^ that precedes some names on the list.
nsw_threatened_list <- nsw_threatened_list |>
  janitor::clean_names() |>
  mutate(
    scientific_name = stringr::str_remove_all(scientific_name, "\\^")
  )
nsw_threatened_list
# A tibble: 1,218 × 11
   kingdom  class    family      species_code scientific_name exotic common_name
   <chr>    <chr>    <chr>       <chr>        <chr>           <lgl>  <chr>      
 1 Animalia Amphibia Myobatrach… 3007         Assa darlingto… NA     Pouched Fr…
 2 Animalia Amphibia Myobatrach… 3135         Crinia sloanei  NA     Sloane's F…
 3 Animalia Amphibia Myobatrach… 3137         Crinia tinnula  NA     Wallum Fro…
 4 Animalia Amphibia Myobatrach… 3073         Mixophyes balb… NA     Stuttering…
 5 Animalia Amphibia Myobatrach… 3008         Mixophyes flea… NA     Fleay's Ba…
 6 Animalia Amphibia Myobatrach… 3075         Mixophyes iter… NA     Giant Barr…
 7 Animalia Amphibia Myobatrach… 3116         Pseudophryne a… NA     Red-crowne…
 8 Animalia Amphibia Myobatrach… 3119         Pseudophryne c… NA     Southern C…
 9 Animalia Amphibia Myobatrach… 3306         Pseudophryne p… NA     Northern C…
10 Animalia Amphibia Myobatrach… 3932         Uperoleia maho… NA     Mahony's T…
# ℹ 1,208 more rows
# ℹ 4 more variables: nsw_status <chr>, comm_status <chr>, records <chr>,
#   info <lgl>
Now we can filter our Shoalhaven list to only those species whose names match names in nsw_threatened_list.
threatened_filter <- species_shoal |>
  filter(species_name %in% nsw_threatened_list$scientific_name)
threatened_filter
# A tibble: 83 × 11
   taxon_concept_id       species_name scientific_name_auth…¹ taxon_rank kingdom
   <chr>                  <chr>        <chr>                  <chr>      <chr>  
 1 https://biodiversity.… Potorous tr… (McCoy, 1865)          subspecies Animal…
 2 https://biodiversity.… Haematopus … Vieillot, 1817         species    Animal…
 3 https://biodiversity.… Perameles n… Geoffroy, 1804         species    Animal…
 4 https://biodiversity.… Haematopus … Gould, 1845            species    Animal…
 5 https://biodiversity.… Sternula al… (Pallas, 1764)         species    Animal…
 6 https://biodiversity.… Dasyurus ma… (Kerr, 1792)           species    Animal…
 7 https://biodiversity.… Callocephal… (Grant, 1803)          species    Animal…
 8 https://biodiversity.… Esacus magn… Vieillot, 1818         species    Animal…
 9 https://biodiversity.… Tyto novaeh… (Stephens, 1826)       species    Animal…
10 https://biodiversity.… Hirundapus … (Latham, 1801)         species    Animal…
# ℹ 73 more rows
# ℹ abbreviated name: ¹scientific_name_authorship
# ℹ 6 more variables: phylum <chr>, class <chr>, order <chr>, family <chr>,
#   genus <chr>, vernacular_name <chr>
To preserve status information, we can instead join the species_shoal and nsw_threatened_list dataframes, which retains the status columns while still filtering results.
threatened_joined <- species_shoal |>
  left_join(
    nsw_threatened_list |>
      select(scientific_name, common_name, nsw_status, comm_status),
    join_by(species_name == scientific_name)
  ) |>
  filter(!is.na(nsw_status))
threatened_joined
# A tibble: 87 × 14
   taxon_concept_id       species_name scientific_name_auth…¹ taxon_rank kingdom
   <chr>                  <chr>        <chr>                  <chr>      <chr>  
 1 https://biodiversity.… Potorous tr… (McCoy, 1865)          subspecies Animal…
 2 https://biodiversity.… Haematopus … Vieillot, 1817         species    Animal…
 3 https://biodiversity.… Perameles n… Geoffroy, 1804         species    Animal…
 4 https://biodiversity.… Perameles n… Geoffroy, 1804         species    Animal…
 5 https://biodiversity.… Haematopus … Gould, 1845            species    Animal…
 6 https://biodiversity.… Sternula al… (Pallas, 1764)         species    Animal…
 7 https://biodiversity.… Dasyurus ma… (Kerr, 1792)           species    Animal…
 8 https://biodiversity.… Callocephal… (Grant, 1803)          species    Animal…
 9 https://biodiversity.… Esacus magn… Vieillot, 1818         species    Animal…
10 https://biodiversity.… Tyto novaeh… (Stephens, 1826)       species    Animal…
# ℹ 77 more rows
# ℹ abbreviated name: ¹scientific_name_authorship
# ℹ 9 more variables: phylum <chr>, class <chr>, order <chr>, family <chr>,
#   genus <chr>, vernacular_name <chr>, common_name <chr>, nsw_status <chr>,
#   comm_status <chr>
Species lists from BioNet Atlas can include both a species and specific populations of that species, each with its own conservation status. When matching species names, this means there can be multiple matches for the same species. For example, separate conservation statuses are assigned to the Yellow-bellied Glider and to a Yellow-bellied Glider population on the Bago Plateau.
threatened_joined |>
  filter(species_name == "Petaurus australis") |>
  select(species_name, common_name, nsw_status)
# A tibble: 2 × 3
  species_name       common_name                                      nsw_status
  <chr>              <chr>                                            <chr>     
1 Petaurus australis Yellow-bellied Glider                            V,P       
2 Petaurus australis Yellow-bellied Glider population on the Bago Pl… E2,V,P    
These multiple statuses explain why there are several more rows when we join dataframes (threatened_joined) compared to when we filter by species names (threatened_filter).
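To check how many species are affected by multiple matches, one option is to count rows per species name and keep those that appear more than once. The sketch below uses a small toy table (joined_toy, a hypothetical stand-in for threatened_joined) built from status codes shown in the outputs above; it is an illustration of the idea rather than part of the original workflow.

```r
library(dplyr)
library(tibble)

# Toy stand-in for threatened_joined (hypothetical object name);
# species names and status codes are copied from the outputs above
joined_toy <- tribble(
  ~species_name,             ~nsw_status,
  "Petaurus australis",      "V,P",
  "Petaurus australis",      "E2,V,P",
  "Haematopus longirostris", "E1,P"
)

# Count rows per species and keep species matched more than once
multi_match <- joined_toy |>
  count(species_name, name = "n_matches") |>
  filter(n_matches > 1)

multi_match

# Compare the number of distinct species with the number of rows
n_distinct(joined_toy$species_name)
nrow(joined_toy)
```

Running the same two steps on threatened_joined should show which species account for the extra rows.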
You might notice that fewer species are returned when using an externally downloaded list than when using galah. This discrepancy is due to differences in scientific names between those on the BioNet Atlas and those on the ALA. Name mismatches are a risk when using external species lists, and additional work is usually needed to avoid unexpected mismatches. The Cleaning Biodiversity Data in R book details some methods for finding name synonyms, but amending taxonomic names can be difficult.
When the ALA ingests data, it matches those data to the ALA’s taxonomic backbone, with the goal of minimising name mismatches. We recommend using galah because it makes name matching easier. However, not all lists exist on the ALA, so some tasks inevitably require matching to externally downloaded lists.
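As a rough first check for name mismatches, an anti-join lists the names on an external list that have no exact match among the ALA names. The sketch below uses placeholder taxon names ("Aus bus" and so on) and hypothetical objects ala_names and external_list; none of these are real data. Keep in mind that on a real list most unmatched names will simply be species not recorded in the area, so the result is a set of candidates to inspect, not a list of confirmed mismatches.

```r
library(dplyr)
library(tibble)

# Placeholder names standing in for ALA-matched species in the area
ala_names <- tibble(
  species_name = c("Aus bus", "Aus (Aus) cus", "Dus eus")
)

# Placeholder external list; "Aus cus" differs from the ALA form only
# by the subgenus in brackets
external_list <- tibble(
  scientific_name = c("Aus bus", "Aus cus", "Fus gus"),
  nsw_status      = c("V,P", "E1,P", "V,P")
)

# Rows of the external list with no exact name match in the ALA names
unmatched <- external_list |>
  anti_join(ala_names, join_by(scientific_name == species_name))

unmatched
```

Here both "Aus cus" and "Fus gus" are returned, but only the first is a formatting mismatch; telling the two cases apart still needs manual checking or a synonym lookup.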
To use this list for summarising or plotting, it can be useful to add a column to threatened_joined that classifies each species as vulnerable, endangered, critically endangered or extinct. To add this information, we’ll extract the first value of nsw_status by removing everything after the first comma and save that value in nsw_status_extracted. Then we’ll recode these values⁶ and save them in nsw_status_simple.
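To see the extraction step in isolation, here is a minimal sketch that applies the same regular expression to a few status strings copied from the outputs in this post. The vector names codes and first_code are hypothetical and used only for illustration.

```r
library(stringr)

# Status strings as they appear in the nsw_status column
codes <- c("V,P", "E1,P,3", "E4A,P", "E2,V,P")

# Remove the first comma and everything after it,
# leaving only the first status code
first_code <- str_remove_all(codes, "\\,.*")

first_code
```

Note that a population record such as "E2,V,P" is classified by its first code (E2), which is why it is later recoded as Endangered.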
threatened_clean <- threatened_joined |>
  mutate(
    nsw_status_extracted = stringr::str_remove_all(nsw_status, "\\,.*"),
    nsw_status_simple = case_match(
      nsw_status_extracted,
      "V" ~ "Vulnerable",
      c("E1", "E2", "E3") ~ "Endangered",
      c("E4A") ~ "Critically Endangered",
      c("E4") ~ "Extinct",
      .default = nsw_status_extracted
    )
  )
threatened_clean |>
  # re-position cols
  select(nsw_status, nsw_status_extracted, nsw_status_simple, species_name, everything())
# A tibble: 87 × 16
   nsw_status nsw_status_extracted nsw_status_simple     species_name           
   <chr>      <chr>                <chr>                 <chr>                  
 1 V,P        V                    Vulnerable            Potorous tridactylus t…
 2 E1,P       E1                   Endangered            Haematopus longirostris
 3 E2,P       E2                   Endangered            Perameles nasuta       
 4 E2,P       E2                   Endangered            Perameles nasuta       
 5 V,P        V                    Vulnerable            Haematopus fuliginosus 
 6 E1,P       E1                   Endangered            Sternula albifrons     
 7 V,P        V                    Vulnerable            Dasyurus maculatus     
 8 E1,P,3     E1                   Endangered            Callocephalon fimbriat…
 9 E4A,P      E4A                  Critically Endangered Esacus magnirostris    
10 V,P,3      V                    Vulnerable            Tyto novaehollandiae   
# ℹ 77 more rows
# ℹ 12 more variables: taxon_concept_id <chr>,
#   scientific_name_authorship <chr>, taxon_rank <chr>, kingdom <chr>,
#   phylum <chr>, class <chr>, order <chr>, family <chr>, genus <chr>,
#   vernacular_name <chr>, common_name <chr>, comm_status <chr>
Whichever method you’ve followed, you will end up with very similar datasets containing threatened species and their statuses, though the number of matched species might differ⁷.
threatened_status |>
  rmarkdown::paged_table()
threatened_clean |>
  select(species_name, nsw_status_simple, everything()) |>
  rmarkdown::paged_table()
To finish, we can save our dataframe as a csv file.
# save
write.csv(threatened_status,
          here("path", "to", "file-name.csv"))
Visualise species conservation status
Along with a species list, we can also summarise threatened_status visually. Few options are as simple and easy to understand as a bar plot. Here we’ve made a simple bar plot displaying the number of species by conservation status, and styled it with a custom font and some nicer colours.
# custom font
font_add_google("Roboto")
showtext_auto()
# count number of species by status
status_count <- threatened_status |>
  group_by(status) |>
  count()
# bar plot
bar_status <- 
  status_count |>
  arrange(-n) |>
  ggplot() +
  geom_bar(
    mapping = aes(x = status,
                  y = n,
                  fill = status),
    stat = "identity",
    colour = "transparent"
  ) + 
  labs(title = "Threatened species status in Shoalhaven, NSW (2024)",
       x = "Conservation status",
       y = "Number of species") +
  scale_fill_manual(values = c('#ab423f', '#cd826d', '#ebc09e'),
                    labels = c("Vulnerable", "Endangered", "Critically Endangered")) +
  pilot::theme_pilot(legend_position = "none",
                     grid = "",
                     axes = "l") + 
  theme(text = element_text(family = "Roboto"),
        plot.title = element_text(size = 29),
        axis.title = element_text(size = 18),
        axis.text = element_text(size = 16))
bar_status
A more eye-catching way to see a taxonomic breakdown of species is a waffle chart. Waffle charts are great because they display number and proportion all at once. For more advanced R users, waffle charts can be a useful summary tool.
library(waffle)
library(glue)
library(marquee)
# Count number of species by taxonomic group
taxa_table <- threatened_status |>
  mutate(
    taxa_group = case_when(
      class == "Aves" ~ "Birds",
      class == "Reptilia" ~ "Reptiles",
      class == "Mammalia" ~ "Mammals",
      kingdom == "Plantae" ~ "Plants",
      .default = "Other"
    )
  ) |>
  group_by(taxa_group) |>
  summarise(n = n()) |>
  mutate(proportion = n/sum(n)*100)
# waffle chart
waffle_taxa <- 
  ggplot() +
  waffle::geom_waffle(
    data = taxa_table |> arrange(-n),             # reorder highest to lowest
    mapping = aes(fill = reorder(taxa_group, -n), # reorder legend
                  values = n),
    colour = "white",
    n_rows = 8,
    size = 1
    ) +
  scale_fill_manual(name = "",
                    values = c('#567c7c', '#687354', '#C3CB80', '#c4ac79', '#38493a'),
                    labels = c("Birds", "Mammals", "Plants", "Reptiles", "Other")
                    ) +
  labs(title = marquee_glue("Taxonomic breakdown of threatened species in Shoalhaven, NSW (2024)"),
       caption = marquee_glue("1 {cli::symbol$square_small_filled} = 1 species")) +
  coord_equal() + 
  theme_void() + 
  theme(legend.position = "bottom",
        text = element_text(family = "Roboto"),
        legend.title = element_text(hjust = 0.5, size = 20),
        legend.text = element_text(size = 18),
        plot.title = element_marquee(hjust = 0.5, size = 14, margin = margin(b=5), family = "Roboto"),
        plot.caption = element_marquee(size = 12, hjust = 1),
        plot.margin = margin(0.5, 1, 0.5, 1, unit = "cm"))
waffle_taxa
Final thoughts
We hope this post has helped you understand how to download a species list for a specific area and compare it to conservation lists. It’s also possible to compare species with other information like lists of migratory species or seasonal species.
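As a minimal sketch of that kind of comparison, the example below flags which species in a list also appear on a second list. Both tables are hypothetical: species_toy reuses a few species names from the outputs above, and other_list is a made-up stand-in for whatever list you want to compare against (its membership is illustrative only and does not reflect any real migratory or seasonal list).

```r
library(dplyr)
library(tibble)

# A few species names taken from the outputs earlier in this post
species_toy <- tibble(
  species_name = c("Sternula albifrons",
                   "Perameles nasuta",
                   "Tyto novaehollandiae")
)

# Hypothetical second list; membership is illustrative, not real data
other_list <- tibble(
  scientific_name = c("Sternula albifrons", "Aus bus")
)

# Flag species that also appear on the second list
flagged <- species_toy |>
  mutate(on_other_list = species_name %in% other_list$scientific_name)

flagged

# Count flagged vs unflagged species
flagged |> count(on_other_list)
```

The same caveat about name matching applies here: an exact comparison like %in% will miss species whose names are formatted differently between lists.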
For other posts, check out our beginner’s guide to map species observations or see an investigation of dingo observations in the ALA.
Expand for session info
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.5.0 (2025-04-11 ucrt)
 os       Windows 11 x64 (build 22631)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_Australia.utf8
 ctype    English_Australia.utf8
 tz       Australia/Sydney
 date     2025-07-23
 pandoc   3.4 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 dplyr       * 1.1.4   2023-11-17 [1] CRAN (R 4.3.2)
 forcats     * 1.0.0   2023-01-29 [1] CRAN (R 4.3.2)
 galah       * 2.1.2   2025-06-12 [1] CRAN (R 4.5.0)
 ggplot2     * 3.5.1   2024-04-23 [1] CRAN (R 4.4.3)
 glue        * 1.8.0   2024-09-30 [1] CRAN (R 4.4.2)
 here        * 1.0.1   2020-12-13 [1] CRAN (R 4.3.2)
 htmltools   * 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.1)
 lubridate   * 1.9.4   2024-12-08 [1] CRAN (R 4.4.2)
 marquee     * 1.0.0   2025-01-20 [1] CRAN (R 4.5.0)
 ozmaps      * 0.4.5   2021-08-03 [1] CRAN (R 4.3.2)
 pilot       * 4.0.0   2022-07-13 [1] Github (olihawkins/pilot@f08cc16)
 purrr       * 1.0.4   2025-02-05 [1] CRAN (R 4.4.3)
 readr       * 2.1.5   2024-01-10 [1] CRAN (R 4.3.3)
 readxl      * 1.4.3   2023-07-06 [1] CRAN (R 4.3.2)
 rmapshaper  * 0.5.0   2023-04-11 [1] CRAN (R 4.3.2)
 sessioninfo * 1.2.2   2021-12-06 [1] CRAN (R 4.3.2)
 sf          * 1.0-20  2025-03-24 [1] CRAN (R 4.4.3)
 showtext    * 0.9-7   2024-03-02 [1] CRAN (R 4.4.1)
 showtextdb  * 3.0     2020-06-04 [1] CRAN (R 4.3.2)
 stringr     * 1.5.1   2023-11-14 [1] CRAN (R 4.3.2)
 sysfonts    * 0.8.9   2024-03-02 [1] CRAN (R 4.4.1)
 tibble      * 3.2.1   2023-03-20 [1] CRAN (R 4.3.2)
 tidyr       * 1.3.1   2024-01-24 [1] CRAN (R 4.3.3)
 tidyverse   * 2.0.0   2023-02-22 [1] CRAN (R 4.3.2)
 waffle      * 1.0.2   2024-05-03 [1] Github (hrbrmstr/waffle@767875b)
 [1] C:/Users/KEL329/R-packages
 [2] C:/Users/KEL329/AppData/Local/Programs/R/R-4.5.0/library
──────────────────────────────────────────────────────────────────────────────
Footnotes
- Each spatial layer has a two-letter code, along with a number to identify it. The abbreviations are as follows:
  - cl = contextual layer (i.e. boundaries of LGAs, Indigenous Protected Areas, States/Territories etc.)
  - 11170 = number associated with the spatial layer in the atlas↩︎
- We used right_join() this time because we wanted to first select columns from nsw_threatened, then join so that we keep all 90+ rows in threatened (using left_join() would keep all 1,000+ rows in nsw_threatened instead).↩︎
- Simplifying a shapefile reduces the total number of points that draw the shape outline.↩︎
- Check out this post for a better explanation of what CRS is and how it affects maps.↩︎ 
- On a related note, it’s possible to download a list specifically for Shoalhaven on the BioNet Atlas website. However, results from BioNet will be matched to BioNet records only. As a result, fewer species will be identified compared to the ALA, which matches NSW BioNet data as well as data from other sources.↩︎
- We can double-check status information by viewing the species list in Excel and clicking on links in the info column. This is handy for verifying species status codes or learning more about each species and status.↩︎
- This is due to differences in taxonomic names in the externally downloaded list and in ALA data. More info can be found under the “Names Matching” tab in the Shapefile + list section.↩︎