Download a species list and cross-reference with conservation status lists in Python

Knowing what species have been observed in a local area is a regular task for ecosystem management. Here we show how to make a species list with {galah-python} and how to cross-reference this list with threatened and sensitive species lists. We then show how to visualise this information as a waffle chart using {pywaffle} & {matplotlib}.

Eukaryota
Animalia
Plantae
Summaries
Python
Authors

Dax Kellie

Amanda Buyan

Published

February 12, 2024

Modified

March 5, 2026

Date

20 July 2025

Tip

Updated 5 March, 2026

Knowing what species inhabit an area is important for conservation and ecosystem management. In particular, it can help us find how many known species are in a given area, and whether any species are vulnerable or endangered.

In this post, we will present two options, one using the galah-python package, the other using an external shapefile and list. Using either workflow, we will show you how to download a list of species within a Local Government Area (Shoalhaven, NSW), cross-reference this list with a state conservation status list, and visualise the number of threatened species in the region with pywaffle and plotnine.

For those unfamiliar with Australian geography, Shoalhaven is located here:

Let’s first load our packages. To download species lists, you will also need to enter a registered email with the ALA using galah_config().

import galah

galah.galah_config(email = "your-email-here") # ALA-registered email

Download threatened species in an area

Choose which method you would like to view:

  • {galah-python} (using fields downloaded from the Atlas of Living Australia)
  • Downloaded shapefile + species list

The method you choose depends on whether the region or list you wish to return species for is already in {galah-python}, or whether you wish to filter for a more specific area defined by a separate shapefile or list. Keep in mind that using an external list may require additional work matching taxonomic names.

Whichever method you’ve followed, you will end up with very similar datasets containing threatened species and their statuses, though the number of matched species might differ (see footnote 7).
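
In both methods, the cross-referencing step amounts to joining a species list to a status list on the species name. The sketch below shows that idea with pandas on made-up data: the species names, the column names (`species`, `scientific_name`, `status`, `name_key`) and the `clean_name()` helper are all hypothetical, and trimming whitespace and lower-casing is only one simple assumption about how names might differ, not a full taxonomic name-matching method.

```python
import pandas as pd

# Hypothetical species list for an area (placeholder names, not real taxa)
species_list = pd.DataFrame({
    "species": ["Exampleus alpha", "Exampleus beta", "Exampleus gamma"],
})

# Hypothetical conservation status list with slightly inconsistent names
status_list = pd.DataFrame({
    "scientific_name": ["exampleus alpha", " Exampleus beta ", "Exampleus delta"],
    "status": ["Endangered", "Vulnerable", "Endangered"],
})

def clean_name(names: pd.Series) -> pd.Series:
    """Trim whitespace and lower-case names so trivial differences still match."""
    return names.str.strip().str.lower()

species_list["name_key"] = clean_name(species_list["species"])
status_list["name_key"] = clean_name(status_list["scientific_name"])

# Left join keeps every species in the area; `_merge` records whether it matched
matched = species_list.merge(status_list, on="name_key", how="left", indicator=True)

# Species found on the status list, and species with no match
threatened = (matched.loc[matched["_merge"] == "both", ["species", "status"]]
              .reset_index(drop=True))
unmatched = matched.loc[matched["_merge"] == "left_only", "species"].tolist()

print(threatened)
print("No match on the status list:", unmatched)
```

Listing the unmatched names is a quick way to spot species that dropped out of the join only because of a spelling or naming difference.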

To finish, we can save our dataframe as a csv file.

# save (to_csv() is a DataFrame method, not a top-level pandas function)
threatened_status.to_csv('path/to/file-name.csv')

Visualise species conservation status

Along with a species list, we can summarise threatened_status visually. Few options are as simple and easy to understand as a bar plot. Here we’ve made a bar plot displaying the number of species by conservation status, and styled it with a custom font and some nicer colours.

For nicer formatting, we can use the Roboto font. Download it from Google Fonts, save the folder in your current directory, unzip it, and load the font with the matplotlib library.

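# import the plotnine functions used to build the plot
from plotnine import (ggplot, aes, geom_bar, labs, scale_fill_manual,
                      theme_classic, theme, element_text, element_rect)

# import packages for font management
import matplotlib.font_manager as fm
import os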

# add Roboto custom font
project_dir = os.getcwd()
font_path = os.path.join(project_dir, "Roboto-VariableFont_wdth,wght.ttf")
prop = fm.FontProperties(fname=font_path)

# bar plot
bar_status = (
  ggplot(threatened_status)
  + geom_bar(aes(x='status',fill='status')) # one bar per status, filled by status
  + labs(title = "Threatened species status in Shoalhaven, NSW (2024)",
         x = "Conservation status",
         y = "Number of species") 
  + scale_fill_manual(values = ['#ab423f', '#cd826d', '#ebc09e'],
                      labels = ["Vulnerable", "Endangered", "Critically Endangered"]) 
  + theme_classic()
  + theme(
        legend_position = "none",
        plot_title = element_text(fontproperties=prop, size = 14),
        axis_title_x = element_text(fontproperties=prop, size = 10, vjust = -0.1), 
        axis_title_y = element_text(angle = 90, fontproperties=prop, size = 10, hjust = -.1),
        axis_text_x = element_text(fontproperties=prop, size = 8, vjust = -0.1),
        axis_text_y = element_text(fontproperties=prop, size = 8),
        plot_background = element_rect(fill = "#ffffff")
  )
)

bar_status.show()
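
The bar heights in a plot like this are simply row counts per status, so it can be worth checking them in pandas and fixing the order of the categories before plotting. The sketch below uses a made-up stand-in for `threatened_status` (`toy_status` is hypothetical), and the ordering in `status_order` is an assumption for illustration. plotnine, like ggplot2, follows the category order of a categorical column, so setting it explicitly keeps the bars and the manual fill colours in a predictable order.

```python
import pandas as pd

# Made-up stand-in for threatened_status: one row per species
toy_status = pd.DataFrame({
    "status": ["Vulnerable", "Endangered", "Vulnerable",
               "Critically Endangered", "Vulnerable", "Endangered"],
})

# Assumed display order for the statuses
status_order = ["Vulnerable", "Endangered", "Critically Endangered"]

# Count species per status, reported in the chosen order
counts = toy_status["status"].value_counts().reindex(status_order, fill_value=0)
print(counts)

# Store the column as an ordered categorical so plots follow the same order
toy_status["status"] = pd.Categorical(
    toy_status["status"], categories=status_order, ordered=True
)
print(list(toy_status["status"].cat.categories))
```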

A more eye-catching way to see a taxonomic breakdown of species is a waffle chart. Waffle charts are great because they display number and proportion all at once, which makes them a useful summary tool for more advanced Python users. Unfortunately, waffle charts aren’t available yet in {plotnine}, so we will be using {pywaffle} with {matplotlib}.

import matplotlib.pyplot as plt
from pywaffle import Waffle 
import matplotlib as mpl
from matplotlib import font_manager
from matplotlib import rcParams

# condense all taxa in one column 
threatened_status['taxa_for_waffle'] = threatened_status.apply(
              lambda row: 
                row['Kingdom'] if (row['Kingdom'] == 'Plantae') 
                               else (row['Class'] if row['Class'] in ['Aves','Reptilia','Mammalia'] else 'Other'),
              axis=1
    ).replace({'Aves': 'Birds', 'Reptilia': 'Reptiles', 'Mammalia': 'Mammals','Plantae': 'Plants'})

# import Roboto font into matplotlib
font_files = font_manager.findSystemFonts(fontpaths="Roboto/")
for ff in font_files:
  font_manager.fontManager.addfont(ff)
rcParams['font.family'] = 'Roboto'

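# set up the plot
fig,ax = plt.subplots(figsize=(10,6))

# count species per group, in the same order as the legend labels below
# (value_counts() alone sorts by frequency, which may not match the labels)
taxa_order = ['Birds', 'Mammals', 'Plants', 'Reptiles', 'Other']
status_counts = (threatened_status['taxa_for_waffle']
                 .value_counts()
                 .reindex(taxa_order, fill_value=0)
                 .tolist())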

# make waffle chart
Waffle.make_waffle(
    ax=ax,
    rows=8,
    values=status_counts, 
    colors = ['#567c7c', '#687354', '#C3CB80', '#c4ac79', '#38493a'],
    interval_ratio_x = 0.1,
    interval_ratio_y = 0.1,
    legend={
        'labels': ['Birds', 'Mammals', 'Plants', 'Reptiles', 'Other'],
        'loc': 'lower center',
        'bbox_to_anchor': (0.5,-.1),
        'ncol': 5,
        'framealpha': 0,
        'fontsize': 14
    }
)

# make plot prettier
plt.suptitle('Taxonomic breakdown of threatened species in Shoalhaven, NSW (2024)',fontsize=20,font='Roboto')
fig.text(0.75,0.02,"*1 square = 1 species",fontsize=16)

plt.show()
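
Because a waffle chart shows number and proportion together, it can help to tabulate both before plotting. The sketch below does this on made-up data: `toy_taxa`, `label_order` and the counts are invented for illustration. It also estimates how many columns a chart with 8 rows would need at one square per species, on the assumption that the column count is derived from the total when only `rows` is given.

```python
import math
import pandas as pd

# Made-up group labels, one entry per species
toy_taxa = pd.Series(
    ["Plants"] * 9 + ["Birds"] * 6 + ["Mammals"] * 3 + ["Other"] * 2,
    name="taxa_for_waffle",
)

# Fixed label order, so counts and legend labels always line up
label_order = ["Birds", "Mammals", "Plants", "Reptiles", "Other"]
counts = toy_taxa.value_counts().reindex(label_order, fill_value=0)

# Share of each group as a percentage of all species
percent = (counts / counts.sum() * 100).round(1)

# With 1 square = 1 species and 8 rows, enough columns are needed for the total
rows = 8
columns = math.ceil(counts.sum() / rows)

# Legend labels that carry the count as well as the group name
legend_labels = [f"{taxon} ({n})" for taxon, n in counts.items()]

print(pd.DataFrame({"count": counts, "percent": percent}))
print(f"{rows} rows x {columns} columns")
print(legend_labels)
```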

Final thoughts

We hope this post has helped you understand how to download a species list for a specific area and compare it to conservation lists. It’s also possible to compare species with other information like lists of migratory species or seasonal species.

For other posts, check out our beginner’s guide to map species observations or see an investigation of dingo observations in the ALA.

Session info

-----
IPython             9.10.0
galah               0.13.0
geopandas           1.1.2
itables             2.7.0
janitor             0.32.20
matplotlib          3.10.8
natsort             8.4.0
pandas              3.0.1
plotnine            0.15.3
pywaffle            NA
session_info        v1.0.1
shapely             2.1.2
-----
Python 3.14.3 (tags/v3.14.3:323c59a, Feb  3 2026, 16:04:56) [MSC v.1944 64 bit (AMD64)]
Windows-11-10.0.26100-SP0
-----
Session information updated at 2026-03-05 14:02

Footnotes

  1. Each spatial layer has a two letter code, along with a number to identify it. The abbreviations are as follows:
    * cl = contextual layer (i.e. boundaries of LGAs, Indigenous Protected Areas, States/Territories etc.)
    * 11170 = number associated with the spatial layer in the atlas↩︎

  2. We used right_join() this time because we wanted to first select columns from nsw_threatened, then join so that we keep all 90+ rows in threatened (using left_join() would keep all 1,000+ rows in nsw_threatened instead).↩︎

  3. Check out this post for a better explanation of what CRS is and how it affects maps.↩︎

  4. Simplifying a shapefile reduces the total number of points that draw the shape outline.↩︎

  5. On a related note, it’s possible to download a list specifically for Shoalhaven on the BioNet Atlas website. However, results from BioNet will be matched to BioNet records only. As a result, fewer species will be identified compared to the ALA, which matches NSW BioNet data as well as data from other sources.↩︎

  6. We can double check status information by viewing the species list in Excel and clicking on links in the info column. This is handy for double checking species status codes or learning more about each species and status.↩︎

  7. This is due to differences in taxonomic names in the externally downloaded list and in ALA data. More info can be found under the “Names Matching” tab in the Shapefile + list section.↩︎