{rworms}: an R package for retrieving information from WoRMS

Multivariate Analysis
Published: February 9, 2026

This package is a by-product of my master's thesis. It uses the WoRMS API to retrieve data automatically, which makes taxonomic work slightly easier. I started writing the functions while working on my thesis, and the package was first made publicly available during the Ocean Census Isopod Workshop.

Suggestions, issues, bugs: please let me know!

The package is open source and can be installed from GitHub:

# install from github
devtools::install_github("https://github.com/zzzhehao/rworms.git")
library(rworms)


It currently provides three functions. The core input for all of them is the AphiaID of the taxon of interest.

get_taxonomy

get_taxonomy() takes a numeric vector of AphiaIDs (typically of species) and retrieves the taxonomic information for each of them, provided the AphiaIDs are valid. Personally, I find the reference_citation field (along with reference_year and reference_doi, if available) the most helpful. With these, I do not need to open every page on WoRMS just to get the title of the publication that described the species.

For example, let us look at Janirellidae Menzies, 1956, whose AphiaID is 118257:

jani <- get_taxonomy(118257)
jani

A one-row table is returned, since we supplied only one AphiaID. If more valid IDs are provided, each one is returned as a row. Different taxon ranks can be mixed here.

zztax <- get_taxonomy(c(118333, 183189, 175137))
zztax
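
To pull out just the reference fields mentioned earlier, ordinary data frame subsetting is enough. The sketch below uses a small stand-in table with placeholder values instead of a live WoRMS query. The column names aphiaID, reference_citation, reference_year and reference_doi are taken from this post; the rank_placeholder column and all values are assumptions made only for illustration.

```r
# Stand-in for the result of get_taxonomy(); all values are placeholders,
# not real WoRMS records.
zztax_demo <- data.frame(
  aphiaID            = c(1, 2, 3),
  rank_placeholder   = c("Family", "Genus", "Species"),  # hypothetical column
  reference_citation = c("Citation A", NA, "Citation C"),
  reference_year     = c(1956, NA, 2001),
  reference_doi      = c(NA, NA, "10.0000/placeholder"),
  stringsAsFactors   = FALSE
)

# Keep only the ID and the reference fields
ref_cols <- c("aphiaID", "reference_citation", "reference_year", "reference_doi")
refs <- zztax_demo[, ref_cols]

# Drop taxa for which no citation is available
refs_available <- refs[!is.na(refs$reference_citation), ]
refs_available
```

The same subsetting should work on the real output as long as these columns are present.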

Of course, the function checks whether each AphiaID is valid:

error <- tryCatch({

    # Call with invalid AphiaID:
    get_taxonomy(c(118333, 183189, 175137000))

    }, error = function(e) {e})
print(error)
<error/rlang_error>
Error in `.is_valid_aphiaID()`:
! AphiaID 175137000 is not valid.
---
Backtrace:
    ▆
 1. ├─base::tryCatch(...)
 2. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 3. │   └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 4. │     └─base (local) doTryCatch(return(expr), name, parentenv, handler)
 5. ├─rworms::get_taxonomy(c(118333, 183189, 175137000))
 6. │ ├─AphiaRecordsFullByAphiaIDs(aphiaIDs) %>% ...
 7. │ └─rworms:::AphiaRecordsFullByAphiaIDs(aphiaIDs)
 8. │   └─rworms:::.is_valid_aphiaID(aphiaIDs)
 9. └─dplyr::mutate(., aphiaID = as.numeric(aphiaID))
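
Because a single invalid ID aborts the whole call, it can be handy to query the IDs one at a time and collect the failures separately. The following is a minimal sketch and not part of rworms: lookup_each() is a hypothetical helper, and fake_fetch() is a stand-in that mimics the error above so that the example runs offline. In real use one would presumably pass get_taxonomy as the fetch argument, assuming the one-row results can be row-bound.

```r
# Hypothetical helper: call `fetch` once per ID and separate successes
# from failures instead of aborting on the first error.
lookup_each <- function(ids, fetch) {
  results <- list()
  failed  <- numeric(0)
  for (id in ids) {
    res <- tryCatch(fetch(id), error = function(e) NULL)
    if (is.null(res)) {
      failed <- c(failed, id)
    } else {
      results[[length(results) + 1]] <- res
    }
  }
  list(
    data   = if (length(results) > 0) do.call(rbind, results) else NULL,
    failed = failed
  )
}

# Stand-in for get_taxonomy(): rejects one ID, returns a one-row table otherwise.
fake_fetch <- function(id) {
  if (id == 175137000) stop("AphiaID ", id, " is not valid.")
  data.frame(aphiaID = id)
}

out <- lookup_each(c(118333, 183189, 175137000), fake_fetch)
out$failed        # the ID that could not be retrieved
nrow(out$data)    # number of rows retrieved successfully
```

The price of this approach is one request per ID rather than one batched request, so it is probably best reserved for cleaning up a list of IDs of uncertain quality.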

get_taxonomy_hr

More commonly, one would like to have the information for all species belonging to a higher taxon, such as a genus or a family. For this purpose, get_taxonomy_hr() takes one AphiaID and returns the taxonomic information for all species belonging to that taxon.

For instance, we pass the ID of Janirellidae and get all species belonging to that family:

janis <- get_taxonomy_hr(118257)
janis

This function also takes additional arguments that alter the search filter. Run ?get_taxonomy_hr for details. Since the arguments are passed via ..., you have to specify them by name. Options are:

  • recursive: logical. Whether to loop through children that are not species. If TRUE (default), all species belonging to the given taxon are returned; if FALSE, only the direct children are returned.
  • accept: logical. Whether to only return the accepted taxa.
  • marine: logical. Whether to only return the marine taxa.
  • extant: logical. Whether to only return the extant taxa.
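
To see why the options have to be named, here is a minimal sketch of how arguments travel through ... in R. Both functions are hypothetical and do not reflect the actual internals of get_taxonomy_hr(). The defaults shown for accept, marine and extant are assumptions as well; only the default of recursive is stated above.

```r
# Hypothetical inner function that holds the filter options
build_filter <- function(recursive = TRUE, accept = TRUE,
                         marine = TRUE, extant = TRUE) {
  list(recursive = recursive, accept = accept, marine = marine, extant = extant)
}

# Hypothetical outer function: everything after `id` is forwarded via `...`
demo_hr <- function(id, ...) {
  build_filter(...)
}

# Named arguments reach the option they are meant for
named <- demo_hr(155716, accept = FALSE, recursive = FALSE)
str(named)

# An unnamed argument is matched by position in the inner function,
# so this FALSE ends up in `recursive`, not in `accept`
unnamed <- demo_hr(155716, FALSE)
str(unnamed)
```

Naming every option avoids this kind of silent mismatch.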

A tip: when searching with accept = FALSE, it might be smart to also set recursive = FALSE and cherry-pick the taxa that really interest you. Otherwise the function might retrieve tons of data that are not needed.

janiroids <- get_taxonomy_hr(155716, accept = FALSE, recursive = FALSE)
janiroids

get_distribution

If the taxonomic information alone is not enough, it also makes sense to look at the distribution records. However, please note that many occurrence records do not end up in WoRMS but, more commonly, on other platforms such as OBIS and GBIF. These also have APIs through which the data can be conveniently retrieved within R, and yes, I have the code for that. It is just not written as generalized functions yet (but they are coming soon).

As we just retrieved the taxonomic data for all janirellids, which includes the AphiaIDs of all the species, we can now simply pass those IDs to the function (so yes, this function is vectorized and accepts a numeric vector, just like get_taxonomy()).

janis.dist <- get_distribution(janis$aphiaID)
janis.dist

As not all species have distribution data available on WoRMS, the function will probably complain about not finding data. As long as the run is not aborted, you will get all available results at the end. It might take a few minutes to retrieve and parse the data, so please be patient and don't panic.

Finally, since all data returned by these functions are data.frames, you can use the usual R functions to export them in any way you like.
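
To find out afterwards which species came back without any distribution record, the requested IDs can be compared with the IDs present in the result. This sketch uses placeholder tables and assumes that the output of get_distribution() carries an aphiaID column, which is not confirmed here; adjust the column name if it differs. The locality column is likewise hypothetical.

```r
# Placeholder inputs: the IDs requested and a stand-in distribution table
requested_ids <- c(101, 102, 103, 104)
dist_demo <- data.frame(
  aphiaID  = c(101, 101, 103),                  # assumed column name
  locality = c("Area A", "Area B", "Area C"),   # hypothetical column
  stringsAsFactors = FALSE
)

# IDs that were requested but returned no records
missing_ids <- setdiff(requested_ids, dist_demo$aphiaID)
missing_ids

# Number of records per species that did return data
table(dist_demo$aphiaID)
```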

write.table(janis.dist, "Janirellidae_distribution.csv", sep = ";", row.names = FALSE)
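
A quick round trip shows how such a file can be read back in. The sketch below writes a placeholder table (standing in for janis.dist, with hypothetical columns) to a temporary file and reads it again. Since write.table() keeps "." as the decimal mark, the file is read with read.csv() and sep = ";" rather than read.csv2(), which expects a decimal comma.

```r
# Placeholder table standing in for janis.dist
dist_demo <- data.frame(
  aphiaID  = c(101, 103),
  locality = c("Area A", "Area B"),
  stringsAsFactors = FALSE
)

# Write with a semicolon separator to a temporary file
out_file <- tempfile(fileext = ".csv")
write.table(dist_demo, out_file, sep = ";", row.names = FALSE)

# Read it back with the same separator
dist_back <- read.csv(out_file, sep = ";", stringsAsFactors = FALSE)
str(dist_back)
```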

Reference

R session


R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.5

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rworms_0.0.0.9000 lubridate_1.9.5 forcats_1.0.0 stringr_1.6.0
[5] dplyr_1.2.0 purrr_1.2.1 readr_2.1.5 tidyr_1.3.2
[9] tibble_3.3.1 ggplot2_4.0.2 tidyverse_2.0.0 knitr_1.51
[13] rmarkdown_2.30 pacman_0.5.1 zWeb_0.0.1

Packages


knitr

Version: 1.51

Xie Y (2025). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.51, https://yihui.org/knitr/.

Xie Y (2015). Dynamic Documents with R and knitr, 2nd edition. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 978-1498716963, https://yihui.org/knitr/.

Xie Y (2014). “knitr: A Comprehensive Tool for Reproducible Research in R.” In Stodden V, Leisch F, Peng RD (eds.), Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN 978-1466561595.

pacman

Version: 0.5.1

Rinker TW, Kurkiewicz D (2018). pacman: Package Management for R. version 0.5.0, http://github.com/trinker/pacman.

rmarkdown

Version: 2.30

Allaire J, Xie Y, Dervieux C, McPherson J, Luraschi J, Ushey K, Atkins A, Wickham H, Cheng J, Chang W, Iannone R (2025). rmarkdown: Dynamic Documents for R. R package version 2.30, https://github.com/rstudio/rmarkdown.

Xie Y, Allaire J, Grolemund G (2018). R Markdown: The Definitive Guide. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 9781138359338, https://bookdown.org/yihui/rmarkdown.

Xie Y, Dervieux C, Riederer E (2020). R Markdown Cookbook. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 9780367563837, https://bookdown.org/yihui/rmarkdown-cookbook.

tidyverse

Version: 2.0.0

Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.

Citation

BibTeX citation:
@online{hu2026,
  author = {Hu, Zhehao},
  title = {\{Rworms\}: An {R} Package for Retrieving Information from
    {WoRMS}},
  date = {2026-02-09},
  url = {https://zzzhehao.github.io/post/research/techs/rworms.html},
  langid = {en}
}
For attribution, please cite this work as:
Hu, Zhehao. 2026. “{Rworms}: An R Package for Retrieving Information from WoRMS.” February 9, 2026. https://zzzhehao.github.io/post/research/techs/rworms.html.