{rworms}: an R package for retrieving information from WoRMS
This package is a by-product of my master's thesis. It uses the WoRMS API to automate data retrieval and make taxonomic work slightly easier. I started writing the functions while working on my thesis, and the package was first made publicly available during the Ocean Census Isopod Workshop.
Suggestions, issues, bugs, please let me know!
The package is available on GitHub and is therefore also open source:
# install from github
devtools::install_github("https://github.com/zzzhehao/rworms.git")
library(rworms)
It currently provides three functions. The core input for all of them is the AphiaID of the taxon of interest.
get_taxonomy
get_taxonomy() takes a numeric vector of AphiaIDs (typically of species) and returns the taxonomic information for all of them, provided the AphiaIDs are valid. Personally, I find the reference_citation field (along with reference_year and reference_doi, if available) the most helpful: with it, I do not need to open every page on WoRMS just to get the title of the publication that described the species.
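As a rough sketch of that workflow, the reference fields can be pulled out of the returned table with ordinary data-frame subsetting. The table below is a hand-made stand-in for a get_taxonomy() result so that the snippet runs offline: the column names aphiaID, reference_citation, reference_year and reference_doi are taken from the text above, while the values are placeholders, not real WoRMS records. That missing fields show up as NA is also an assumption.

```r
# Stand-in for a get_taxonomy() result; values are placeholders, not WoRMS data.
tax <- data.frame(
  aphiaID            = c(1, 2, 3),
  reference_citation = c("Placeholder citation A", "Placeholder citation B", NA),
  reference_year     = c(1956, 1962, NA),
  reference_doi      = c(NA, "10.0000/placeholder", NA),
  stringsAsFactors   = FALSE
)

# Keep only the ID and the reference fields.
ref_cols <- c("aphiaID", "reference_citation", "reference_year", "reference_doi")
refs <- tax[, ref_cols]

# Rows without a citation (assumed to be NA) still need a manual look on WoRMS.
missing_refs <- refs$aphiaID[is.na(refs$reference_citation)]

print(refs)
print(missing_refs)
```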
For example, let us look at the Janirellidae Menzies, 1956, whose AphiaID is 118257:
jani <- get_taxonomy(118257)
jani
A one-row table is returned, since we supplied only one AphiaID. If more valid IDs are provided, each is returned as its own row. Different taxon ranks can be mixed here.
zztax <- get_taxonomy(c(118333, 183189, 175137))
zztax
Of course, the function checks whether each AphiaID is valid:
error <- tryCatch({
  # Call with an invalid AphiaID:
  get_taxonomy(c(118333, 183189, 175137000))
}, error = function(e) {e})
print(error)
<error/rlang_error>
Error in `.is_valid_aphiaID()`:
! AphiaID 175137000 is not valid.
---
Backtrace:
▆
1. ├─base::tryCatch(...)
2. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
3. │ └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
4. │ └─base (local) doTryCatch(return(expr), name, parentenv, handler)
5. ├─rworms::get_taxonomy(c(118333, 183189, 175137000))
6. │ ├─AphiaRecordsFullByAphiaIDs(aphiaIDs) %>% ...
7. │ └─rworms:::AphiaRecordsFullByAphiaIDs(aphiaIDs)
8. │ └─rworms:::.is_valid_aphiaID(aphiaIDs)
9. └─dplyr::mutate(., aphiaID = as.numeric(aphiaID))
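Because a single invalid ID aborts the whole call, it can help to screen a batch one ID at a time. The following is a minimal sketch and not part of rworms: try_each() is a hypothetical helper, and fake_lookup() is a stand-in for get_taxonomy() so that the example runs without network access. With the real package you would pass get_taxonomy instead, at the cost of one call per ID.

```r
# Hypothetical helper: apply `lookup` to each ID separately and
# collect the IDs that raise an error instead of aborting the batch.
try_each <- function(ids, lookup) {
  results <- list()
  failed  <- c()
  for (id in ids) {
    res <- tryCatch(lookup(id), error = function(e) NULL)
    if (is.null(res)) {
      failed <- c(failed, id)
    } else {
      results[[length(results) + 1]] <- res
    }
  }
  list(data = do.call(rbind, results), failed = failed)
}

# Stand-in for get_taxonomy(): errors on IDs above an arbitrary threshold.
fake_lookup <- function(id) {
  if (id > 1e6) stop("AphiaID ", id, " is not valid.")
  data.frame(aphiaID = id)
}

out <- try_each(c(118333, 183189, 175137000), fake_lookup)
print(out$data)    # rows for the IDs that passed
print(out$failed)  # IDs that were rejected
```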
get_taxonomy_hr
More commonly, one wants the information for all species belonging to a higher taxon, such as a genus or a family. For this purpose, get_taxonomy_hr() takes one AphiaID and returns the taxonomic information for all species belonging to that taxon.
For instance, we give the ID of the Janirellidae and get all species belonging to that family:
janis <- get_taxonomy_hr(118257)
janis
This function also takes additional arguments that alter the search filter. Run ?get_taxonomy_hr for details. Since the arguments are passed via ..., you have to specify them by name. Options are:
recursive: logical. Whether to loop through children that are not species. If TRUE (default), all species belonging to the given taxon are returned; if FALSE, only the direct children are returned.
accept: logical. Whether to only return the accepted taxa.
marine: logical. Whether to only return the marine taxa.
extant: logical. Whether to only return the extant taxa.
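The need to name these arguments follows from how R handles ...: values captured there are typically looked up by name inside the function, so a bare TRUE or FALSE cannot be matched to any option. The toy function below is a hypothetical illustration of that mechanism, not the actual implementation of get_taxonomy_hr(). Only the default of recursive comes from the text above; the defaults for accept, marine and extant are invented for the demo.

```r
# Toy function (hypothetical): filter options arrive through `...`
# and are looked up by name, falling back to defaults.
show_filters <- function(aphiaID, ...) {
  # Only recursive = TRUE is documented above; the rest are made up here.
  defaults <- list(recursive = TRUE, accept = TRUE, marine = TRUE, extant = TRUE)
  supplied <- list(...)
  # Unnamed extras cannot be assigned to any option, so reject them.
  if (length(supplied) > 0 &&
      (is.null(names(supplied)) || any(names(supplied) == ""))) {
    stop("Filter options must be named, e.g. accept = FALSE.")
  }
  utils::modifyList(defaults, supplied)
}

opts <- show_filters(155716, accept = FALSE, recursive = FALSE)
str(opts)

# An unnamed value is ambiguous and is rejected by this toy function.
res <- tryCatch(show_filters(155716, FALSE),
                error = function(e) conditionMessage(e))
print(res)
```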
A tip: if you run a search with accept = FALSE, it may be wise to also set recursive = FALSE and cherry-pick the taxa that really interest you. Otherwise it might retrieve large amounts of data that are not needed.
janiroids <- get_taxonomy_hr(155716, accept = FALSE, recursive = FALSE)
janiroids
get_distribution
If the taxonomic information alone is not enough, it also makes sense to look at the distribution records. However, please note that many occurrence records do not end up in WoRMS but, more commonly, on other platforms such as OBIS and GBIF. These also have APIs through which the data can be conveniently retrieved from within R, and yes, I have the code. It is just not written as generalized functions yet (but they are coming soon).
We just got all the taxonomic data for the Janirellidae, which includes the AphiaIDs of all the species, so we can simply pass those IDs to the function (so yes, this function is vectorized and accepts a numeric vector, just like get_taxonomy()).
janis.dist <- get_distribution(janis$aphiaID)
janis.dist
As not all species have distribution data available on WoRMS, the function will probably complain about not finding data. As long as the run is not aborted, you will get all available results at the end. It might take a few minutes to retrieve and parse the data, so please be patient and don’t panic.
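To see afterwards which species came back without any distribution records, the IDs that went in can be compared with the IDs that came out. This is a sketch under one assumption: that the table returned by get_distribution() carries an aphiaID column like the taxonomy table does. Both tables below are small stand-ins with placeholder IDs so that the snippet runs offline, and the locality column is made up.

```r
# Stand-ins with placeholder IDs; swap in janis and janis.dist for real use.
taxa <- data.frame(aphiaID = c(101, 102, 103, 104))
dist <- data.frame(
  aphiaID  = c(101, 101, 103),  # assumed ID column
  locality = c("Placeholder A", "Placeholder B", "Placeholder C"),  # made-up column
  stringsAsFactors = FALSE
)

# IDs that were queried but have no row in the distribution table.
no_records <- setdiff(taxa$aphiaID, dist$aphiaID)

# Number of distribution rows per species that did return data.
n_records <- table(dist$aphiaID)

print(no_records)
print(n_records)
```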
Finally, since all data returned by these functions are data frames, you can use the usual R functions to export them in any way you like.
write.table(janis.dist, "Janirellidae_distribution.csv", sep = ";", row.names = FALSE)
Reference
Citation
@online{hu2026,
author = {Hu, Zhehao},
title = {\{Rworms\}: An {R} Package for Retrieving Information from
{WoRMS}},
date = {2026-02-09},
url = {https://zzzhehao.github.io/post/research/techs/rworms.html},
langid = {en}
}
