```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Discover event data in GBIF

## Download DwC-A from IPT

Some essential event-type data terms are currently not searchable through the GBIF API. To locate a dataset and its UUID, you can instead use the metadata search in the "datasets" call of rgbif, or search the GBIF portal. Here, we download the dataset as a Darwin Core Archive (DwC-A) directly from the IPT using the following workflow.

1. Use the [jsonlite package](https://cran.r-project.org/web/packages/jsonlite/vignettes/json-apis.html) to get the endpoint for the raw DwC-A representation of the dataset at the IPT installation. This takes the raw API call "http://api.gbif.org/v1/dataset/'dataset UUID'/endpoint" and downloads the information as JSON. This address can also be found by scrolling down on the dataset homepage to the [data description section](https://www.gbif.org/dataset/78360224-5493-45fd-a9a0-c336557f09c3#dataDescription).
2. Then we extract the DwC-A endpoint URL and use the [curl package](https://cran.r-project.org/web/packages/curl/vignettes/intro.html) to download the archive to a temporary file.
3. Finally we extract the event and occurrence tables and join them with the `left_join` command from the [dplyr package](https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html).

Note that we remove the `id` column from both the occurrence and the event table (`%>% select(-id)`) to avoid duplicated column names after the join. In each file, `id` only repeats an identifier that is also available in its own column (such as eventID or occurrenceID).

```{r find_dataset_endpoint, warning=FALSE, message=FALSE}
library(jsonlite)
library(curl)
library(dplyr)

# Look up the endpoints registered for the dataset (jsonlite returns a data frame)
dataset <- jsonlite::fromJSON("http://api.gbif.org/v1/dataset/78360224-5493-45fd-a9a0-c336557f09c3/endpoint")
endpoint_url <- dataset$url[1] # assumes the first endpoint listed is the DwC-A

# Download the DwC-A from the IPT
tmp <- tempfile(fileext = ".zip") # temporary file for the download
curl_download(endpoint_url, tmp)
archive_files <- unzip(tmp, list = TRUE) # list the archive contents without extracting
dwca_dir <- file.path(tempdir(), "dwca") # extract here rather than into the working directory
unzip(tmp, exdir = dwca_dir)

# Read the occurrence and event tables from the DwC-A and join them.
# quote = "" keeps apostrophes and quotation marks in free-text fields from breaking the parse.
occurrence <- read.delim(file.path(dwca_dir, "occurrence.txt"), quote = "",
                         stringsAsFactors = FALSE) %>% select(-id)
event <- read.delim(file.path(dwca_dir, "event.txt"), quote = "",
                    stringsAsFactors = FALSE) %>% select(-id)
df <- left_join(event, occurrence, by = "eventID")
head(df)
```
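
To see what the join step does without downloading anything, here is a minimal sketch on two small made-up tables. The data frames `event_toy` and `occurrence_toy`, and all values in them, are hypothetical stand-ins for `event.txt` and `occurrence.txt`, not real GBIF records. The sketch also assumes that `id` simply duplicates an identifier column in each table.

```r
library(dplyr)

# Hypothetical stand-in for event.txt: three sampling events
event_toy <- data.frame(
  id        = c("E1", "E2", "E3"),
  eventID   = c("E1", "E2", "E3"),
  eventDate = c("2016-06-01", "2016-06-02", "2016-06-03"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for occurrence.txt: event E3 has no occurrences
occurrence_toy <- data.frame(
  id             = c("E1", "E1", "E2"),
  eventID        = c("E1", "E1", "E2"),
  occurrenceID   = c("O1", "O2", "O3"),
  scientificName = c("Salmo trutta", "Salvelinus alpinus", "Salmo trutta"),
  stringsAsFactors = FALSE
)

# Joining with `id` still present leaves two suffixed copies, id.x and id.y
names_with_id <- names(left_join(event_toy, occurrence_toy, by = "eventID"))

# Dropping `id` first gives one clean set of columns
joined_toy <- left_join(
  select(event_toy, -id),
  select(occurrence_toy, -id),
  by = "eventID"
)

print(names_with_id)
print(joined_toy)
```

Because this is a left join with the event table on the left, every event is kept: an event with several occurrences is repeated once per occurrence, and an event without occurrences (here `E3`) appears once with `NA` in the occurrence columns.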