library(biosysR)

Setup

The BioSys API is only accessible via HTTP basic authentication, using a valid BioSys username and password.

All biosysR functions that call the BioSys API accept optional parameters un and pw, which default to the environment variables BIOSYS_UN and BIOSYS_PW, respectively. Calling biosysR functions with invalid or empty credentials, or relying on defaults when BIOSYS_UN and BIOSYS_PW are not set, will fail with an informative error message prompting for correct authentication.
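The default-and-validate behaviour described above can be sketched in a few lines of base R. This is an illustration only, not biosysR's actual implementation: resolve_credentials is a hypothetical helper, and the wording of the error message is invented.

```r
# Hypothetical helper, not part of biosysR: take credentials from explicit
# arguments, falling back to the BIOSYS_UN / BIOSYS_PW environment variables.
resolve_credentials <- function(un = Sys.getenv("BIOSYS_UN"),
                                pw = Sys.getenv("BIOSYS_PW")) {
  # Sys.getenv() returns "" for unset variables, so one empty-string check
  # covers both "empty credentials" and "variable does not exist".
  if (!nzchar(un) || !nzchar(pw)) {
    stop("BioSys credentials missing: set BIOSYS_UN and BIOSYS_PW, ",
         "or pass un and pw explicitly.", call. = FALSE)
  }
  list(un = un, pw = pw)
}

# Explicit arguments take precedence over the environment.
creds <- resolve_credentials(un = "USERNAME", pw = "PASSWORD")
```

A check like this can only catch missing values; credentials that are present but wrong can only be rejected by the server when the request is made.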

There are three ways to supply these authentication credentials to biosysR functions.

Permanent authentication

To set BioSys credentials once and have them available in every session, add the following to your ~/.Rprofile:

Sys.setenv(BIOSYS_UN = "USERNAME")
Sys.setenv(BIOSYS_PW = "PASSWORD")

Every new R session will already contain these variables.

Session authentication

To authenticate a single session, set BIOSYS_UN and BIOSYS_PW as environment variables from within that session:

Sys.setenv(BIOSYS_UN = "USERNAME")
Sys.setenv(BIOSYS_PW = "PASSWORD")

Restarting the R session will clear these variables.
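To confirm that the session-level variables are in place, or to clear them without restarting R, something like the following should work. It uses only base R, and the values are the same placeholders as above.

```r
# Set placeholder credentials for the current session only.
Sys.setenv(BIOSYS_UN = "USERNAME", BIOSYS_PW = "PASSWORD")

# TRUE only if both variables are set to non-empty values.
creds_set <- all(nzchar(Sys.getenv(c("BIOSYS_UN", "BIOSYS_PW"))))

# Remove the variables again without restarting the session.
Sys.unsetenv(c("BIOSYS_UN", "BIOSYS_PW"))
creds_cleared <- !any(nzchar(Sys.getenv(c("BIOSYS_UN", "BIOSYS_PW"))))
```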

Per-request authentication

Supply un and pw directly to each biosysR function that calls the API:

  • biosys_projects
  • biosys_datasets
  • biosys_records

For example:

projects <- biosys_projects(un="USERNAME", pw="PASSWORD")

Doing so will hand un and pw to biosys_get, which builds the authentication headers and uses them in the request to BioSys.
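As an illustration of what building the authentication headers involves: HTTP basic authentication sends the Base64 encoding of "username:password" in an Authorization header. The sketch below assumes the jsonlite package is installed. basic_auth_header is a hypothetical name, and biosys_get may well build its headers differently, for example through an HTTP client's own helper.

```r
# Hypothetical helper, not part of biosysR: build an HTTP basic auth header.
basic_auth_header <- function(un, pw) {
  # Basic auth encodes "username:password" as Base64.
  token <- jsonlite::base64_enc(charToRaw(paste0(un, ":", pw)))
  c(Authorization = paste("Basic", token))
}

# Placeholder credentials, for illustration only.
header <- basic_auth_header("user", "pass")
```

Base64 is an encoding, not encryption, so basic authentication is only safe over HTTPS.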

Data flow

Accessing data from BioSys

Helper functions

  • Data is retrieved from the BioSys API through an HTTP GET request with basic authentication.
  • The JSON returned from the BioSys API is parsed into a tibble.
  • All heavy lifting is factored out into helper functions.
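The parsing step in the list above can be sketched without touching the network. The JSON below is made up for illustration and only loosely mimics a BioSys response. The sketch assumes the jsonlite and tibble packages and is not necessarily how the biosysR helpers are written.

```r
# Made-up JSON payload standing in for the body of an API response.
json <- '[
  {"id": 1, "name": "Project A", "code": "PA", "site_count": 41},
  {"id": 2, "name": "Project B", "code": "PB", "site_count": 163}
]'

# fromJSON() simplifies an array of flat objects into a data.frame;
# as_tibble() then converts that data.frame into a tibble.
parsed <- tibble::as_tibble(jsonlite::fromJSON(json))
```

Nested fields, such as the list columns visible in the real output, need extra handling that this sketch leaves out.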

BioSys projects

projects <- biosys_projects()
dplyr::glimpse(projects)
## Observations: 7
## Variables: 13
## $ id                <chr> "1", "2", "3", "4", "7", "6", "5"
## $ name              <chr> "Berkeley Incidental Records", "Kimberley Is...
## $ code              <chr> "BER", "KI", "LCI", "KNC", "PRS", "SBS", "SCTI"
## $ description       <chr> "Incidental mainland records captured as par...
## $ site_count        <int> 41, 163, 208, 27, 104, 0, 64
## $ dataset_count     <int> 3, 9, 13, 10, 3, 2, 8
## $ record_count      <int> 154, 42118, 14561, 1621, 726, 3696, 3730
## $ longitude         <dbl> 127.8207, 125.5086, 126.8049, 128.5613, NA, ...
## $ latitude          <dbl> -14.48498, -14.60075, -15.54562, -16.08126, ...
## $ datum             <chr> "4326", "4326", "4326", "4326", "4326", "432...
## $ timezone          <chr> "Australia/Perth", "Australia/Perth", "Austr...
## $ site_data_package <list> [NULL, NULL, NULL, NULL, NULL, NULL, NULL]
## $ custodians        <list> [2, 2, 2, 2, 2, [2, 9, 12], 2]
DT::datatable(projects)

BioSys datasets

datasets <- biosys_datasets(project_id = 6)
dplyr::glimpse(datasets)
## Observations: 48
## Variables: 7
## $ id           <chr> "101", "107", "118", "30", "45", "99", "108", "11...
## $ record_count <int> 4582, 38, 3307, 33, 414, 426, 95, 42, 23, 1163, 6...
## $ data_package <list> [["tabular-data-package", "BioSys Config", "anim...
## $ name         <chr> "Animal Observations", "Animal Observations", "An...
## $ type         <chr> "species_observation", "species_observation", "sp...
## $ description  <chr> "", "", "", "", "", "", "", "", "", "", "", "", "...
## $ project_id   <int> 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 3, 3, 4, 5, 2...
listviewer::jsonedit(datasets$data_package)
DT::datatable(datasets)
## Warning in instance$preRenderHook(instance): It seems your data is too
## big for client-side DataTables. You may consider server-side processing:
## http://rstudio.github.io/DT/server.html

BioSys records

records <- biosys_records(project_id = 6)
DT::datatable(head(records, n = 100))

Example data

The example data in this package were produced by saving one project's metadata, one of its datasets' metadata, and a subset of its records to the data/ folder.

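# Row 6 of the projects table shown above is the project with id 6.
projects <- biosys_projects()[6, ]
# Keep only the first dataset and the first 100 records of that project.
datasets <- biosys_datasets(project_id = 6)[1, ]
records <- head(biosys_records(project_id = 6), n = 100)
# use_data() moved from devtools to usethis; it writes .rda files to data/.
usethis::use_data(projects, datasets, records)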