library(biosysR)

Setup

The BioSys API is only accessible via HTTP basic authentication with a valid BioSys username and password.

All biosysR functions that call the BioSys API accept optional parameters un and pw, which default to the environment variables BIOSYS_UN and BIOSYS_PW, respectively. Calling a biosysR function with invalid or empty credentials, or relying on the defaults when BIOSYS_UN and BIOSYS_PW are not set, will fail with an informative error message prompting for correct authentication.
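The fallback to environment variables described above can be sketched in base R. The helper name biosys_credentials is hypothetical and not part of biosysR; it only illustrates how explicit arguments might take precedence over BIOSYS_UN and BIOSYS_PW, and how missing values could be caught early.

```r
# Hypothetical helper, not part of biosysR: resolve credentials from
# arguments, falling back to the BIOSYS_UN / BIOSYS_PW environment variables.
biosys_credentials <- function(un = Sys.getenv("BIOSYS_UN"),
                               pw = Sys.getenv("BIOSYS_PW")) {
  # Sys.getenv() returns "" for unset variables, so nzchar() catches
  # both unset and empty values.
  if (!nzchar(un) || !nzchar(pw)) {
    stop("BioSys credentials missing: set BIOSYS_UN and BIOSYS_PW ",
         "or pass un and pw explicitly.", call. = FALSE)
  }
  list(un = un, pw = pw)
}

# Explicit arguments take precedence over the environment.
creds <- biosys_credentials(un = "USERNAME", pw = "PASSWORD")
```

Checking with nzchar() rather than printing the values avoids echoing the password to the console.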

There are three ways to supply these authentication credentials to biosysR functions.

Permanent authentication

To set-and-forget BioSys authentication, add to your ~/.Rprofile:

Sys.setenv(BIOSYS_UN = "USERNAME")
Sys.setenv(BIOSYS_PW = "PASSWORD")

Every new R session will already contain these variables.

Session authentication

To authenticate a single session, set BIOSYS_UN and BIOSYS_PW as environment variables:

Sys.setenv(BIOSYS_UN = "USERNAME")
Sys.setenv(BIOSYS_PW = "PASSWORD")

Restarting the R session will clear these variables.

Per-request authentication

Supply un and pw directly to each biosysR function that calls the API:

biosys_projects
biosys_datasets
biosys_records

projects <- biosys_projects(un="USERNAME", pw="PASSWORD")

Doing so will hand un and pw to biosys_get, which builds the authentication headers and uses them in the request to BioSys.
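How a function like biosys_get might turn un and pw into an authenticated request can be sketched with the httr package. This is an assumption about the mechanism, not the package's actual code: biosys_get_sketch and its url argument are hypothetical, and httr must be installed for the request itself to run.

```r
# Hypothetical stand-in for biosys_get, not the package's actual code.
# Assumes the httr package is installed; `url` is a placeholder argument.
biosys_get_sketch <- function(url,
                              un = Sys.getenv("BIOSYS_UN"),
                              pw = Sys.getenv("BIOSYS_PW")) {
  # Fail before sending anything if credentials are unset or empty.
  if (!nzchar(un) || !nzchar(pw)) {
    stop("BioSys credentials missing.", call. = FALSE)
  }
  # httr::authenticate() defaults to type = "basic" and adds the
  # Authorization header to the request.
  res <- httr::GET(url, httr::authenticate(un, pw))
  # Turn HTTP errors (e.g. 401 for wrong credentials) into R errors.
  httr::stop_for_status(res)
  # Return the response body as a JSON string.
  httr::content(res, as = "text", encoding = "UTF-8")
}
```

Because the credential check runs first, a call with empty credentials stops locally without contacting the server.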

Data flow

Benthic images are analysed and annotated in the software EcoPAAS
EcoPAAS outputs data as Excel
BioSys imports Excel using a config created by Paul’s uploader (tm)
BioSys GUI
BioSys API in the browser using single sign-on (needs an active browser window)
BioSys API in scripts using basic authentication (username and password)
BioSys API documentation

Accessing data from BioSys

Helper functions

Data is retrieved from the BioSys API through an HTTP GET with basic authentication.
The JSON returned from the BioSys API is parsed into a tibble.
All heavy lifting is factored out into helper functions.
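The JSON-to-tibble step can be illustrated as follows. This is a minimal sketch assuming the jsonlite and tibble packages are installed; the payload is made up for illustration and does not reflect the full structure of a BioSys response, and the helper functions in biosysR may parse differently.

```r
# Illustrative JSON payload; real BioSys responses contain more fields.
json <- '[{"id": 1, "name": "Project A", "site_count": 41},
          {"id": 2, "name": "Project B", "site_count": 163}]'

# jsonlite::fromJSON() simplifies an array of objects into a data.frame;
# tibble::as_tibble() converts that data.frame into a tibble.
parsed <- tibble::as_tibble(jsonlite::fromJSON(json))
```

Each JSON object becomes one row, and each key becomes one column.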

BioSys projects

projects <- biosys_projects()
dplyr::glimpse(projects)

## Observations: 7
## Variables: 13
## $ id                <chr> "1", "2", "3", "4", "7", "6", "5"
## $ name              <chr> "Berkeley Incidental Records", "Kimberley Is...
## $ code              <chr> "BER", "KI", "LCI", "KNC", "PRS", "SBS", "SCTI"
## $ description       <chr> "Incidental mainland records captured as par...
## $ site_count        <int> 41, 163, 208, 27, 104, 0, 64
## $ dataset_count     <int> 3, 9, 13, 10, 3, 2, 8
## $ record_count      <int> 154, 42118, 14561, 1621, 726, 3696, 3730
## $ longitude         <dbl> 127.8207, 125.5086, 126.8049, 128.5613, NA, ...
## $ latitude          <dbl> -14.48498, -14.60075, -15.54562, -16.08126, ...
## $ datum             <chr> "4326", "4326", "4326", "4326", "4326", "432...
## $ timezone          <chr> "Australia/Perth", "Australia/Perth", "Austr...
## $ site_data_package <list> [NULL, NULL, NULL, NULL, NULL, NULL, NULL]
## $ custodians        <list> [2, 2, 2, 2, 2, [2, 9, 12], 2]

DT::datatable(projects)

BioSys datasets

datasets <- biosys_datasets(project_id = 6)
dplyr::glimpse(datasets)

## Observations: 48
## Variables: 7
## $ id           <chr> "101", "107", "118", "30", "45", "99", "108", "11...
## $ record_count <int> 4582, 38, 3307, 33, 414, 426, 95, 42, 23, 1163, 6...
## $ data_package <list> [["tabular-data-package", "BioSys Config", "anim...
## $ name         <chr> "Animal Observations", "Animal Observations", "An...
## $ type         <chr> "species_observation", "species_observation", "sp...
## $ description  <chr> "", "", "", "", "", "", "", "", "", "", "", "", "...
## $ project_id   <int> 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 3, 3, 4, 5, 2...

listviewer::jsonedit(datasets$data_package)

DT::datatable(datasets)

## Warning in instance$preRenderHook(instance): It seems your data is too
## big for client-side DataTables. You may consider server-side processing:
## http://rstudio.github.io/DT/server.html
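One possible way to shrink the table before rendering is to drop nested list columns such as data_package, assuming they account for much of the size. The sketch below uses a toy data frame with made-up values in place of the real datasets tibble; whether this is enough to avoid the warning depends on the actual data.

```r
# Toy stand-in for the `datasets` tibble; values are made up for illustration.
datasets_toy <- data.frame(
  id = c("101", "107"),
  name = c("Animal Observations", "Animal Observations"),
  project_id = c(3L, 4L),
  stringsAsFactors = FALSE
)
# Nested list column, analogous to data_package.
datasets_toy$data_package <- list(list(name = "a"), list(name = "b"))

# Keep only the flat (non-list) columns for display.
is_flat <- !vapply(datasets_toy, is.list, logical(1))
datasets_flat <- datasets_toy[, is_flat, drop = FALSE]
```

The flattened table could then be passed to DT::datatable(datasets_flat), while the nested column remains available in the original object for inspection with listviewer::jsonedit.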

BioSys records

records <- biosys_records(project_id = 6)
DT::datatable(head(records, n = 100))

Example data

The example data in this project were produced by saving one project’s project metadata, dataset metadata, and a subset of records to the data/ folder.

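# Keep one project, one of its datasets, and the first 100 records.
projects <- biosys_projects()[6, ]
datasets <- biosys_datasets(project_id = 6)[1, ]
records <- head(biosys_records(project_id = 6), n = 100)
# use_data() moved from devtools to usethis; it saves the objects to data/.
usethis::use_data(projects, datasets, records)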

BioSys

Florian Mayer

2017-09-25