The DataSHIELD Interface (DSI) defines a set of S4 classes and generic methods that can be implemented to access a data repository supporting the DataSHIELD infrastructure: controlled R commands are executed on the server side, guaranteeing that only non-disclosive information is returned to the client side.
Learn more about DataSHIELD.
The DSI classes are:
- DSObject: a common base class for all DSI classes,
- DSDriver: drives the creation of a connection object,
- DSConnection: allows interaction with the remote server; DataSHIELD operations such as aggregation and assignment return a result object; DataSHIELD setup status checks can be performed (dataset access, configuration comparison),
- DSResult: wraps access to the result, which can be fetched either synchronously or asynchronously depending on the capabilities of the data repository server.
All classes are virtual: they cannot be instantiated directly and instead must be subclassed. See DSOpal for a reference implementation of DSI based on the Opal data warehouse. See also DSLite for a server-less implementation of DSI for local datasets.
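The virtual-class pattern described above can be sketched in base R. This is a minimal illustration, not DSI's own code: the names BaseObject, BaseDriver, DemoDriver and describe are hypothetical stand-ins chosen so the example runs without DSI installed.

```r
library(methods)

# Hypothetical stand-ins mirroring the DSI layout; these are NOT the DSI classes.
setClass("BaseObject", representation("VIRTUAL"))
setClass("BaseDriver", contains = c("BaseObject", "VIRTUAL"))

# A generic that every concrete driver is expected to implement.
setGeneric("describe", function(x, ...) standardGeneric("describe"))

# A concrete subclass: this one can be instantiated.
setClass("DemoDriver", contains = "BaseDriver",
         representation(name = "character"))
setMethod("describe", "DemoDriver", function(x, ...) paste("driver:", x@name))

# Instantiating the virtual class fails; record whether an error was raised.
virtual_error <- tryCatch({
  new("BaseDriver")
  FALSE
}, error = function(e) TRUE)

# Instantiating the concrete subclass works, and the generic dispatches on it.
drv <- new("DemoDriver", name = "demo")
msg <- describe(drv)
```

A DSI implementation such as DSOpal or DSLite follows the same idea: it subclasses the DSI virtual classes and supplies methods for the DSI generics.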
These S4 classes and generic methods are meant to be used for implementing a connection to a DataSHIELD-aware data repository.
In addition to these S4 classes, DSI provides functions to handle a list of remote data repository servers:
- datashield.login and datashield.logout use the DSDriver paradigm to create DSConnection objects to the data repositories,
- datashield.aggregate and datashield.assign perform typical DataSHIELD operations on DSConnection objects; their results are fetched through DSResult objects,
- datashield.connections, datashield.connections_default and datashield.connections_find manage the list of DSConnection objects that will be discovered and used by the client-side analytic functions,
- datashield.errors reports the last R errors that may have occurred after a datashield.assign or datashield.aggregate call,
- Other data management functions are provided by the DSConnection objects:
  - datashield.workspaces, datashield.workspace_save, datashield.workspace_restore and datashield.workspace_rm manage R images of the remote DataSHIELD sessions (to speed up data analysis sessions),
  - datashield.symbols and datashield.symbol_rm offer minimal management of the R symbols living in the remote DataSHIELD sessions,
  - datashield.tables and datashield.table_status list the tables and their accessibility across a set of data repositories,
  - datashield.resources and datashield.resource_status list the resources and their accessibility across a set of data repositories,
  - datashield.pkg_status, datashield.method_status and datashield.methods are utility functions to explore the DataSHIELD setup across a set of data repositories,
  - datashield.profiles lists the DataSHIELD profiles that can be selected at login time.
These datashield.* functions are meant to be used by DataSHIELD package developers and users.
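A typical session built from the functions listed above might look like the following sketch. It assumes the DSI and DSOpal packages are installed and that an Opal server is reachable. The wrapper function run_datashield_demo, the server name "study1" and the symbol "D" are illustrative choices, and the URL, credentials, table name and aggregate expression are left to the caller because they depend on the server configuration.

```r
# Sketch only: nothing runs until the function is called with real values.
run_datashield_demo <- function(url, user, password, table, expr) {
  library(DSI)
  library(DSOpal)  # provides the "OpalDriver" implementation of DSDriver

  # Describe the server(s) to connect to.
  builder <- newDSLoginBuilder()
  builder$append(server = "study1", url = url, user = user,
                 password = password, table = table, driver = "OpalDriver")
  logindata <- builder$build()

  # Create the DSConnection objects and assign the table to symbol "D".
  conns <- datashield.login(logins = logindata, assign = TRUE, symbol = "D")
  on.exit(datashield.logout(conns), add = TRUE)

  # List the R symbols living in the remote sessions.
  print(datashield.symbols(conns))

  # Run a server-side aggregation; on failure, report the last server errors.
  tryCatch(
    datashield.aggregate(conns, expr),
    error = function(e) {
      print(datashield.errors())
      stop(e)
    }
  )
}
```

The aggregate expression must be one that the server's DataSHIELD configuration allows; datashield.methods can be used to explore what is available.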
Some options can be set to modify the behavior of the DSI:
- datashield.env is the R environment in which the DSConnection object list is looked up. Default value is the Global Environment: globalenv().
- datashield.progress is a logical that enables the display of progress bars. Default value is TRUE.
- datashield.progress.clear is a logical that makes the progress bar disappear once it has completed. Default value is FALSE.
- datashield.errors.stop is a logical that alters the error handling behavior: if TRUE an error is raised when at least one server has failed, otherwise a warning message is issued. Default value is TRUE.
- datashield.errors.print is a logical that controls error printing in the console: if TRUE the errors are automatically printed in rich text, otherwise a subsequent call to datashield.errors() is required to get the details of the errors. Default value is FALSE.
- datashield.polling.sleep.0 is the time in seconds to wait before checking async call completion, before ~1 second. Default value is 0.05 seconds (50 milliseconds).
- datashield.polling.sleep.1 is the base time in seconds to wait before checking async call completion, after ~1 second. Default value is 1 second.
- datashield.polling.sleep.10 is the time in seconds to wait before checking async call completion, after ~10 seconds. Default value is 2 times the base time (2 seconds).
- datashield.polling.sleep.60 is the time in seconds to wait before checking async call completion, after ~1 minute. Default value is 10 times the base time (10 seconds).
- datashield.polling.sleep.600 is the time in seconds to wait before checking async call completion, after ~10 minutes. Default value is 60 times the base time (1 minute).
- datashield.polling.sleep.3600 is the time in seconds to wait before checking async call completion, after ~1 hour. Default value is 600 times the base time (10 minutes).
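These options are set with base R's options(). The sketch below changes a few of them and then illustrates how the polling intervals described above relate to the base time. The helper polling_sleep is hypothetical and written only to mirror the description in this list; it is not DSI's internal code, and the exact thresholds DSI applies may differ.

```r
# Change a few DSI options for the current session, keeping the old values.
old <- options(
  datashield.progress = FALSE,     # hide progress bars
  datashield.errors.stop = FALSE,  # warn instead of raising an error on server failure
  datashield.polling.sleep.1 = 2   # base polling interval, in seconds
)

# Hypothetical helper: the wait (in seconds) between completion checks,
# given the time elapsed since the async call was submitted.
polling_sleep <- function(elapsed) {
  base <- getOption("datashield.polling.sleep.1", 1)
  if (elapsed < 1) {
    getOption("datashield.polling.sleep.0", 0.05)
  } else if (elapsed < 10) {
    base
  } else if (elapsed < 60) {
    getOption("datashield.polling.sleep.10", 2 * base)
  } else if (elapsed < 600) {
    getOption("datashield.polling.sleep.60", 10 * base)
  } else if (elapsed < 3600) {
    getOption("datashield.polling.sleep.600", 60 * base)
  } else {
    getOption("datashield.polling.sleep.3600", 600 * base)
  }
}

elapsed_times <- c(0.5, 5, 30, 120, 900, 4000)

# Intervals with the base time set to 2 seconds.
with_base_2 <- sapply(elapsed_times, polling_sleep)

# Restore the previous option values, then recompute.
options(old)
with_restored <- sapply(elapsed_times, polling_sleep)
```

Saving the return value of options() and passing it back restores the previous settings, which is useful when an option should only apply to a single analysis step.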