Run a list of predictions
runPlpAnalyses(
  connectionDetails,
  cdmDatabaseSchema,
  cdmDatabaseName,
  oracleTempSchema = cdmDatabaseSchema,
  cohortDatabaseSchema = cdmDatabaseSchema,
  cohortTable = "cohort",
  outcomeDatabaseSchema = cdmDatabaseSchema,
  outcomeTable = "cohort",
  cdmVersion = 5,
  onlyFetchData = FALSE,
  outputFolder = "./PlpOutput",
  modelAnalysisList,
  cohortIds,
  cohortNames,
  outcomeIds,
  outcomeNames,
  washoutPeriod = 0,
  maxSampleSize = NULL,
  minCovariateFraction = 0,
  normalizeData = T,
  testSplit = "person",
  testFraction = 0.25,
  splitSeed = NULL,
  nfold = 3,
  verbosity = "INFO",
  settings = NULL
)
An R object of type connectionDetails, created using the function createConnectionDetails in the DatabaseConnector package.
The name of the database schema that contains the OMOP CDM instance. Requires read permissions to this database. On SQL Server, this should specify both the database and the schema, for example 'cdm_instance.dbo'.
A string with a shareable name of the database. This name will be shown to OHDSI researchers if the results get transported.
For Oracle only: the name of the database schema where you want all temporary tables to be managed. Requires create/insert permissions to this database.
The name of the database schema that is the location where the target cohorts are available. Requires read permissions to this database.
The name of the table that contains the target cohorts. The table is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
The name of the database schema that is the location where the data used to define the outcome cohorts is available. Requires read permissions to this database.
The name of the table that contains the outcome cohorts. The table is expected to have the format of the COHORT table: COHORT_DEFINITION_ID, SUBJECT_ID, COHORT_START_DATE, COHORT_END_DATE.
Defines the OMOP CDM version used. Currently supports "4" and "5".
If TRUE, only fetches and saves the data object to the output folder without running the analysis.
Name of the folder where all the outputs will be written.
A list of objects of type modelSettings, as created using the createPlpModelSettings function.
A vector of cohortIds that specify all the target cohorts
A vector of cohortNames corresponding to the cohortIds
A vector of outcomeIds that specify all the outcome cohorts
A vector of outcomeNames corresponding to the outcomeIds
Minimum number of prior observation days
Max number of target people to sample from to develop models
Any covariate with an incidence less than this value is ignored
Whether to normalize the covariates
How to split the data into test and train sets ("time" or "person")
Fraction of data to use as test set
The seed used for the randomization into test/train
Number of folds used to do cross validation
The logging level
Specifies the target (T), outcome (O), population, covariate and model settings
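Several of the arguments above have to agree with each other, for example cohortNames with cohortIds and outcomeNames with outcomeIds. The following is a minimal sketch of a pre-flight check in base R. The helper checkPlpInputs is hypothetical and not part of PatientLevelPrediction, and the range check on testFraction is an assumption rather than documented behaviour.

```r
# Hypothetical helper (not part of any package): collects problems with the
# inputs before a long-running call is started.
checkPlpInputs <- function(cohortIds, cohortNames, outcomeIds, outcomeNames,
                           testSplit = "person", testFraction = 0.25) {
  problems <- character(0)
  # Names are documented as "corresponding to" the IDs, so lengths must match.
  if (length(cohortIds) != length(cohortNames)) {
    problems <- c(problems, "cohortNames must have one entry per cohortId")
  }
  if (length(outcomeIds) != length(outcomeNames)) {
    problems <- c(problems, "outcomeNames must have one entry per outcomeId")
  }
  # The documentation lists two split types: time or person.
  if (!testSplit %in% c("person", "time")) {
    problems <- c(problems, "testSplit must be 'person' or 'time'")
  }
  # Assumption: a usable test fraction lies strictly between 0 and 1.
  if (testFraction <= 0 || testFraction >= 1) {
    problems <- c(problems, "testFraction must lie strictly between 0 and 1")
  }
  problems
}

# Usage with made-up IDs and names.
ok <- checkPlpInputs(c(1, 2), c("target A", "target B"), 10, "outcome X")
bad <- checkPlpInputs(c(1, 2), "target A", 10, "outcome X",
                      testSplit = "random", testFraction = 1.5)
print(length(ok))
print(bad)
```

Returning a character vector of problems, rather than stopping at the first one, lets all inconsistencies be reported in a single pass.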
A data frame with the following columns:
analysisId | The unique identifier for a set of analysis choices.
cohortId | The ID of the target cohort population.
outcomeId | The ID of the outcome cohort.
plpDataFolder | The location where the plpData was saved.
studyPopFile | The name of the file containing the study population.
evaluationFolder | The name of the file containing the evaluation, saved as a CSV.
modelFolder | The name of the file containing the developed model.
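As an illustration of working with the returned data frame, the sketch below builds a mock with three of the documented columns and selects the row for one target and outcome pair. All values are made up for the example. A real result would come from runPlpAnalyses itself and would also carry the folder and file columns.

```r
# Mock of the returned data frame; the IDs are placeholders.
results <- data.frame(
  analysisId = 1:4,
  cohortId = c(1, 1, 2, 2),
  outcomeId = c(10, 11, 10, 11)
)

# Select the analyses for target cohort 1 and outcome 11.
selected <- results[results$cohortId == 1 & results$outcomeId == 11, ]
print(selected$analysisId)
```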
Run a list of predictions for the target cohorts and outcomes of interest. This function will run all specified predictions, meaning that the total number of outcome models is `length(cohortIds) * length(outcomeIds) * length(modelAnalysisList)`.
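The count given by the formula can be checked by enumerating the combinations. The sketch below uses base R only. The IDs are placeholders, and modelAnalysisList is treated as a plain list with one element per model setting, which is an assumption made for illustration.

```r
# Hypothetical inputs: two target cohorts, three outcomes, two model settings.
cohortIds <- c(1, 2)
outcomeIds <- c(10, 11, 12)
modelAnalysisList <- list("settingA", "settingB")  # stand-ins for real settings

# Each (target cohort, outcome, model setting) combination yields one model.
combinations <- expand.grid(
  cohortId = cohortIds,
  outcomeId = outcomeIds,
  modelIndex = seq_along(modelAnalysisList)
)

nModels <- length(cohortIds) * length(outcomeIds) * length(modelAnalysisList)
print(nModels)             # 2 * 3 * 2 = 12
print(nrow(combinations))  # one row per combination, also 12
```

Because the count is a product, adding a single outcome or model setting multiplies the amount of work, so it is worth computing this number before starting a run.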