multiassoc()
Takes a data frame with the distribution(s) of PGS and Phenotype(s),
and a table of the associations to test from this data frame, and
returns a data frame showing the association results.
multiassoc(
df = NULL,
assoc_table = NULL,
scale = TRUE,
covar_col = NA,
verbose = TRUE,
log = "",
parallel = FALSE,
num_cores = NA
)
Returns a data frame showing the association of the PGS(s) with the Phenotype(s), with the following columns:
PGS: the name of the PGS
Phenotype: the name of the Phenotype
Phenotype_type: either 'Continuous', 'Ordered Categorical', 'Categorical' or 'Cases/Controls'
Stat_method: the statistical method used. The association function detects the phenotype type and selects the matching analysis: either 'Linear regression', 'Binary logistic regression', 'Ordinal logistic regression' or 'Multinomial logistic regression'
Covar: lists all the covariates used for this association
N_cases: if Phenotype_type is Cases/Controls, gives the number of cases
N_controls: if Phenotype_type is Cases/Controls, gives the number of controls
N: the number of individuals/samples
Effect: if Phenotype_type is Continuous, the Beta coefficient of the linear regression; otherwise, the OR (odds ratio) of the logistic regression
SE: standard error of the related Effect (Beta or OR)
lower_CI: lower bound of the confidence interval of the related Effect (Beta or OR)
upper_CI: upper bound of the confidence interval of the related Effect (Beta or OR)
P_value: associated P-value
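The columns above can be illustrated with a minimal sketch in base R. This is not the implementation used by multiassoc(); it only shows, on simulated data, how an Effect, SE, confidence interval and P-value are typically obtained for a Continuous phenotype (Beta from lm) and a Cases/Controls phenotype (OR from glm). The data frame toy, its columns, the 95% confidence level and the Wald interval for the OR are all assumptions made for the illustration.

```r
# Hypothetical simulated data; none of these names come from the package.
set.seed(1)
n <- 200
toy <- data.frame(
  pgs = rnorm(n),
  age = rnorm(n, mean = 50, sd = 10)
)
toy$pheno <- 0.3 * toy$pgs + 0.01 * toy$age + rnorm(n)
toy$case  <- rbinom(n, 1, plogis(0.5 * toy$pgs))

# Standardise the PGS so the effect is expressed per standard deviation.
toy$pgs <- as.numeric(scale(toy$pgs))

# Continuous phenotype: linear regression, Effect is the Beta coefficient.
fit_lin <- lm(pheno ~ pgs + age, data = toy)
coefs   <- summary(fit_lin)$coefficients
ci      <- confint(fit_lin, "pgs")  # 95% level is confint()'s default (assumed here)

row_lin <- data.frame(
  N        = nobs(fit_lin),
  Effect   = coefs["pgs", "Estimate"],
  SE       = coefs["pgs", "Std. Error"],
  lower_CI = ci[1, 1],
  upper_CI = ci[1, 2],
  P_value  = coefs["pgs", "Pr(>|t|)"]
)
print(row_lin)

# Cases/Controls phenotype: binary logistic regression, Effect is the OR.
fit_bin <- glm(case ~ pgs + age, data = toy, family = binomial())
or      <- unname(exp(coef(fit_bin)["pgs"]))
or_ci   <- exp(confint.default(fit_bin, "pgs"))  # Wald interval on the OR scale (assumed)
print(c(OR = or, lower_CI = or_ci[1, 1], upper_CI = or_ci[1, 2]))
```

Exponentiating the logistic coefficient and its interval bounds puts them on the odds-ratio scale, which is why an OR of 1 (rather than 0) corresponds to no association.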
df: a data frame with one individual per row, and at least the following columns:
one ID column,
one PGS column, with continuous numeric values following a normal distribution,
one Phenotype column, which can be numeric (Continuous Phenotype), or character, boolean or factor (Discrete Phenotype)
assoc_table: a data frame or matrix specifying the associations to test from df, with 2 columns: PGS and Phenotype (in this order)
scale: a boolean (TRUE by default) specifying whether the PGS should be scaled before testing
covar_col: a character vector specifying the covariate column names (optional)
verbose: a boolean (TRUE by default) specifying whether to write messages to the console/log
log: a connection, or a character string naming the file to print to. If "" (the default), it prints to the standard output connection, which is the console unless redirected by sink. If parallel = TRUE, the log will be incomplete
parallel: a boolean. If TRUE, multiassoc() parallelises the association analysis to run it faster (no log is available with this option, and it does not work on Windows machines). If FALSE (the default), the association analysis is not parallelised (useful for debugging)
num_cores: an integer. If parallel = TRUE, multiassoc() parallelises the association analysis using num_cores as the number of cores. If nothing is provided, it detects the number of cores of the machine and uses that number minus one
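The core-count fallback described above can be sketched with the base parallel package. This is only an illustration under assumptions: how multiassoc() parallelises internally is not shown in this page, and the use of mclapply here (fork-based, so limited to one core on Windows) is a guess that happens to match the stated Windows limitation. The variable names are hypothetical.

```r
library(parallel)

num_cores <- NA  # mimics the documented default

if (is.na(num_cores)) {
  detected <- detectCores()
  # detectCores() may return NA on some platforms; fall back to a single core.
  num_cores <- if (is.na(detected)) 1L else max(1L, detected - 1L)
}

# Fork-based parallelism is unavailable on Windows, so use one core there.
cores_to_use <- if (.Platform$OS.type == "windows") 1L else num_cores

# Toy workload standing in for one association test per row of assoc_table.
squares <- unlist(mclapply(1:4, function(i) i^2, mc.cores = cores_to_use))
print(squares)
```

Leaving one core free is a common choice so the machine stays responsive while the analysis runs.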
assoc_table <- expand.grid(
c("t2d_PGS", "ldl_PGS"),
c("ethnicity","brc","t2d","log_ldl","sbp_cat")
)
results <- multiassoc(
df = comorbidData,
assoc_table = assoc_table,
covar_col = c("age", "sex", "gen_array"),
parallel = FALSE,
verbose = FALSE
)
print(results)