estimate_prior: Estimate prior

Description

Estimate prior for Bayesian brain mapping based on fMRI data

Usage

estimate_prior(
  BOLD,
  BOLD2 = NULL,
  template,
  mask = NULL,
  inds = NULL,
  scale = c("local", "global", "none"),
  scale_sm_surfL = NULL,
  scale_sm_surfR = NULL,
  scale_sm_FWHM = 2,
  nuisance = NULL,
  scrub = NULL,
  drop_first = 0,
  hpf = 0,
  TR = NULL,
  GSR = FALSE,
  Q2 = 0,
  Q2_max = NULL,
  covariates = NULL,
  brainstructures = "all",
  resamp_res = NULL,
  keep_S = FALSE,
  keep_FC = FALSE,
  FC = TRUE,
  FC_nPivots = 100,
  FC_nSamp = 50000,
  varTol = 1e-06,
  maskTol = 0.1,
  missingTol = 0.1,
  usePar = FALSE,
  wb_path = NULL,
  verbose = TRUE
)
estimate_prior.cifti(
  BOLD,
  BOLD2 = NULL,
  template,
  inds = NULL,
  scale = c("local", "global", "none"),
  scale_sm_surfL = NULL,
  scale_sm_surfR = NULL,
  scale_sm_FWHM = 2,
  nuisance = NULL,
  scrub = NULL,
  drop_first = 0,
  hpf = 0,
  TR = NULL,
  GSR = FALSE,
  Q2 = 0,
  Q2_max = NULL,
  brainstructures = "all",
  resamp_res = resamp_res,
  keep_S = FALSE,
  keep_FC = FALSE,
  FC = TRUE,
  varTol = 1e-06,
  maskTol = 0.1,
  missingTol = 0.1,
  usePar = FALSE,
  wb_path = NULL,
  verbose = TRUE
)
estimate_prior.gifti(
  BOLD,
  BOLD2 = NULL,
  template,
  inds = NULL,
  scale = c("local", "global", "none"),
  scale_sm_surfL = NULL,
  scale_sm_surfR = NULL,
  scale_sm_FWHM = 2,
  nuisance = NULL,
  scrub = NULL,
  drop_first = 0,
  hpf = 0,
  TR = NULL,
  GSR = FALSE,
  Q2 = 0,
  Q2_max = NULL,
  brainstructures = "all",
  keep_S = FALSE,
  keep_FC = FALSE,
  FC = TRUE,
  varTol = 1e-06,
  maskTol = 0.1,
  missingTol = 0.1,
  usePar = FALSE,
  wb_path = NULL,
  verbose = TRUE
)
estimate_prior.nifti(
  BOLD,
  BOLD2 = NULL,
  template,
  inds = NULL,
  scale = c("local", "global", "none"),
  nuisance = NULL,
  scrub = NULL,
  drop_first = 0,
  hpf = 0,
  TR = NULL,
  GSR = FALSE,
  Q2 = 0,
  Q2_max = NULL,
  mask = NULL,
  keep_S = FALSE,
  keep_FC = FALSE,
  FC = TRUE,
  varTol = 1e-06,
  maskTol = 0.1,
  missingTol = 0.1,
  usePar = FALSE,
  wb_path = NULL,
  verbose = TRUE
)

Value

A list: the prior and var_decomp with entries in matrix format; the mask of locations without prior values due to too many low variance or missing values; the function params such as the type of scaling and detrending performed; the dat_struct which can be used to convert prior and var_decomp to "xifti" or "nifti" objects if the BOLD format was CIFTI or NIFTI data; and DR results if isTRUE(keep_S) and/or isTRUE(keep_FC).

Use summary to print a description of the prior results, and for CIFTI-format data use plot to plot the prior mean and variance estimates. Use export_prior to save the priors to individual RDS, CIFTI, or NIFTI files (depending on the BOLD format).

Arguments

BOLD, BOLD2

Vector of subject-level fMRI data in one of the following formats: CIFTI file paths, "xifti" objects, GIFTI file paths, "gifti" objects, NIFTI file paths, "nifti" objects, or \(V \times T\) numeric matrices, where \(V\) is the number of data locations and \(T\) is the number of timepoints.

If BOLD2 is provided it must be in the same format as BOLD; BOLD will be the test data and BOLD2 will be the retest data. BOLD2 should be the same length as BOLD and have the same subjects in the same order. If BOLD2 is not provided, BOLD will be split in half; the first half will be the test data and the second half will be the retest data.

template

Group-level template: either a group ICA (GICA), or a parcellation.

A GICA should be provided as a format compatible with BOLD, or a (vectorized) numeric matrix (\(V \times Q\)). Its columns will be centered.

A parcellation must be in CIFTI format for use with CIFTI BOLD data (other formats to be implemented in the future). The parcellation should have the same locations as the BOLD and one column, with integer values indicating the parcel to which each location belongs to. Each parcel is modeled as a brain map; instead of the first step of dual regression, the medial timecourse of each parcel is used.

mask

Required if BOLD are NIFTI file paths or "nifti" objects, and optional for other formats. For NIFTI data, this is a logical array of the same spatial dimensions as the fMRI data, with TRUE corresponding to in-mask voxels. For other data, this is a logical vector with the same length as the number of locations in template, with TRUE corresponding to in-mask locations.

inds

Numeric indices of the networks in template to include in the prior. If NULL, use all of the original networks (default).

If inds is provided, the networks not included will be removed after calculating dual regression, not before. This is because removing the networks prior to dual regression would leave unmodelled signals in the data, which could bias the priors.

scale

"local" (default), "global", or "none". Local scaling will divide each data location's time series by its estimated standard deviation. Global scaling will divide the entire data matrix by the mean image standard deviation (mean(sqrt(rowVars(BOLD)))).

scale_sm_surfL, scale_sm_surfR, scale_sm_FWHM

Only applies if scale=="local" and BOLD represents surface data (CIFTI or GIFTI). To smooth the standard deviation estimates used for local scaling, provide the surface geometries along which to smooth as GIFTI geometry files or "surf" objects, as well as the smoothing FWHM (default: 2).

If scale_sm_FWHM==0, no smoothing of the local standard deviation estimates will be performed.

If scale_sm_FWHM>0 but scale_sm_surfL and scale_sm_surfR are not provided, the default inflated surfaces from the HCP will be used.

To create a "surf" object from data, see make_surf. The surfaces must be in the same resolution as the BOLD data.

nuisance

(Optional) Nuisance matrices to regress from the BOLD data. Should be a list of matrices, with time along the rows and nuisance signals along the columns, where each entry corresponds to a BOLD session; or, if BOLD2 is provided, should be a length-2 list of such lists with the first sublist corresponding to BOLD and the second sublist corresponding to BOLD2. If NULL, do not remove any nuisance signals.

Nuisance regression is performed in a simultaneous regression with any spike regressors from scrub and DCT bases from hpf.

Note that the nuisance matrices should be provided with timepoints matching the original BOLD and BOLD2 irregardless of drop_first. Nuisance matrices will be truncated automatically if drop_first>0.

scrub

(Optional) Numeric vectors of integers giving the indices of volumes to scrub from the BOLD data. (List the volumes to remove, not the ones to keep.) Should be a list of such vectors, where each entry corresponds to a BOLD session; or, if BOLD2 is provided, should be a length-2 list of such lists with the first sublist corresponding to BOLD and the second sublist corresponding to BOLD2. If NULL (default), do not scrub.

Scrubbing is performed within a nuisance regression by adding a spike regressor to the nuisance design matrix for each volume to scrub.

Note that indices are counted beginning with the first index in the BOLD session irregardless of drop_first. The indices will be adjusted automatically if drop_first>0.

drop_first

(Optional) Number of volumes to drop from the start of each BOLD session. Default: 0.

hpf

The frequency at which to apply a highpass filter to the data during pre-processing, in Hertz. Default: 0 Hz (disabled). If the data has not already been highpass filtered, a recommended filter value is .01 Hz. The highpass filter serves to detrend the data, since low-frequency variance is associated with noise. Highpass filtering is accomplished by nuisance regression of discrete cosine transform (DCT) bases.

Note the TR argument is required for highpass filtering. If TR is not provided, hpf will be ignored.

TR

The temporal resolution of the data, i.e. the time between volumes, in seconds. TR is required for detrending with hpf.

GSR

Center BOLD across columns (each image)? This is equivalent to performing global signal regression. Default: FALSE.

Q2, Q2_max

Obtain dual regression estimates after denoising? Denoising is based on modeling and removing nuisance ICs. It may result in a cleaner estimate for smaller datasets, but it may be unnecessary (and time-consuming) for larger datasets.

Set Q2 to control denoising: use a positive integer to specify the number of nuisance ICs, NULL to have the number of nuisance ICs estimated by PESEL, or zero (default) to skip denoising.

If is.null(Q2), use Q2_max to specify the maximum number of nuisance ICs that should be estimated by PESEL. Q2_max must be less than \(T * .75 - Q\) where \(T\) is the minimum number of timepoints in each fMRI scan and \(Q\) is the number of networks in the template. If NULL (default), Q2_max will be set to \(T * .50 - Q\), rounded.

covariates

Subjects by variables numeric matrix of covariates to take into account for model estimation. Column names should give the name of each variable. Default: NULL (no covariates). NOTE: Not implemented yet.

brainstructures

Only applies if the entries of BOLD are CIFTI file paths. This is a character vector indicating which brain structure(s) to obtain: "left" (left cortical surface), "right" (right cortical surface) and/or "subcortical" (subcortical and cerebellar gray matter). Can also be "all" (obtain all three brain structures). Default: c("all").

resamp_res

Only applies if the entries of BOLD are CIFTI file paths. Resample the data upon reading it in? Default: NULL (no resampling).

keep_S

Keep the DR estimates of S? If FALSE (default), do not save the DR estimates and only return the priors. If TRUE, the DR estimates of S are returned too. If a single file path, save the DR estimates as an RDS file at that location rather than returning them.

keep_FC

Keep the DR estimates of the FC cor(A)? If FALSE (default), do not save the DR estimates and only return the priors. If TRUE, the DR estimates of cor(A) and its Cholesky factor are returned too. If a single file path, save the DR estimates as an RDS file at that location rather than returning them.

FC

Include the functional connectivity prior? Default: TRUE.

FC_nPivots

Number of pivots to use in Cholesky-based FC prior estimation. Set to zero to skip Cholesky-based FC prior estimation. Default: 100.

FC_nSamp

Number of FC matrix samples to generate across all pivots. This should be a multiple of FC_nPivots.

varTol

Tolerance for variance of each data location. For each scan, locations which do not meet this threshold are masked out of the analysis. Default: 1e-6. Variance is calculated on the original data, before any normalization.

maskTol

For computing the dual regression results for each subject: tolerance for number of locations masked out due to low variance or missing values. If more than this many locations are masked out, a subject is skipped without calculating dual regression. maskTol can be specified either as a proportion of the number of locations (between zero and one), or as a number of locations (integers greater than one). Default: .1, i.e. up to 10 percent of locations can be masked out.

If BOLD2 is provided, masks are calculated for both scans and then the intersection of the masks is used, for each subject.

missingTol

For computing the variance decomposition across all subjects: tolerance for number of subjects masked out due to low variance or missing values at a given location. If more than this many subjects are masked out, the location's value will be NA in the priors. missingTol can be specified either as a proportion of the number of locations (between zero and one), or as a number of locations (integers greater than one). Default: .1, i.e. up to 10 percent of subjects can be masked out at a given location.

usePar, wb_path

Parallelize the DR computations over subjects? Default: FALSE. Can be the number of cores to use or TRUE, which will use the number on the PC minus two. If the input data is in CIFTI format, the wb_path must also be provided.

verbose

Display progress updates? Default: TRUE.

Details

All fMRI data (entries in BOLD and BOLD2, and template) must be in the same spatial resolution.

Examples

Run this code

nT <- 21
nV <- 140
nQ <- 6
mU <- matrix(rnorm(nV*nQ), nrow=nV)
mS <- mU %*% diag(seq(nQ, 1)) %*% matrix(rnorm(nQ*nT), nrow=nQ)
BOLD <- list(B1=mS, B2=mS, B3=mS)
BOLD <- lapply(BOLD, function(x){x + rnorm(nV*nT, sd=.05)})
template <- mU
estimate_prior(BOLD=BOLD, template=mU, FC_nSamp=2000, usePar=FALSE)

if (FALSE) {
 estimate_prior(
   run1_cifti_fnames, run2_cifti_fnames,
   gICA_cifti_fname, brainstructures="all",
   scale="global", TR=0.71, Q2=NULL, varTol=10,
   usePar=FALSE
 )
}

Run the code above in your browser using DataLab