This function performs customized fdr analyses tailored to each individual case.
g2l.proc(X, z, X.target = NULL, z.target = NULL, m = c(4, 6), alpha = 0.1,
         nbag = NULL, nsample = length(z), lp.reg.method = "lm",
         null.scale = "QQ", approx.method = "direct", ngrid = 2000,
         centering = TRUE, coef.smooth = "BIC", fdr.method = "locfdr",
         plot = TRUE, rel.null = "custom", locfdr.df = 10,
         fdr.th.fixed = NULL, parallel = FALSE, ...)
X: An \(n\)-by-\(d\) matrix of covariate values.
z: A length \(n\) vector containing observations of z values.
X.target: A \(k\)-by-\(d\) matrix providing \(k\) sets of covariates for target cases to investigate. Set to NULL to investigate all cases and provide global inference results.
z.target: A vector of length \(k\), providing the target \(z\) values to investigate.
m: An ordered pair. The first number indicates how many LP-nonparametric basis functions to construct for each \(X\); the second indicates how many to construct for \(z\). Default: m = c(4, 6).
alpha: Confidence level for determining signals. Default: 0.1.
nbag: Number of bags of parametric bootstrapped samples to use for each target case. For each bag, a new set of relevance samples is generated for analysis, and the resulting fdr curves are aggregated by taking their mean. Set to NULL to disable.
nsample: Number of relevance samples generated for each case. The default is the size of the input z-statistic.
lp.reg.method: Method for estimating the relevance function and its conditional LP-Fourier coefficients. Three options are currently supported: "lm" (built in, with subset selection), "glmnet", and "knn".
null.scale: Method for estimating the null standard deviation from the LASER samples. Available options: "IQR", "QQ", and "locfdr".
approx.method: Method used to approximate the customized fdr curve; default is "direct". When set to "indirect", the customized fdr is computed by modifying the pooled fdr using the relevant density function.
ngrid: Number of grid points to use for computing the customized fdr curve.
centering: Whether to perform regression adjustment to center the data; default is TRUE.
coef.smooth: Method to use for LP coefficient smoothing ("AIC" or "BIC"). Uses "BIC" by default.
fdr.method: Method for controlling false discoveries (either "locfdr" or "BH"); default is "locfdr".
plot: Whether to include plots in the results; default is TRUE.
rel.null: How the relevant null changes with x: "custom" allows it to vary with x, and "th" keeps it fixed.
locfdr.df: Degrees of freedom to use for locfdr().
fdr.th.fixed: Fixed fdr threshold to use for finding signals. The default, NULL, finds different thresholds for different cases.
parallel: Whether to use parallel computing to obtain the relevance samples, mainly useful for very large nsample; default is FALSE.
...: Extra parameters to pass to other functions. Currently only the arguments for knn() are supported.
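The "IQR" and "QQ" options of null.scale name two common robust ways of estimating a null standard deviation. The following base-R sketch illustrates the general idea only; it is not the package's internal code. The helper names scale_iqr and scale_qq, the simulated data, and the central quantile range (0.25 to 0.75) are all assumptions made for illustration.

```r
# Illustrative only: base-R stand-ins, not LPRelevance internals.
set.seed(1)
# Mostly null N(0, 1.5^2) values plus a small cluster of large signals
z <- c(rnorm(950, mean = 0, sd = 1.5), rnorm(50, mean = 6, sd = 1))

# IQR-based scale: for a normal distribution,
# IQR = sd * (qnorm(0.75) - qnorm(0.25))
scale_iqr <- function(z) {
  IQR(z) / (qnorm(0.75) - qnorm(0.25))
}

# QQ-based scale: slope of the sample quantiles regressed on standard normal
# quantiles, fitted over the central part of the data (assumed range)
scale_qq <- function(z, probs = seq(0.25, 0.75, by = 0.01)) {
  sample_q <- quantile(z, probs = probs, names = FALSE)
  normal_q <- qnorm(probs)
  unname(coef(lm(sample_q ~ normal_q))[2])
}

sd_raw <- sd(z)         # inflated by the signal cluster
sd_iqr <- scale_iqr(z)  # robust: driven by the central bulk of the data
sd_qq  <- scale_qq(z)   # robust: driven by the central bulk of the data
```

Both robust estimates rely only on the middle of the distribution, so a small fraction of signals in the tails affects them far less than it affects the plain standard deviation.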
A list containing the following items:
macro: Available when X.target is set to NULL; contains the following items:
A list of global inference results:
Matrix of covariates, same as input X.
Vector of observations, same as input z.
A vector of length \(n\), indicating how likely each observed z is to belong to the local null.
A binary vector of length \(n\); discoveries are indicated by \(1\).
A list of plots for global inference:
A plot of the signals discovered, marked in red.
A scatterplot of z on x, colored by the discovery propensity scores; only available when fdr.method = "locfdr".
A scatterplot of the discovery propensity scores on x; only available when fdr.method = "locfdr".
micro: Available when X.target is provided; contains the following items:
Customized estimates of the null probabilities for the target \(X\) and \(z\).
A binary vector of length \(k\); discoveries in the target cases are indicated by \(1\).
Pooled global estimates of the null probabilities for the target \(X\) and \(z\).
Customized fdr plots for the target cases.
m.lp: Same as input m.
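The binary discovery vectors described above mark cases whose evidence passes the chosen false-discovery criterion. As a rough illustration of what fdr.method = "BH" refers to, the sketch below applies the Benjamini-Hochberg procedure with base R's p.adjust(). It assumes a theoretical N(0, 1) null and simulated data, whereas g2l.proc works with a relevance-based, customized null, so this is not the package's computation.

```r
# Illustrative only: plain Benjamini-Hochberg under an assumed N(0, 1) null.
set.seed(2)
alpha <- 0.1
# 900 null z-values followed by 100 signals centered at 4
z <- c(rnorm(900), rnorm(100, mean = 4))

# Two-sided p-values under the assumed theoretical null
p <- 2 * pnorm(-abs(z))

# BH-adjusted p-values; a case is a discovery when its adjusted value <= alpha
p_bh <- p.adjust(p, method = "BH")
signal <- as.integer(p_bh <= alpha)

n_found <- sum(signal)  # total number of discoveries
```

Here signal plays the same role as the binary vectors in the returned list: an entry of 1 marks a discovery and 0 marks a case attributed to the null.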
Mukhopadhyay, S., and Wang, K. (2021) "On The Problem of Relevance in Statistical Inference". <arXiv:2004.09588>
# NOT RUN {
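library(LPRelevance)  # assumed: the package that provides g2l.proc() and the example data
data(funnel)
X <- funnel$x
z <- funnel$z
## Macro-inference using locfdr and LASER:
g2l_macro <- g2l.proc(X, z)
g2l_macro$macro$plots
## Micro-inference for the DTI data: case A with x = (18, 55) and z = 3.95
data(data.dti)
X <- cbind(data.dti$coordx, data.dti$coordy)
z <- data.dti$z
g2l_x <- g2l.proc(X, z, X.target = c(18, 55), z.target = 3.95, nsample = 3000)
## Customized fdr plot for the target case, with the x-axis restricted to [0, 4]
g2l_x$micro$plots$fdr.1 + ggplot2::coord_cartesian(xlim = c(0, 4))
## Fourth element of the micro-level result
g2l_x$micro$result[4]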
# }
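The nbag and ngrid arguments describe evaluating an fdr curve on a grid and averaging the curves obtained from several bags of samples. The base-R sketch below shows that aggregation pattern in a simplified setting. It is not the package's algorithm: nonparametric resampling of z stands in for the parametric bootstrap of relevance samples, the null is assumed to be N(0, 1) with null proportion 1, and fdr_curve is a hypothetical helper.

```r
# Illustrative only: bagged local fdr curve on a grid, under strong assumptions.
set.seed(3)
z <- c(rnorm(900), rnorm(100, mean = 3.5))
ngrid <- 200  # number of grid points for the curve
nbag <- 20    # number of bags to average over
grid <- seq(min(z), max(z), length.out = ngrid)

# Local fdr on the grid: f0(z) / f(z), capped at 1, with f0 = N(0, 1) assumed
# and f estimated by a kernel density of the bag
fdr_curve <- function(sample_z, grid) {
  dens <- density(sample_z, from = min(grid), to = max(grid), n = length(grid))
  f <- approx(dens$x, dens$y, xout = grid, rule = 2)$y
  f <- pmax(f, .Machine$double.eps)  # guard against division by zero
  pmin(dnorm(grid) / f, 1)
}

# One curve per bag (columns), each from a resampled copy of z
curves <- replicate(nbag, fdr_curve(sample(z, replace = TRUE), grid))

# Aggregate the bags by taking the pointwise mean
fdr_bagged <- rowMeans(curves)
```

Averaging over bags smooths out the sampling noise of any single curve; the grid size controls how finely the curve is resolved.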