CPMCGLM(formula, family, link, data, varcod, dicho, nb.dicho, categ,
nb.categ, boxcox, nboxcox, N=1000, cutpoint)
- nb.dicho: Dichotomous transformations are categorical codings in two classes. The most natural approach is to place the cutpoint at a quantile. For one transformation, the median is used as the cutpoint. For two transformations, the first tercile is the cutpoint of the first dichotomous transformation and the second tercile is the cutpoint of the second, and so on.
- categ: The categ argument must be a matrix with one row per transformation. Its dimension is therefore nbq $\times$ maxq, where nbq is the number of transformations tried with the categ argument and maxq is the largest number of quantiles used by any single transformation. Rows that use fewer than maxq quantiles are padded with NA.
For example:
| [1,] | 0.33 | 0.66 | NA | NA |
| [2,] | 0.25 | 0.5 | 0.75 | NA |
| [3,] | 0.2 | 0.4 | 0.6 | 0.8 |
In this example, three transformations are performed, so nbq=3, and maxq=4 because the widest transformation, the one based on quintiles, uses four quantiles. The first transformation yields a categorical variable in three classes, with cutpoints at the first and second terciles. The second yields a categorical variable in four classes, with cutpoints at the quartiles. The third yields a variable in five classes, with cutpoints at the quintiles.
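The quantile matrix described above can be built in base R as in the sketch below. The second half is an illustration only: it applies each row to a simulated variable with `quantile()` and `cut()` to show how many classes each row implies. The variable `age` is hypothetical, and this is not necessarily how CPMCGLM codes the variable internally.

```r
# Quantile matrix for the categ argument: one row per transformation,
# padded with NA up to the widest row (maxq = 4 here).
categ_mat <- matrix(NA_real_, nrow = 3, ncol = 4)
categ_mat[1, 1:2] <- c(0.33, 0.66)              # terciles  -> 3 classes
categ_mat[2, 1:3] <- c(0.25, 0.50, 0.75)        # quartiles -> 4 classes
categ_mat[3, 1:4] <- c(0.20, 0.40, 0.60, 0.80)  # quintiles -> 5 classes
categ_mat

# Illustration only: classes implied by each row for a simulated covariate.
set.seed(1)
age <- runif(200, 20, 80)  # hypothetical continuous variable
classes <- lapply(seq_len(nrow(categ_mat)), function(i) {
  probs <- categ_mat[i, !is.na(categ_mat[i, ])]  # drop the NA padding
  cut(age, breaks = c(-Inf, quantile(age, probs), Inf))
})
sapply(classes, nlevels)  # number of classes per transformation
```

A matrix built this way would be passed as `categ = categ_mat` in the call to CPMCGLM.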
- nb.categ: This argument concerns categorical transformations in more than two classes. For one such transformation, the most intuitive choice is a coding in three classes with cutpoints at the terciles. For two such transformations, the previous coding is kept and a coding in four classes based on the quartiles is added, and so on.
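One way to read the counting rules for nb.dicho and nb.categ is sketched below. The helpers `dicho_probs` and `categ_probs` are hypothetical names, not part of CPMCGLM, and the rule is an interpretation of the descriptions above rather than something checked against the package source.

```r
# Assumed reading: nb.dicho = k gives k dichotomous codings, with cutpoints
# at the quantiles 1/(k+1), ..., k/(k+1).
dicho_probs <- function(nb.dicho) {
  seq_len(nb.dicho) / (nb.dicho + 1)
}

# Assumed reading: nb.categ = k gives k codings with 3, 4, ..., k + 2 classes,
# each cut at equally spaced quantiles.
categ_probs <- function(nb.categ) {
  lapply(seq_len(nb.categ), function(k) seq_len(k + 1) / (k + 2))
}

dicho_probs(1)  # median only
dicho_probs(2)  # first and second terciles
categ_probs(2)  # terciles, then quartiles
```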
- cutpoint: The cutpoint argument must be a matrix whose layout mirrors that of the quantile matrix. The number of rows is the number of transformations (nbc) tried with this method, and the number of columns is the largest number of cutpoints (maxc) used by any single transformation.
For example:
| [1,] | 8 | 16 | NA | NA |
| [2,] | 6 | 12 | 18 | NA |
| [3,] | 5 | 10 | 15 | 20 |
In this example, three transformations are performed, hence the three rows. The first transformation yields a categorical variable in three classes, with cutpoints at the values 8 and 16. The second yields a categorical variable in four classes, with cutpoints at 6, 12 and 18. The third yields a categorical variable in five classes, with cutpoints at 5, 10, 15 and 20. The matrix has four columns because four is the largest number of cutpoints used, in the third transformation.
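When the cutpoints are kept as vectors of different lengths, the NA padding can be done programmatically. The sketch below uses a hypothetical helper, `pad_rows`, which is not part of CPMCGLM, to build the cutpoint matrix from the example above.

```r
# Hypothetical helper: stack cutpoint vectors of unequal length into a
# matrix, one row per transformation, padding short rows with NA.
pad_rows <- function(rows) {
  width <- max(lengths(rows))
  out <- matrix(NA_real_, nrow = length(rows), ncol = width)
  for (i in seq_along(rows)) {
    out[i, seq_along(rows[[i]])] <- rows[[i]]
  }
  out
}

cut_mat <- pad_rows(list(
  c(8, 16),          # three classes
  c(6, 12, 18),      # four classes
  c(5, 10, 15, 20)   # five classes
))
cut_mat
```

The result would be passed as `cutpoint = cut_mat`. The same helper works for the quantile matrix expected by `categ`.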
Liquet, B. and Commenges, D. (2001). Correction of the p-value after multiple coding of an explanatory variable in logistic regression. Statistics in Medicine, 20:2815-2826.
Westfall, P. H. and Young, S. (1992). Resampling-based multiple testing: examples and methods for p-value adjustment. Wiley Series in Probability and Mathematical Statistics. Applied Probability and Statistics. New York, NY: Wiley. xvii, 340 p.
Yu, K., Liang, F., Ciampa, J., and Chatterjee, N. (2011). Efficient p-value evaluation for resampling-based tests. Biostatistics, 12(3):582-593.
print.CPMCGLM, summary.CPMCGLM
## Not run:
# # load data
# data(data_sim)
# #
# #Example of quantile matrix definition
#
# #Linear Gaussian Model
#
# fit1 <- CPMCGLM(formula= Weight~Age+as.factor(Sport)+Desease+Height,
# family="gaussian",link="identity",data=data_sim,varcod="Age",N=1000,
# boxcox=c(0,1,2,3),nb.dicho=3,nb.categ=4)
# ### print fit1
# fit1
# ### summary fit1
# summary(fit1)
#
# #Loglinear Poisson Model
# fit2 <- CPMCGLM(formula= Stroke~Age+as.factor(Sport)+Height+Weight,
# family="poisson",link="log",data=data_sim,varcod="Age",N=1000,
# boxcox=c(0,1,2,3))
#
# ### print fit2
# fit2
# ### summary fit2
# summary(fit2)
#
# #Logit Model
# fit3 <- CPMCGLM(formula= Parameter~Age+as.factor(Sport)+Height+Weight,
# family="binomial",link="logit",data=data_sim,varcod="Age",N=1000,
# boxcox=c(0,1,2,3),nb.dicho=3)
# ### print fit3
# fit3
# ### summary fit3
# summary(fit3)
#
# #Probit Model
# fit4 <- CPMCGLM(formula= Parameter~Age+as.factor(Sport)+Height+Weight,
# family="binomial",link="probit",data=data_sim,varcod="Age",N=1000,
# nboxcox=2,nb.categ=4)
# ### print fit4
# fit4
# ### summary fit4
# summary(fit4)
# ## End(Not run)