fit.mptinr: Fit cognitive models for categorical data using an objective function

Description

Fitting function for package MPTinR. Can fit any model for categorical data specified in an objective function. Fitting can be enhanced with gradient and or Hessian. Predicted values will be added when a prediction function is present.

Usage

fit.mptinr(
	data,
	objective, 
	param.names,
	categories.per.type,
	gradient = NULL, use.gradient = TRUE,
	hessian = NULL, use.hessian = FALSE,
	prediction = NULL,
	n.optim = 5,
	fia.df = NULL,
	ci = 95, 
	starting.values = NULL,
	lower.bound = 0,
	upper.bound = 1,
	output = c("standard", "fia", "full"),
	fit.aggregated = TRUE,
	sort.param = TRUE,
	show.messages = TRUE,
	use.restrictions = FALSE,
	orig.params = NULL,
	restrictions = NULL,	
	multicore = c("none", "individual", "n.optim"), sfInit = FALSE, nCPU = 2,
	control = list(),
    numDeriv = TRUE,
	...
)

Arguments

data

Either a numeric vector for individual fit or a numeric matrix or data.frame for multi-dataset fit. The data on each position (column for multi-dataset fit) must correspond to the respective line in the model file. Fitting for multiple datasets can be parallelized via multicore.

objective

the objective function used for fitting. Needs to return a scalar likelihood value.

param.names

character vector giving the parameters present in the model. The order of this vector determines the order with which the output from the fitting routine is interpreted.

categories.per.type

numeric vector indicating how many response categories each item type has.

gradient

the gradient function used for fitting. Needs to return a vector of same length as param.names.

use.gradient

logical. indicating whether or not gradient should be used for fitting. Default is TRUE

hessian

the Hessian function used for fitting. Needs to return a matrix with dim = c(length(param.names), length(param.names).

use.hessian

logical. indicating whether or not hessian should be used for fitting. Default is FALSE

prediction

the prediction function. Needs to return a vector of equal length as the response categories or data. Needs to return probabilities!

n.optim

Number of optimization runs. Can be parallelized via multicore. Default is 5.

fia.df

needed for handling MPTs with computation of FIA coming from fit.mpt or fit.model. Do not use.

A scalar corresponding to the size of the confidence intervals for the parameter estimates. Default is 95 which corresponds to 95% confidence intervals.

starting.values

A vector, a list, or NULL (the default). If NULL starting values for parameters are randomly drawn from a uniform distribution with the interval (0.1 - 0.9). See Details for the other options.

lower.bound

numeric scalar or vector. Can be used in fit.model to set the lower bounds of the parameter space. See Details.

upper.bound

numeric scalar or vector. Can be used in fit.model to set the upper bounds of the parameter space. See Details.

output

If "full" fit.mpt will additionally return the return values of nlminb and the Hessian matrices. (If "fia", fit.mpt will additionally return the results from get.mpt.fia (if fia not equal NULL).)

fit.aggregated

logical. Only relevant for multiple datasets (i.e., matrix or data.frame). Should the aggregated dataset (i.e., data summed over rows) be additionally fitted? Default (TRUE) fits the aggregated data.

sort.param

Logical. If TRUE, parameters are alphabetically sorted in the parameter table. If FALSE, the first parameters in the parameter table are the non-restricted ones, followed by the restricted parameters. Default is TRUE.

show.messages

Logical. If TRUE the time the fitting algorithms takes is printed to the console.

use.restrictions

needed for handling MPTs coming from fit.mpt. Do not use, unless you are sure what you are doing.

orig.params

needed for handling models coming from fit.mpt or fit.model. Do not use, unless you are sure what you are doing.

restrictions

needed for handling models coming from fit.mpt or fit.model. Do not use, unless you are sure what you are doing.

multicore

Character vector. If not "none", uses snowfall for parallelization (which needs to be installed separately via install.packages(snowfall)). If "individual", parallelizes the optimization for each individual (i.e., data needs to be a matrix or data.frame). If "n.optim", parallelizes the n.optim optimization runs. Default is "none" which corresponds to no parallelization. Note that you need to initialize snowfall in default settings. See sfInit and Details.

sfInit

Logical. Relevant if multicore is not "none". If TRUE, fit.mpt will initialize and close the multicore support. If FALSE, (the default) assumes that sfInit() was initialized before. See Details.

nCPU

Scalar. Only relevant if multicore is not "none" and sfInit is TRUE. Number of CPUs used by snowfall. Default is 2.

control

list containing control arguments passed on to nlminb. See there.

numDeriv

logical. Should the Hessian matrix of the maximum likelihood estimates be estimated numerically using numDeriv::hessian in case it cannot be estimated analytically? This can be extremely time and memory consuming for larger models. Default is TRUE.

...

arguments passed on to the objective function, the gradient function, the hessian function and the prediction function.

Value

For individual fits (i.e., data is a vector) a list containing one or more of the following components from the best fitting model:

goodness.of.fit

A data.frame containing the goodness of fit values for the model. Log.Likelihood is the Log-Likelihood value. G.Squared, df, and p.value are the \(G^2\) goodness of fit statistic.

information.criteria

A data.frame containing model information criteria based on the \(G^2\) value. The FIA values(s) are presented if fia is not NULL.

model.info

A data.frame containing other information about the model. If the rank of the Fisher matrix (rank.fisher) does not correspond to the number of parameters in the model (n.parameters) this indicates a serious issue with the identifiability of the model. A common reason is that one of the parameter estimates lies on the bound of the parameter space (i.e., 0 or 1).

parameters

A data.frame containing the parameter estimates and corresponding confidence intervals. If a restriction file was present, the restricted parameters are marked.

data

A list of two matrices; the first one (observed) contains the entered data, the second one (predicted) contains the predicted values.

For multi-dataset fits (i.e., data is a matrix or data.frame) a list with similar elements, but the following differences: The first elements, goodness.of.fit, information.criteria, and model.info, contain the same information as for individual fits, but each are lists with three elements containing the respective values for: each individual in the list element individual, the sum of the individual values in the list element sum, and the values corresponding to the fit for the aggregated data in the list element aggregated. parameters is a list containing:

individual

A 3-dimensional array containing the parameter estimates ([,1,]), confidence intervals [,2:3,], and, if restrictions not NULL, column 4 [,4,] is 0 for non-restricted parameters, 1 for equality restricted parameters, and 2 for inequality restricted parameters. The first dimension refers to the parameters, the second to the information on each parameter, and the third to the individual/dataset.

mean

A data.frame with the mean parameter estimates from the individual estimates. No confidence intervals can be provided for these values.

aggregated

A data.frame containing the parameter estimates and corresponding confidence intervals for the aggregated data. If a restriction file was present, the restricted parameters are marked.

The element data contains two matrices, one with the observed, and one with the predicted data (or is a list containing lists with individual and aggregated observed and predicted data).

If n.optim > 1, the summary of the vector (matrix for multi-individual fit) containing the Log-Likelihood values returned by each run of optim is added to the output: fitting.runs

When output == "full" the list contains the additional items:

optim.runs

A list (or list of lists for multiple datasets) containing the outputs from all runs by nlminb (including those runs produced when fitting did not converge)

best.fits

A list (or list of lists for multiple datasets) containing the outputs from the runs by nlminb that had the lowest likelihood (i.e., the successful runs)

hessian

A list containing the Hessian matrix or matrices of the final parameter estimates.

Details

This functions can be used to fit any model for categorical data that can be specified via a (objective) function (i.e., especially models that are not MPTs). For fitting MPTs or other similar models such as SDTs see fit.mpt or fit.model.

The only mandatory arguments are: data, objective, param.names, and categories.per.type. Adding a function calculating the gradient will usually significantly speed up the fitting. However, in extreme cases (i.e., many empty cells) using the gradient can interfere with finding the global minima. Adding the function computing the hessian matrix is usually only useful for obtaining the accurate confidence intervals (usually the numerically estimated Hessian matrix is equivalent unless there are many empty cells or parameters at the boundary).

The objective (and gradient and hessian) function need to take as the first argument a numerical vector of length(param.names) representing the parameters. The other mandatory arguments for these functions are: data: A vector containing the data for the dataset being fitted. param.names: The character vector containing the parameter names is handed over to the objective. n.params: = length(param.names). To speed up computation the number of parameters is also handed over to the objective on each iteration. tmp.env: A environment (created with new.env). The objective function produced by fit.mpt assign the parameter values into this environment using the following statement: for (i in seq_len(n.params)) assign(param.names[i],Q[i], envir = tmp.env) Furthermore, fit.mptinr assigns the data points before fitting each dataset into tmp.env with the variables names hank.data.x where x is the ordinal number of that data point (i.e., position or column). In other words, you can use tmp.env to eval you model within this environment and access both parameters and data in it. lower.bound and upper.bound: both lower.bound and upper.bound will be passed on to the user-supplied functions as when nlminb fits without gradient it can try to use parameter values outside the bounds. This can be controlled with these arguments isnide the objective function.

Furthermore, note that all arguments passed via ... will be passed to objective, gradient, and hessian. And that these three functions need to take the same arguments. Furthermore gradient must return a vector as long as param.names and hessian must return a square matrix of order length(param.names). See nlminb for (slightly) more info.

Usage of gradient and/or hessian can be controlled with use.gradient and use.hessian.

prediction is a function similar to objective with the difference that it should return a vector of length sum(categories.per.type giving the probabilities for each item type. This function needs to take the same arguments as objective with the only exception that it does not take lower.bound and upper.bound (but ... is passed on to it).

Note that parameters names should not start with hank..

To set the starting values for the fitting process (e.g., to avoid local minima) one can set starting.values to a vector of length 2 and n.optim > 1. Then, starting values are randomly drawn from a uniform distribution from starting.values[1] to starting.values[2].

Alternatively, one can supply a list with two elements to starting.values. Both elements need to be either of length 1 or of length equal to the number of parameters (if both are of length 1, it is the same as if you supply a vector of length 2). For each parameter n (in alphabetical order), a starting value is randomly drawn from a uniform distribution starting.values[[1]][n] to starting.values[[2]][n] (if length is 1, this is the border for all parameters).

The least interesting option is to specify the starting values individually by supplying a vector with the same length as the number of parameters. Starting values must be ordered according to the alphabetical order of the parameter names. Use check.mpt for a function that returns the alphabetical order of the parameters. If one specifies the starting values like that, n.optim will be set to 1 as all other values would not make any sense (the optimization routine will produce identical results with identical starting values).

Multicore fitting is achieved via the snowfall package and needs to be initialized via sfInit. As initialization needs some time, you can either initialize multicore facilities yourself using sfInit() and setting the sfInit argument to FALSE (the default) or let MPTinR initialize multicore facilities by setting the sfInit argument to TRUE. The former is recommended as initializing snowfall takes some time and only needs to be done once if you run fit.mpt multiple times. If there are any problems with multicore fitting, first try to initialize snowfall outside MPTinR (e.g., sfInit( parallel=TRUE, cpus=2 )). If this does not work, the problem is not related to MPTinR but to snowfall (for support and references visit: http://www.imbi.uni-freiburg.de/parallel/). Note that you need to close snowfall via sfStop() after using MPTinR.

References

Kellen, D., Klauer, K. C., & Singmann, H. (2012). On the Measurement of Criterion Noise in Signal Detection Theory: The Case of Recognition Memory. Psychological Review. doi:10.1037/a0027727

Examples

Run this code

# NOT RUN {
# the example may occasionally fail due to a starting values - integration mismatch.

# Fit an SDT for a 4 alternative ranking task (Kellen, Klauer, & Singmann, 2012).

ranking.data <- structure(c(39, 80, 75, 35, 61, 54, 73, 52, 44, 63, 40, 48, 80,
49, 43, 80, 68, 53, 81, 60, 60, 65, 49, 58, 69, 75, 71, 47, 44,
85, 23, 9, 11, 21, 12, 21, 14, 20, 19, 15, 29, 13, 14, 15, 22,
11, 12, 16, 13, 20, 20, 9, 26, 19, 13, 9, 14, 15, 24, 9, 19,
7, 9, 26, 16, 14, 6, 17, 21, 14, 20, 18, 5, 19, 17, 5, 11, 21,
4, 9, 15, 17, 7, 17, 11, 11, 9, 19, 20, 3, 19, 4, 5, 18, 11,
11, 7, 11, 16, 8, 11, 21, 1, 17, 18, 4, 9, 10, 2, 11, 5, 9, 18,
6, 7, 5, 6, 19, 12, 3), .Dim = c(30L, 4L))

expSDTrank <- function(Q, param.names, n.params, tmp.env){
   
    e <- vector("numeric",4)

    mu <- Q[1]
    ss <- Q[2]
       
    G1<-function(x){
        ((pnorm(x)^3)*dnorm(x,mean=mu,sd=ss))
    }

    G2<-function(x){
        ((pnorm(x)^2)*dnorm(x,mean=mu,sd=ss)*(1-pnorm(x)))*3
    }

     G3<-function(x){
        (pnorm(x)*dnorm(x,mean=mu,sd=ss)*(1-pnorm(x))^2)*3
    }
 

    e[1] <- integrate(G1,-Inf,Inf,rel.tol = .Machine$double.eps^0.5)$value    
    e[2] <- integrate(G2,-Inf,Inf,rel.tol = .Machine$double.eps^0.5)$value
    e[3] <- integrate(G3,-Inf,Inf,rel.tol = .Machine$double.eps^0.5)$value
    e[4] <- 1-e[1]-e[2]-e[3]  
   
    return(e)
}



SDTrank <- function(Q, data, param.names, n.params, tmp.env, lower.bound, upper.bound){
   
    e<-vector("numeric",4)

    mu <- Q[1]
    ss <- Q[2]
       
    G1<-function(x){
        ((pnorm(x)^3)*dnorm(x,mean=mu,sd=ss))
    }

    G2<-function(x){
        ((pnorm(x)^2)*dnorm(x,mean=mu,sd=ss)*(1-pnorm(x)))*3
    }

     G3<-function(x){
        (pnorm(x)*dnorm(x,mean=mu,sd=ss)*(1-pnorm(x))^2)*3
    }
 

    e[1] <- integrate(G1,-Inf,Inf,rel.tol = .Machine$double.eps^0.5)$value    
    e[2] <- integrate(G2,-Inf,Inf,rel.tol = .Machine$double.eps^0.5)$value
    e[3] <- integrate(G3,-Inf,Inf,rel.tol = .Machine$double.eps^0.5)$value
    e[4] <- 1-e[1]-e[2]-e[3]  
   
    LL <- -sum(data[data!=0]*log(e[data!=0]))
    return(LL)
}

fit.mptinr(ranking.data, SDTrank, c("mu", "sigma"), 4, prediction = expSDTrank, 
		lower.bound = c(0,0.1), upper.bound = Inf)
 
# }

Run the code above in your browser using DataLab