est_bilog: Run BILOG-MG in batch mode

Description

est_bilog runs BILOG-MG in batch mode or reads BILOG-MG output generated by BILOG-MG program. In the first case, this function requires BILOG-MG already installed on your computer under bilog_exe_folder directory.

In the latter case, where appropriate BILOG-MG files are present (i.e. "<analysis_name>.PAR", "<analysis_name>.PH1", "<analysis_name>.PH2" and "<analysis_name>.PH3" files exist) and overwrite = FALSE, there is no need for BILOG-MG program. This function can read BILOG-MG output without BILOG-MG program.

Usage

est_bilog(
  x = NULL,
  model = "3PL",
  target_dir = getwd(),
  analysis_name = "bilog_calibration",
  items = NULL,
  id_var = NULL,
  group_var = NULL,
  logistic = TRUE,
  num_of_alternatives = NULL,
  criterion = 0.01,
  num_of_quadrature = 81,
  max_em_cycles = 100,
  newton = 20,
  reference_group = NULL,
  scoring_options = c("METHOD=1", "NOPRINT"),
  calib_options = c("NORMAL"),
  overwrite = FALSE,
  show_output_on_console = TRUE,
  bilog_exe_folder = file.path("C:/Program Files/BILOGMG")
)

Arguments

Either a data.frame or matrix object. When the data is not necessary, i.e. user only wants to read the BILOG-MG output from the target_dir, then this can be set to NULL.

model

The model of the items. The value is one of the following:

"1PL": One-parameter logistic model.
"2PL": Two-parameter logistic model.
"3PL": Three-parameter logistic model.
"CTT": Return only Classical Test theory statistics such as p-values, point-biserial and biserial correlations.

The default value is "3PL".

target_dir

The directory/folder where the BILOG-MG analysis and data files will be saved. The default value is the current working directory, i.e. get_wd().

analysis_name

A short file name that will be used for the data files created for the analysis.

items

A vector of column names or numbers of the x that represents the responses. If, in the syntax file, no entry for item names are desired, then, simply write items = "none".

id_var

The column name or number that contains individual subject IDs. If none is provided (i.e. id_var = NULL), the program will check whether the data provided has row names.

group_var

The column name or number that contains group membership information if multi-group calibration is desired. Ideally, it grouping variable is represented by single digit integers. If other type of data provided, an integer value will automatically assigned to the variables. The default value is NULL, where no multi-group analysis will be performed.

logistic

A logical value. If TRUE, LOGISTIC keyword will be added to the BILOG-MG command file which means the calibration will assume the natural metric of the logistic response function in all calculations. If FALSE, the logit is multiplied by D = 1.7 to obtain the metric of the normal-ogive model. The default value is TRUE.

num_of_alternatives

An integer specifying the maximum number of response alternatives in the raw data. 1/num_of_alternatives is used by the analysis as automatic starting value for estimating the pseudo-guessing parameters.

The default value is NULL. In this case, for 3PL, 5 will be used and for 1PL and 2PL, 1000 will be used.

This value will be represented in BILOG-MG control file as: NALT = num_of_alternatives.

criterion

Convergence criterion for EM and Newton iterations. The default value is 0.01.

num_of_quadrature

The number of quadrature points in MML estimation. The default value is 81. This value will be represented in BILOG-MG control file as: NQPT = num_of_quadrature. The BILOG-MG default value is 20 if there are more than one group, 10 otherwise.

max_em_cycles

An integer (0, 1, ...) representing the maximum number of EM cycles. This value will be represented in BILOG-MG control file as: CYCLES = max_em_cycles. The default value is 10.

newton

An integer (0, 1, ...) representing the number of Gauss-Newton iterations following EM cycles. This value will be represented in BILOG-MG control file as: NEWTON = newton.

reference_group

Represent which group's ability distribution will be set to mean = 0 and standard deviation = 1. For example, if the value is 1, then the group whose code is 1 will have ability distribution with mean 0 and standard deviation 1. When groups are assumed to coming from a single population, set this value to 0.

The default value is NULL.

This value will be represented in BILOG-MG control file as: REFERENCE = reference_group.

scoring_options

A string vector of keywords/options that will be added to the SCORE section in BILOG-MG syntax. Set the value of scoring_options to NULL if scoring of individual examinees is not necessary.

The default value is c("METHOD=1", "NOPRINT") where scale scores will be estimated using Maximum Likelihood estimation and the scoring process will not be printed to the R console (if show_output_on_console = TRUE).

The main option to be added to this vector is "METHOD=n". Following options are available:

"METHOD=1": Maximum Likelihood (ML)
"METHOD=2": Expected a Posteriori (EAP)
"METHOD=3": Maximum a Posteriori (MAP)

In addition to "METHOD=n" keyword, following keywords can be added:

"NOPRINT": Suppresses the display of the scores on the R console.

"FIT": likelihood ratio chi-square goodness-of-fit statistic for each response pattern will be computed.

"NQPT=(list)", "IDIST=n", "PMN=(list)", "PSD=(list)", "RSCTYPE=n", "LOCATION=(list)", "SCALE=(list)", "INFO=n", "BIWEIGHT", "YCOMMON", "POP", "MOMENTS", "FILE", "READF", "REFERENCE=n", "NFORMS=n"

See BILOG-MG manual for more details about these keywords/options.

calib_options

A string vector of keywords/options that will be added to the CALIB section in BILOG-MG syntax in addition to the keywords NQPT, CYCLES, NEWTON, CRIT, REFERENCE.

The default value is c("NORMAL").

When "NORMAL" is included in calib_options, the prior distributions of ability in the population is assumed to have normal distribution.

When "COMMON" is included in calib_options, a common value for the lower asymptote for all items in the 3PL model will be estimated.

Following keywords/options can be added to calib_options:

"PRINT=n", "IDIST=n", "PLOT=n", "DIAGNOSIS=n", "REFERENCE=n", "SELECT=(list)", "RIDGE=(a,b,c)", "ACCEL=n", "NSD=n", "COMMON", "EMPIRICAL", "NORMAL", "FIXED", "TPRIOR", "SPRIOR", "GPRIOR", "NOTPRIOR", "NOSPRIOR", "NOGPRIOR", "READPRIOR", "NOFLOAT", "FLOAT", "NOADJUST", "GROUP-PLOT", "RASCH", "NFULL", "CHI=(a,b)".

See BILOG-MG manual for more details about these keywords/options.

NOTE: Do not add any of the following keywords to calib_options since they will already be included:

NQPT, CYCLES, NEWTON, CRIT, REFERENCE

overwrite

If TRUE and there are already a BILOG-MG analysis files in the target path with the same name, these file will be overwritten.

show_output_on_console

logical (not NA), indicates whether to capture the output of the command and show it on the R console. The default value is TRUE.

bilog_exe_folder

The location of the "blm1.exe", "blm2.exe" and "blm3.exe" files. The default location is file.path("C:/Program Files/BILOGMG").

Value

A list of following objects:

"ip": An Itempool-class object holding the item parameters. This element will not be created when model = "CTT".
"score": A data frame object that holds the number of item examinee has attempted (tried), the number of item examinee got right (right), the estimated scores of examinees (ability), the standard errors of ability estimates (se), and the probability of the response string (prob). This element will not be created when model = "CTT".
"ctt": The Classical Test Theory (CTT) stats such as p-value, biserial, point-biserial estimated by BILOG-MG. If there are groups, then the CTT statistics for groups can be found in ctt$group$GROUP-NAME. Overall statistics for the whole group is at ctt$overall.
"failed_items": A data frame consist of items that cannot be estimated.
"syntax": The syntax file.
"converged": A logical value indicating whether a model has been converged or not. If the value is TRUE, model has been converged. This element will not be created when model = "CTT".
"neg_2_log_likelihood": -2 Log Likelihood value. This value is NULL, when model does not converge. This element will not be created when model = "CTT".
"input": A list object that stores the arguments that are passed to the function.

Examples

Run this code

# NOT RUN {
### Example 1 ###
# Create responses to be used in BILOG-MG estimation
true_ip <- generate_ip(n = 30, model = "2PL")
resp <- sim_resp(true_ip, rnorm(4000))

# The following line will run BILOG-MG, estimate 2PL model and put the
# analysis results under the target directory:
bilog_calib <- est_bilog(x = resp, model = "2PL",
                         target_dir = "C:/Temp/Analysis", overwrite = TRUE)
# Check whether the calibration converged
bilog_calib$converged

# Get the estimated item pool
bilog_calib$ip

# See the BILOG-MG syntax
cat(bilog_calib$syntax)

# See the classical test theory statistics estimated by BILOG-MG:
bilog_calib$ctt

# Get -2LogLikelihood for the model (mainly for model comparison purposes):
bilog_calib$neg_2_log_likelihood


### Example 2 ###
# Get Expected-a-posteriori theta scores:
result <- est_bilog(x = resp, model = "2PL",
                    scoring_options = c("METHOD=2", "NOPRINT"),
                    target_dir = "C:/Temp/Analysis", overwrite = TRUE)


### Example 3 ###
# Multi-group calibration
ip <- generate_ip(n = 35, model = "3PL", D = 1.7)
n_upper <- sample(1200:3000, 1)
n_lower <- sample(1900:2800, 1)
theta_upper <- rnorm(n_upper, 1.5, .25)
theta_lower <- rnorm(n_lower)
resp <- sim_resp(ip = ip, theta = c(theta_lower, theta_upper))
dt <- data.frame(level = c(rep("Lower", n_lower), rep("Upper", n_upper)), resp)

mg_calib <- est_bilog(x = dt, model = "3PL",
                      group_var = "level",
                      reference_group = "Lower",
                      items = 2:ncol(dt), # Exclude the 'group' column
                      num_of_alternatives = 5,
                      # Use MAP ability estimation.
                      # "FIT": calculate GOF for response patterns
                      scoring_options = c("METHOD=3", "NOPRINT", "FIT"),
                      target_dir = "C:/Temp/Analysis", overwrite = TRUE,
                      show_output_on_console = FALSE)
# Estimated item pool
mg_calib$ip
# Print group means
mg_calib$group_info
# Check Convergence
mg_calib$converged
# Print estimated scores of first five examinees
head(mg_calib$score)


### Example 4 ###
# When user wants to read BILOG-MG output saved in the directory "Analysis/"
# with file names "my_analysis", use the following syntax:
# (The following code does not require an installed BILOG-MG program.)
result <- est_bilog(target_dir = file.path("Analysis/"), model = "3PL",
                    analysis_name = "my_analysis", overwrite = FALSE)

# }

Run the code above in your browser using DataLab