Learn R Programming

npRmpi (version 0.60-20)

npconmode: Kernel Modal Regression with Mixed Data Types

Description

npconmode performs kernel modal regression on mixed data, and finds the conditional mode given a set of training data, consisting of explanatory data and dependent data, and possibly evaluation data. Automatically computes various in sample and out of sample measures of accuracy.

Usage

npconmode(bws, ...)

# S3 method for formula npconmode(bws, data = NULL, newdata = NULL, ...)

# S3 method for call npconmode(bws, ...)

# S3 method for default npconmode(bws, txdat, tydat, ...)

# S3 method for conbandwidth npconmode(bws, txdat = stop("invoked without training data 'txdat'"), tydat = stop("invoked without training data 'tydat'"), exdat, eydat, ...)

Value

npconmode returns a conmode object with the following components:

conmode

a vector of type factor (or ordered factor) containing the conditional mode at each evaluation point

condens

a vector of numeric type containing the modal density estimates at each evaluation point

xeval

a data frame of evaluation points

yeval

a vector of type factor (or ordered factor) containing the actual outcomes, or NA if not provided

confusion.matrix

the confusion matrix or NA if outcomes are not available

CCR.overall

the overall correct classification ratio, or NA if outcomes are not available

CCR.byoutcome

a numeric vector containing the correct classification ratio by outcome, or NA if outcomes are not available

fit.mcfadden

the McFadden-Puig-Kerschner performance measure or NA if outcomes are not available

The functions mode, and fitted may be used to extract the conditional mode estimates, and the conditional density estimates at the conditional mode, respectively, from the resulting object. Also, summary supports

conmode objects.

Arguments

bws

a bandwidth specification. This can be set as a conbandwidth object returned from an invocation of npcdensbw

...

additional arguments supplied to specify the bandwidth type, kernel types, and so on, detailed below. This is necessary if you specify bws as a \(p+q\)-vector and not a conbandwidth object, and you do not desire the default behaviours.

data

an optional data frame, list or environment (or object coercible to a data frame by as.data.frame) containing the variables in the model. If not found in data, the variables are taken from environment(bws), typically the environment from which npcdensbw was called.

newdata

An optional data frame in which to look for evaluation data. If omitted, the training data are used.

txdat

a \(p\)-variate data frame of explanatory data (conditioning data) used to calculate the regression estimators. Defaults to the training data used to compute the bandwidth object.

tydat

a one (1) dimensional vector of unordered or ordered factors, containing the dependent data. Defaults to the training data used to compute the bandwidth object.

exdat

a \(p\)-variate data frame of points on which the regression will be estimated (evaluation data). By default, evaluation takes place on the data provided by txdat.

eydat

a one (1) dimensional numeric or integer vector of the true values (outcomes) of the dependent variable. By default, evaluation takes place on the data provided by tydat.

Author

Tristen Hayfield tristen.hayfield@gmail.com, Jeffrey S. Racine racinej@mcmaster.ca

Usage Issues

If you are using data of mixed types, then it is advisable to use the data.frame function to construct your input data and not cbind, since cbind will typically not work as intended on mixed data types and will coerce the data to the same type.

References

Aitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,” Biometrika, 63, 413-420.

Hall, P. and J.S. Racine and Q. Li (2004), “Cross-validation and the estimation of conditional probability densities,” Journal of the American Statistical Association, 99, 1015-1026.

Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.

McFadden, D. and C. Puig and D. Kerschner (1977), “Determinants of the long-run demand for electricity,” Proceedings of the American Statistical Association (Business and Economics Section), 109-117.

Pagan, A. and A. Ullah (1999), Nonparametric Econometrics, Cambridge University Press.

Scott, D.W. (1992), Multivariate Density Estimation. Theory, Practice and Visualization, New York: Wiley.

Silverman, B.W. (1986), Density Estimation, London: Chapman and Hall.

Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,” Biometrika, 68, 301-309.

Examples

Run this code
if (FALSE) {
## Not run in checks: excluded to keep MPI examples stable and check times short.
## The following example is adapted for interactive parallel execution
## in R. Here we spawn 1 slave so that there will be two compute nodes
## (master and slave).  Kindly see the batch examples in the demos
## directory (npRmpi/demos) and study them carefully. Also kindly see
## the more extensive examples in the np package itself. See the npRmpi
## vignette for further details on running parallel np programs via
## vignette("npRmpi",package="npRmpi").

## Start npRmpi for interactive execution. If slaves are already running and
## `options(npRmpi.reuse.slaves=TRUE)` (default on some systems), this will
## reuse the existing pool instead of respawning. To change the number of
## slaves, call `npRmpi.stop(force=TRUE)` then restart.
npRmpi.start(nslaves=1)

mpi.bcast.cmd(library(MASS),caller.execute=TRUE)
mpi.bcast.cmd(data(birthwt),caller.execute=TRUE)

birthwt$low <- factor(birthwt$low)
birthwt$smoke <- factor(birthwt$smoke)
birthwt$race <- factor(birthwt$race)
birthwt$ht <- factor(birthwt$ht)
birthwt$ui <- factor(birthwt$ui)
birthwt$ftv <- ordered(birthwt$ftv)

mpi.bcast.Robj2slave(birthwt)

mpi.bcast.cmd(bw <- npcdensbw(low~
                              smoke+ 
                              race+ 
                              ht+ 
                              ui+    
                              ftv+  
                              age+           
                              lwt,
                              data=birthwt),
              caller.execute=TRUE)

summary(bw)

mpi.bcast.cmd(model <- npconmode(bws=bw),
              caller.execute=TRUE)

summary(model)

## For the interactive run only we close the slaves perhaps to proceed
## with other examples and so forth. This is redundant in batch mode.

## Note: on some systems (notably macOS+MPICH), repeatedly spawning and
## tearing down slaves in the same R session can lead to hangs/crashes.
## npRmpi may therefore keep slave daemons alive by default and
## `npRmpi.stop()` performs a "soft close". Use `force=TRUE` to
## actually shut down the slaves.
##
## You can disable reuse via `options(npRmpi.reuse.slaves=FALSE)` or by
## setting the environment variable `NP_RMPI_NO_REUSE_SLAVES=1` before
## loading the package.

npRmpi.stop()               ## soft close (may keep slaves alive)
## npRmpi.stop(force=TRUE)  ## hard close

## Note that in order to exit npRmpi properly avoid quit(), and instead
## use mpi.quit() as follows.

## mpi.bcast.cmd(mpi.quit(),
##               caller.execute=TRUE)
} 

Run the code above in your browser using DataLab