polyreg (version 0.0.0.9600)

polyFit: Polynomial Fit

Description

Fit polynomial regression using a linear or logistic model; predict new data.

Usage

polyFit(xy,deg,maxInteractDeg,use='lm',pcaMethod=NULL,pcaLocation='front',
   pcaPortion=0.9,glmMethod='one',cls=NULL,del0cols=TRUE) 
predict.polyFit(object,newdata)

Arguments

xy

Data frame with response variable in the last column. In the classification case, response is class ID, stored in a vector, not as a factor.

deg

The max degree for polynomial terms.

maxInteractDeg

The max degree of interaction terms.

use

Set to 'lm' for linear regression, 'glm' for logistic regression, or 'mvrlm' for multivariate-response lm.

pcaMethod

NULL for no PCA. For PCA, can be either 'prcomp' (use the prcomp function) or 'RSpectra' (use the eigs function in the RSpectra package).

pcaLocation

If PCA is applied, specify 'front' to have PCA calculated before forming polynomials, or 'back' to have it calculated afterward.

pcaPortion

If less than 1.0, use as many principal components as are needed to achieve this portion of total variance. Otherwise, use this many components. In the 'RSpectra' case, this value must be an integer of 1 or more.

cls

Virtual cluster, for parallel computation (One Vs. All case).

del0cols

Delete all-0 columns in polynomial data frame. See getPoly documentation.

newdata

Data frame, one row for each "X" to be predicted. Must have the same column names as in xy (without "Y").

Value

The return value of polyFit() is a polyFit object. The original arguments are retained, along with the fitted models and so on.

The prediction function predict.polyFit returns the predicted value(s) for newdata. In the classification case, these will be the predicted class labels, 1,2,3,...

Details

The polyFit function calls getPoly to generate polynomial terms from predictor variables, then fits the generated data to a linear or logistic regression model. (Powers of dummy variables will not be generated, other than degree 1, but interaction terms will be calculated.)
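The following base-R sketch illustrates the idea of the paragraph above on the built-in mtcars data. It is not getPoly itself: the helper make_terms and the choice of columns are hypothetical, chosen only to show a degree-2 expansion in which the 0/1 dummy am gets no squared term but still takes part in interactions.

```r
# Base-R illustration only; polyFit/getPoly build these terms automatically.
d <- mtcars[, c("hp", "wt", "am", "mpg")]   # response in the last column

# Hypothetical helper: degree-2 terms for two numeric predictors and one dummy.
# am is 0/1, so am^2 equals am and is deliberately omitted.
make_terms <- function(df) {
  data.frame(
    hp    = df$hp,
    wt    = df$wt,
    am    = df$am,
    hp2   = df$hp^2,
    wt2   = df$wt^2,
    hp_wt = df$hp * df$wt,   # interaction terms, including those with the dummy
    hp_am = df$hp * df$am,
    wt_am = df$wt * df$am
  )
}

train <- make_terms(d)
train$mpg <- d$mpg
fit <- lm(mpg ~ ., data = train)            # ordinary linear model on the expanded terms

# New data must go through the same expansion before prediction.
new_x <- data.frame(hp = 110, wt = 2.6, am = 1)
pred <- predict(fit, make_terms(new_x))
```

With polyFit the expansion and the matching transformation of newdata are handled internally, so only the original columns need to be supplied.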

If pcaMethod is not NULL, a principal component analysis is performed before or after generating the polynomials.
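As a rough illustration of how a variance portion such as pcaPortion=0.9 can translate into a number of components, here is a sketch using prcomp directly. The column selection and variable names are assumptions made for the example, and the package's internal computation may differ in detail.

```r
# Illustration with base R; not the package's internal code.
x <- scale(mtcars[, c("mpg", "disp", "hp", "wt", "qsec")])
pc <- prcomp(x)

# Share of total variance carried by each component, then the running total.
var_share <- pc$sdev^2 / sum(pc$sdev^2)
cum_share <- cumsum(var_share)

portion <- 0.9
k <- which(cum_share >= portion)[1]         # smallest number of components reaching the portion

scores <- pc$x[, 1:k, drop = FALSE]         # reduced data, one column per retained component
```

In the 'front' setting, reduced data of this kind would be what the polynomial terms are formed from; in the 'back' setting, the reduction would instead be applied to the polynomial terms.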

When logistic regression for classification is indicated, with more than two classes, the All-vs-All or One-vs-All method, coded 'all' and 'one' in glmMethod, can be applied to deal with the multiclass problem. Multinomial logit ('multilog') is also available.
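To make the One-vs-All idea concrete, the sketch below fits one logistic model per class with glm on simulated data and predicts the class with the largest fitted probability. The data, variable names, and the hand-written degree-2 formula are all assumptions for illustration; this is not the package's own code.

```r
# One-vs-All with base R on simulated data (illustration only).
set.seed(1)
n <- 300
y <- sample(1:3, n, replace = TRUE)         # class IDs as a plain integer vector, not a factor
centers <- c(-1, 0, 1)
d <- data.frame(
  x1 = rnorm(n, mean = centers[y]),
  x2 = rnorm(n, mean = centers[y]^2)
)

classes <- sort(unique(y))

# One logistic regression per class: class k against all the others,
# using degree-2 polynomial terms of the predictors.
fits <- lapply(classes, function(k) {
  glm(as.integer(y == k) ~ x1 + x2 + I(x1^2) + I(x2^2) + x1:x2,
      data = d, family = binomial)
})

# n x 3 matrix of fitted probabilities, one column per class.
probs <- sapply(fits, function(f) predict(f, newdata = d, type = "response"))

# Predicted label is the class whose model gives the highest probability.
pred <- classes[max.col(probs, ties.method = "first")]
```

Because the per-class fits are independent of one another, this is the kind of work that can be spread across a cluster, which is what the cls argument is for.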

Under the 'mvrlm' option in a classification problem, lm is called with multivariate response, using cbind and dummy variables for class membership as the response. Since predictors are used to form polynomials, this should be a reasonable model, and is much faster than 'glm'.
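The multivariate-response approach described above can be sketched with base R on the iris data. The predictors, the degree-2 formula, and the rule of picking the class with the largest fitted value are assumptions made for this example rather than a copy of the package's code.

```r
# Multivariate-response lm for classification (illustration only).
y <- as.integer(iris$Species)               # class IDs 1, 2, 3

# Dummy variables for class membership, bound into a response matrix.
Y <- cbind(c1 = as.integer(y == 1),
           c2 = as.integer(y == 2),
           c3 = as.integer(y == 3))

# A single lm call fits all three responses at once on degree-2 terms.
fit <- lm(Y ~ Petal.Length + Petal.Width +
            I(Petal.Length^2) + I(Petal.Width^2) +
            Petal.Length:Petal.Width,
          data = iris)

scores <- predict(fit, newdata = iris)      # n x 3 matrix, one column per class
pred <- max.col(scores, ties.method = "first")   # class with the largest fitted value
```

Each row of scores sums to 1 because the dummies do and the model has an intercept, but individual entries can fall outside [0, 1], so they are better read as scores than as probabilities. The speed advantage comes from solving one least-squares problem instead of running an iterative fit for each class.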

Examples

# NOT RUN {
getPE(Dummies=TRUE)  # prgeng data
pe1 <- pe[,c(1,2,4,6,7,12:16,3)]   # select some predictors; response is last
pfout <- polyFit(pe1,2) 
# predict worker like pe1[1,] but age 42 with MS degree
newdata <- pe1[1,-11]   # drop the response (column 11)
newdata[1,1] <- 42 
newdata[1,4] <- 1 
predict(pfout,newdata)  # 81022.27

# }