predict: Generate the `db.Rquery` object that can calculate the predictions

Description

Generates the db.Rquery object that computes the predictions for linear, logistic, and generalized linear regression models. The object holds only the SQL query; the actual result can be viewed using lk.

Usage

# S3 method for lm.madlib
predict(object, newdata, ...)
# S3 method for lm.madlib.grps
predict(object, newdata, ...)
# S3 method for logregr.madlib
predict(object, newdata, type = c("response",
                                  "prob"), ...)
# S3 method for logregr.madlib.grps
predict(object, newdata, type = c("response",
                                  "prob"), ...)
# S3 method for glm.madlib
predict(object, newdata, type = c("response",
                                  "prob"), ...)
# S3 method for glm.madlib.grps
predict(object, newdata, type = c("response",
                                  "prob"), ...)

Arguments

object

The result of madlib.lm or madlib.glm.

newdata

A db.obj object that refers to the data in the database for which predictions are computed.

type

A string, default is "response". For logistic regression, "response" produces the TRUE or FALSE prediction for newdata; for other models it produces the predicted values. The alternative value "prob" is only used for the binomial(logit) family and computes the probabilities of the TRUE cases.
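
The difference between the two type values can be illustrated without a database. The sketch below uses base R only, not PivotalR: it takes a hypothetical coefficient vector `beta` and a small local data frame, computes the probability of the TRUE case with the logistic function, and then derives a TRUE/FALSE label by thresholding at 0.5. The 0.5 cutoff and all names are assumptions made for illustration; this page does not state the cutoff that predict uses.

```r
# Hypothetical logistic-regression coefficients (intercept, slope);
# these are made up for illustration, not taken from a real fit
beta <- c(-1.0, 0.5)

# Small local data set standing in for newdata
newdata <- data.frame(x = c(0, 1, 4))

# Linear predictor: intercept + slope * x
eta <- beta[1] + beta[2] * newdata$x

# Analogue of type = "prob": probability of the TRUE case
prob <- plogis(eta)

# Analogue of type = "response": TRUE/FALSE label
# (assumes a 0.5 cutoff, which this page does not confirm)
response <- prob > 0.5

print(data.frame(newdata, prob = prob, response = response))
```

In PivotalR itself the same computation is expressed as SQL inside the returned db.Rquery object and runs in the database only when lk is called.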

...

Extra parameters. Not implemented yet.

Value

A db.Rquery object, which contains the SQL query to compute the predictions.

Examples

# NOT RUN {
## set up the database connection
## Assume that .port is port number and .dbname is the database name
cid <- db.connect(port = .port, dbname = .dbname, verbose = FALSE)

## create db.table object pointing to a data table
delete("abalone", conn.id = cid)
x <- as.db.data.frame(abalone, "abalone", conn.id = cid, verbose = FALSE)

## Example 1 --------

fit <- madlib.lm(rings ~ . - sex - id, data = x)

fit

pred <- predict(fit, x) # prediction

content(pred)

ans <- x$rings # the actual value

lk((ans - pred)^2, 10) # squared error

lk(mean((ans - pred)^2)) # mean squared error

## Example 2 ---------

y <- x
y$sex <- as.factor(y$sex)
fit <- madlib.lm(rings ~ . - id, data = y)

lk(mean((y$rings - predict(fit, y))^2))

## Example 3 ---------

fit <- madlib.lm(rings ~ . - id | sex, data = x)

fit

pred <- predict(fit, x)

content(pred)

ans <- x$rings

lk(mean((ans - pred)^2))

## predictions for one group of data where sex = I
idx <- which(groups(fit)[["sex"]] == "I") # which sub-model
pred1 <- predict(fit[[idx]], x[x$sex == "I",]) # predict on part of data

## Example 4 --------

## plot the predicted values vs. the true values
ap <- ans # true values
ap$pred <- pred # add a column which is the predicted values

## If the data set is very big, you do not want to load all the
## data points into R and plot. We can just plot a random sample.
random.sample <- lk(sort(ap, FALSE, NULL), 1000) # sort randomly

plot(random.sample)

## ------------------------------------------------------------
## GLM prediction

fit <- madlib.glm(rings ~ . - id | sex, data = x, family = poisson(log),
                  control = list(max.iter = 20))

p <- predict(fit, x) # predicted values from the grouped Poisson model

lk(p, 10)
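
## ------------------------------------------------------------
## Logistic regression prediction (illustrative sketch)
## Assumption: the response `rings < 10` and the family
## binomial(logit) are chosen here only to illustrate the `type`
## argument; they are not part of the original examples.

fit <- madlib.glm(rings < 10 ~ . - id - sex, data = x,
                  family = binomial(logit))

lk(predict(fit, x), 10) # type = "response": TRUE or FALSE prediction

lk(predict(fit, x, type = "prob"), 10) # probability of the TRUE case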

db.disconnect(cid, verbose = FALSE)
# }
