
Last chance! 50% off unlimited learning
Sale ends in
db.Rquery
object that can calculate the predictionsGenerate the db.Rquery
object that can calculate the
predictions for linear/logistic regressions. The actual result can be
viewed using lk
.
# S3 method for lm.madlib
predict(object, newdata, ...)# S3 method for lm.madlib.grps
predict(object, newdata, ...)
# S3 method for logregr.madlib
predict(object, newdata, type = c("response",
"prob"), ...)
# S3 method for logregr.madlib.grps
predict(object, newdata, type
= c("response", "prob"), ...)
# S3 method for glm.madlib
predict(object, newdata, type = c("response",
"prob"), ...)
# S3 method for glm.madlib.grps
predict(object, newdata, type = c("response",
"prob"), ...)
The result of madlib.lm
and madlib.glm
.
A db.obj
object, which contains the information about the
real data in the database.
A string, default is "response"
. It produces the predicted
results for the newdata
. The alternative value is
"prob"
, which is only used for binomial{logit}
to
compute the probabilities.
A string, default is "response"
, which produces the TRUE
or FALSE
prediction. If it is "prob"
, this function
computes the probabilities for TRUE
cases.
Extra parameters. Not implemented yet.
A '>db.Rquery
object, which contains the SQL query to
compute the predictions.
madlib.lm
linear regression
madlib.glm
logistic regression
lk
view the actual result
groups.lm.madlib
, groups.lm.madlib.grps
,
groups.logregr.madlib
,
groups.logregr.madlib.grps
extract grouping column
information from the fitted model(s).
# NOT RUN {
# }
# NOT RUN {
<!-- %% @test .port Database port number -->
<!-- %% @test .dbname Database name -->
## set up the database connection
## Assume that .port is port number and .dbname is the database name
cid <- db.connect(port = .port, dbname = .dbname, verbose = FALSE)
## create db.table object pointing to a data table
delete("abalone", conn.id = cid)
x <- as.db.data.frame(abalone, "abalone", conn.id = cid, verbose = FALSE)
## Example 1 --------
fit <- madlib.lm(rings ~ . - sex - id, data = x)
fit
pred <- predict(fit, x) # prediction
content(pred)
ans <- x$rings # the actual value
lk((ans - pred)^2, 10) # squared error
lk(mean((ans - pred)^2)) # mean squared error
## Example 2 ---------
y <- x
y$sex <- as.factor(y$sex)
fit <- madlib.lm(rings ~ . - id, data = y)
lk(mean((y$rings - predict(fit, y))^2))
## Example 3 ---------
fit <- madlib.lm(rings ~ . - id | sex, data = x)
fit
pred <- predict(fit, x)
content(pred)
ans <- x$rings
lk(mean((ans - pred)^2))
## predictions for one group of data where sex = I
idx <- which(groups(fit)[["sex"]] == "I") # which sub-model
pred1 <- predict(fit[[idx]], x[x$sex == "I",]) # predict on part of data
## Example 3 --------
## plot the predicted values v.s. the true values
ap <- ans # true values
ap$pred <- pred # add a column which is the predicted values
## If the data set is very big, you do not want to load all the
## data points into R and plot. We can just plot a random sample.
random.sample <- lk(sort(ap, FALSE, NULL), 1000) # sort randomly
plot(random.sample)
## ------------------------------------------------------------
## GLM prediction
fit <- madlib.glm(rings ~ . - id | sex, data = x, family = poisson(log),
control = list(max.iter = 20))
p <- predict(f)
lk(p, 10)
db.disconnect(cid, verbose = FALSE)
# }
Run the code above in your browser using DataLab