# lm_tidiers

##### Tidying methods for a linear model

These methods tidy the coefficients of a linear model into a summary, augment the original data with information on the fitted values and residuals, and construct a one-row glance of the model's statistics.

##### Usage

```
## S3 method for class 'lm':
tidy(x, conf.int = FALSE, conf.level = 0.95,
  exponentiate = FALSE, ...)

## S3 method for class 'lm':
augment(x, data = x$model, newdata, type.predict, type.residuals,
  ...)

## S3 method for class 'lm':
glance(x, ...)
```
##### Arguments

- `x`: lm object
- `conf.int`: whether to include a confidence interval
- `conf.level`: confidence level of the interval, used only if `conf.int=TRUE`
- `exponentiate`: whether to exponentiate the coefficient estimates and confidence intervals (typical for logistic regression)
- `...`: extra arguments (not used)
- `data`: original data; defaults to extracting it from the model
- `newdata`: if provided, performs predictions on the new data
- `type.predict`: type of prediction to compute for a GLM; passed on to `predict.glm`
- `type.residuals`: type of residuals to compute for a GLM; passed on to `residuals.glm`
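
As a rough illustration of what `type.predict` and `type.residuals` select, the sketch below calls `predict` and `residuals` directly on a GLM using only base R. The logistic model `gmod` is a hypothetical example chosen for illustration; it is not part of this page's own examples.

```r
# Hypothetical logistic model, used only to show the available types
gmod <- glm(am ~ wt, data = mtcars, family = binomial)

# predict.glm: `type` chooses the scale of the predictions
pred_link <- predict(gmod, type = "link")      # linear predictor (log-odds here)
pred_resp <- predict(gmod, type = "response")  # fitted probabilities

# residuals.glm: `type` chooses the kind of residual
res_dev  <- residuals(gmod, type = "deviance")
res_pear <- residuals(gmod, type = "pearson")

# For a binomial model with the default logit link, the response scale
# is the inverse logit of the link scale
same_scale <- isTRUE(all.equal(pred_resp, plogis(pred_link)))
```

The same strings would presumably be the ones to pass as `type.predict` and `type.residuals`, since the argument descriptions above say they are handed on to these two functions.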

##### Details

If you have missing values in your model data, you may need to refit
the model with `na.action = na.exclude`.

If `conf.int=TRUE`, the confidence interval is computed with
the `confint` function.
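
To make the role of `confint` concrete, here is a minimal base R sketch that assembles a coefficient table of the kind described on this page. It is an approximation built from `summary` and `confint`, not the package's actual implementation; the name `coef_table` is made up for this example.

```r
mod <- lm(mpg ~ wt + qsec, data = mtcars)

est <- coef(summary(mod))          # matrix of estimates, SEs, t values, p values
ci  <- confint(mod, level = 0.95)  # two columns: lower and upper limits

# Assemble a data.frame without rownames, one row per coefficient
coef_table <- data.frame(
  term      = rownames(est),
  estimate  = est[, "Estimate"],
  std.error = est[, "Std. Error"],
  statistic = est[, "t value"],
  p.value   = est[, "Pr(>|t|)"],
  conf.low  = ci[, 1],
  conf.high = ci[, 2],
  row.names = NULL,
  stringsAsFactors = FALSE
)

# For an lm, confint uses t quantiles on the residual degrees of freedom
half_width <- qt(0.975, df.residual(mod)) * coef_table$std.error
```

For a linear model the interval is symmetric around the estimate, so `conf.low` equals `estimate - half_width` and `conf.high` equals `estimate + half_width`.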

When the modeling was performed with `na.action = "na.omit"`
(as is the typical default), rows with NA in the initial data are omitted
entirely from the augmented data frame. When the modeling was performed
with `na.action = "na.exclude"`, one should provide the original data
as a second argument, at which point the augmented data will contain those
rows (typically with NAs in place of the new columns). If the original data
is not provided to `augment` and `na.action = "na.exclude"`, a
warning is raised and the incomplete rows are dropped.
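
The difference between the two `na.action` settings can be seen with base R alone. The sketch below is only an illustration of the underlying behaviour of `residuals` and `fitted`; the data frame `d` with two artificially missing values is an assumption made for the example.

```r
# Copy of mtcars with two missing predictor values (rows 2 and 5)
d <- mtcars
d$wt[c(2, 5)] <- NA

fit_omit <- lm(mpg ~ wt, data = d, na.action = na.omit)
fit_excl <- lm(mpg ~ wt, data = d, na.action = na.exclude)

# na.omit drops the incomplete rows from fitted values and residuals
n_omit <- length(residuals(fit_omit))

# na.exclude pads them with NA, so they line up with the original rows
n_excl <- length(residuals(fit_excl))

# Because the lengths match, the padded values can be bound to the data
padded <- cbind(d, .fitted = fitted(fit_excl), .resid = residuals(fit_excl))
na_rows <- unname(which(is.na(padded$.resid)))
```

With `na.exclude`, the new columns have one entry per row of `d`, and the entries for the incomplete rows are `NA`. This is why the original data is needed to keep those rows in the augmented output.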

##### Value

All tidying methods return a `data.frame` without rownames. The structure depends on the method chosen.

`tidy.lm` returns one row for each coefficient, with five columns:

- term: The term in the linear model being estimated and tested
- estimate: The estimated coefficient
- std.error: The standard error from the linear model
- statistic: t-statistic
- p.value: two-sided p-value

If `conf.int=TRUE`, it also includes columns for `conf.low` and `conf.high`, computed with `confint`.

When `newdata` is not supplied, `augment.lm` returns one row for each observation, with seven columns added to the original data:

- .hat: Diagonal of the hat matrix
- .sigma: Estimate of residual standard deviation when corresponding observation is dropped from model
- .cooksd: Cook's distance, `cooks.distance`
- .fitted: Fitted values of model
- .se.fit: Standard errors of fitted values
- .resid: Residuals
- .std.resid: Standardised residuals

When `newdata` is supplied, `augment.lm` returns one row for each observation, with three columns added to the new data:

- .fitted: Fitted values of model
- .se.fit: Standard errors of fitted values
- .resid: Residuals of fitted values on the new data

`glance.lm` returns a one-row data.frame with the columns:

- r.squared: The proportion of variance explained by the model
- adj.r.squared: r.squared adjusted based on the degrees of freedom
- sigma: The square root of the estimated residual variance
- statistic: F-statistic
- p.value: p-value from the F test, describing whether the full regression is significant
- df: Degrees of freedom used by the coefficients
- logLik: the data's log-likelihood under the model
- AIC: the Akaike Information Criterion
- BIC: the Bayesian Information Criterion
- deviance: deviance
- df.residual: residual degrees of freedom
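
For readers who want to see where these quantities come from, the following base R sketch computes values matching the column descriptions above. It is an assumption-laden approximation rather than the package's own code: the names `aug` and `one_row` are invented here, and taking `df` from `summary(mod)$df[1]` is a guess at what "degrees of freedom used by the coefficients" refers to.

```r
mod <- lm(mpg ~ wt + qsec, data = mtcars)

# Per-observation columns, as described for augment.lm without newdata
aug <- data.frame(
  mtcars,
  .hat       = hatvalues(mod),                      # diagonal of the hat matrix
  .sigma     = influence(mod)$sigma,                # leave-one-out residual SD
  .cooksd    = cooks.distance(mod),
  .fitted    = fitted(mod),
  .se.fit    = predict(mod, se.fit = TRUE)$se.fit,  # SE of fitted values
  .resid     = residuals(mod),
  .std.resid = rstandard(mod)                       # standardised residuals
)

# One-row model summary, as described for glance.lm
s <- summary(mod)
f <- s$fstatistic
one_row <- data.frame(
  r.squared     = s$r.squared,
  adj.r.squared = s$adj.r.squared,
  sigma         = s$sigma,
  statistic     = f[["value"]],
  p.value       = pf(f[["value"]], f[["numdf"]], f[["dendf"]],
                     lower.tail = FALSE),
  df            = s$df[1],   # assumption: number of estimated coefficients
  logLik        = as.numeric(logLik(mod)),
  AIC           = AIC(mod),
  BIC           = BIC(mod),
  deviance      = deviance(mod),
  df.residual   = df.residual(mod)
)
```

Fitted values and residuals add back up to the response, so `aug$.fitted + aug$.resid` reproduces `mtcars$mpg`.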

##### Examples

```
library(broom)
library(ggplot2)
library(dplyr)

mod <- lm(mpg ~ wt + qsec, data = mtcars)
tidy(mod)
glance(mod)

# coefficient plot: estimates with +/- one standard error
d <- tidy(mod) %>% mutate(low = estimate - std.error,
                          high = estimate + std.error)
ggplot(d, aes(estimate, term, xmin = low, xmax = high, height = 0)) +
  geom_point() + geom_vline(xintercept = 0) + geom_errorbarh()

# augment the model data, with and without supplying the original data
head(augment(mod))
head(augment(mod, mtcars))
# predict on new data
newdata <- mtcars %>% head(6) %>% mutate(wt = wt + 1)
augment(mod, newdata = newdata)

# diagnostic plots: each base plot(mod, which = i) is followed by
# ggplot2 versions built from the augmented data
au <- augment(mod, data = mtcars)

# residuals vs fitted
plot(mod, which = 1)
qplot(.fitted, .resid, data = au) +
  geom_hline(yintercept = 0) +
  geom_smooth(se = FALSE)
qplot(.fitted, .std.resid, data = au) +
  geom_hline(yintercept = 0) +
  geom_smooth(se = FALSE)
qplot(.fitted, .std.resid, data = au,
  colour = factor(cyl))
qplot(mpg, .std.resid, data = au, colour = factor(cyl))

# normal Q-Q plot of standardised residuals
plot(mod, which = 2)
ggplot(au, aes(sample = .std.resid)) +
  stat_qq() +
  geom_abline()

# scale-location
plot(mod, which = 3)
qplot(.fitted, sqrt(abs(.std.resid)), data = au) + geom_smooth(se = FALSE)

# Cook's distance by observation
plot(mod, which = 4)
ggplot(au, aes(seq_along(.cooksd), .cooksd)) +
  geom_bar(stat = "identity")

# residuals vs leverage
plot(mod, which = 5)
qplot(.hat, .std.resid, data = au) + geom_smooth(se = FALSE)
ggplot(au, aes(.hat, .std.resid)) +
  geom_vline(size = 2, colour = "white", xintercept = 0) +
  geom_hline(size = 2, colour = "white", yintercept = 0) +
  geom_point() + geom_smooth(se = FALSE)
qplot(.hat, .std.resid, data = au, size = .cooksd) +
  geom_smooth(se = FALSE, size = 0.5)

# Cook's distance vs leverage
plot(mod, which = 6)
ggplot(au, aes(.hat, .cooksd)) +
  geom_vline(xintercept = 0, colour = NA) +
  geom_abline(slope = seq(0, 3, by = 0.5), colour = "white") +
  geom_smooth(se = FALSE) +
  geom_point()
qplot(.hat, .cooksd, size = .cooksd / .hat, data = au) + scale_size_area()
```

*Documentation reproduced from package broom, version 0.3.4, License: MIT + file LICENSE*