
gausscov (version 1.1.8)

The Gaussian Covariate Method for Variable Selection

Description

The standard theory of linear regression, whether frequentist or Bayesian, is based on an 'assumed (revealed?) truth' (John Tukey) attitude to models. This is reflected in the language of statistical inference, which involves a concept of truth: for example confidence intervals, hypothesis testing and consistency. The motivation behind this package was to remove the word 'true' from the theory and practice of linear regression and to replace it by 'approximation'. The approximations considered are the least squares approximations.

An approximation is called valid if it contains no irrelevant covariates. This is operationalized using the concept of a Gaussian P-value, which is the probability that pure Gaussian noise is better, in terms of least squares, than the covariate. The precise definition is given in the paper "An Approximation Based Theory of Linear Regression"; only four simple equations are required. Moreover, the Gaussian P-values can be derived simply from the standard F P-values. Furthermore, they are exact and valid whatever the data; in contrast, F P-values are only valid for specially designed simulations. A valid approximation is one where all the Gaussian P-values are less than a threshold p0 specified by the statistician, with the default value 0.01 in this package. This approximation-based approach is not only much simpler, it is overwhelmingly better than the standard model-based approach. This is demonstrated using real high-dimensional regression and vector autoregression data sets.

The goal is to find valid approximations. The search function is f1st, a greedy forward selection procedure which results in either one approximation or none; the approximation returned may, however, not be valid. If the size of the selected subset is less than a threshold (default value 21), an all-subset procedure is called which returns the best valid subset. A good default start is f1st(y,x,kmn=15). The best function for returning multiple approximations is f3st, which repeatedly calls f1st. For more information see the papers: L. Davies and L. Duembgen, "Covariate Selection Based on a Model-free Approach to Linear Regression with Exact Probabilities"; and L. Davies, "An Approximation Based Theory of Linear Regression", 2024.
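
A minimal sketch of this workflow, applied to the boston data set shipped with the package, is given below. The call f1st(y,x,kmn=15) is the default start quoted above; treating column 14 of boston as the response, reading kmn as a minimum number of included covariates, and the f3st argument m (number of repeated selections) are assumptions taken from the package examples, not guarantees.

library(gausscov)
data(boston)
y <- boston[, 14]    # assumed: response in column 14
x <- boston[, 1:13]  # assumed: the 13 covariates

# Greedy forward selection; kmn = 15 is the default start suggested above.
a <- f1st(y, x, kmn = 15)

# Multiple valid approximations by repeated calls to f1st;
# the argument m (number of repetitions) is an assumption.
b <- f3st(y, x, m = 2)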

Install

install.packages('gausscov')

Monthly Downloads

579

Version

1.1.8

License

GPL-3

Maintainer

Laurie Davies

Last Published

July 9th, 2025

Functions in gausscov (1.1.8)

fgenbsf: Generates basis functions on disjoint intervals
leukeia: Leukemia data set
fundr: Converts a directed graph into an undirected graph
fgr1st: Calculates a dependence graph using Gaussian stepwise selection
fgentrig: Generation of sine and cosine functions
redwine: Redwine data
lymphoma: Lymphoma data set
vardata: USA economics data
snspt: Sunspot data
flag: Calculation of lagged covariates
fpval: Calculates the regression coefficients, the Gaussian P-values and the standard P-values for the chosen subset ind
mel-temp: Melbourne minimum temperature
m15005m: m15005m data
decode: Decodes the number of a subset selected by fasb.R to give the covariates
f2st: Repeated stepwise selection of covariates
f3sti: Selection of covariates with given excluded covariates
f3st: Stepwise selection of covariates
fdecode: Decodes the number of a subset selected by fasb.R to give the covariates
f1st: Stepwise selection of covariates
f1bsf: Stepwise selection of interval covariates in non-parametric regression
fasb: Calculates all subsets where each included covariate is significant
fselect: Selects the subsets specified by fasb.R and frasb.R
abcq: American Business Cycle
boston: Boston data
fgeninter: Generation of interactions
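
As a brief sketch of how the generator functions combine with the selection functions, the lines below use fgeninter to build interaction covariates for the boston data and then run f1st on the enlarged matrix. The interface assumed here, fgeninter(x, ord) returning a list whose first element is the matrix of interactions up to order ord, and the use of column 14 of boston as the response, follow the package examples but should be checked against the function help pages.

library(gausscov)
data(boston)

# Assumed interface: all interactions of the 13 covariates up to order 2,
# with the interaction matrix as the first element of the returned list.
bostint <- fgeninter(boston[, 1:13], 2)[[1]]

# Stepwise selection of covariates from the enlarged matrix.
a <- f1st(boston[, 14], bostint, kmn = 15)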