MixSemiRob (version 1.1.0)

kdeem.h: Kernel Density-based EM-type algorithm for Semiparametric Mixture Regression with Unspecified Homogeneous Error Distributions

Description

'kdeem.h' fits a semiparametric mixture regression using a kernel density-based expectation-maximization (EM)-type algorithm with unspecified homogeneous error distributions (Hunter and Young, 2012).

Usage

kdeem.h(x, y, C = 2, ini = NULL, maxiter = 200)

Value

A list containing the following elements:

posterior

posterior probabilities of each observation belonging to each component.

beta

estimated regression coefficients.

pi

estimated mixing proportions.

h

bandwidth used for the kernel estimation.

Arguments

x

an n by p data matrix where n is the number of observations and p is the number of explanatory variables (including the intercept).

y

an n-dimensional vector of response values.

C

number of mixture components. Default is 2.

ini

initial values for the parameters. If NULL (the default), the initial values are obtained using the kdeem.lse function. If specified, it must be a list of the form list(beta, prop, tau, pi, h), where beta is a p by C matrix of regression coefficients for the C components; prop is an n by C matrix of probabilities of each observation belonging to each component, calculated from the initial beta and h; tau is a vector of C precision parameters (inverse of the standard deviation); pi is a vector of C mixing proportions; and h is the bandwidth for kernel estimation.

maxiter

maximum number of iterations for the algorithm. Default is 200.

Details

'kdeem.h' can be used to estimate parameters in a mixture-of-regressions model with independent and identically distributed errors. The model is defined as follows: $$f_{Y|\boldsymbol{X}}(y,\boldsymbol{x},\boldsymbol{\theta},g) = \sum_{j=1}^C\pi_jg(y-\boldsymbol{x}^{\top}\boldsymbol{\beta}_j).$$ Here, \(\boldsymbol{\theta}=(\pi_1,\ldots,\pi_{C-1},\boldsymbol{\beta}_1^{\top},\ldots,\boldsymbol{\beta}_C^{\top})\), and \(g(\cdot)\) is an unspecified error density function that is the same for all C components. The bandwidth of the kernel density estimation is calculated adaptively using the bw.SJ function from the 'stats' package, which implements the method of Sheather & Jones (1991) for bandwidth selection based on pilot estimation of derivatives.
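To make the kernel step concrete, here is a minimal base-R sketch of a weighted Gaussian kernel estimate of the common error density \(g\), with the bandwidth chosen by bw.SJ. It is an illustration under stated assumptions, not the package's internal code: the simulated data, the hard 0/1 weights standing in for posterior probabilities, and the names `res`, `post`, and `g_hat` are all hypothetical.

```r
set.seed(1)
n <- 200
x <- cbind(1, runif(n))                    # design matrix with an intercept column
beta <- cbind(c(-3, 3), c(3, -3))          # hypothetical p x C coefficient matrix
z <- sample(1:2, n, replace = TRUE)        # hypothetical component labels
y <- rowSums(x * t(beta[, z])) + rnorm(n)  # responses simulated from the mixture model

# Residuals of every observation under every component (n x C matrix)
res <- y - x %*% beta

# Stand-in for posterior probabilities: hard 0/1 weights from the known labels
post <- cbind(z == 1, z == 2) * 1

# Sheather-Jones bandwidth computed from the residuals of the assigned component
h <- bw.SJ(res[cbind(1:n, z)])

# Weighted Gaussian kernel estimate of the common error density g at points u
g_hat <- function(u) {
  sapply(u, function(ui) sum(post * dnorm((ui - res) / h)) / (n * h))
}

# Numerical check: the estimate should integrate to roughly 1
grid <- seq(-10, 10, by = 0.01)
area <- sum(g_hat(grid)) * 0.01
```

Because every residual is pooled into a single estimate, all components share the same error density, which is the homogeneity assumption of the model.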

For the calculation of \(\boldsymbol{\beta}\) in the M-step, this function employs the general-purpose optimizer ucminf from the 'ucminf' package.
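The following sketch shows what such a numerical M-step update of the coefficients can look like. It is only an illustration: base R's optim with method "BFGS" is swapped in for ucminf so that the example needs no extra package, the posterior weights are fixed hard labels, and the objective (a weighted log-likelihood with the kernel density estimate held fixed) is an assumption about the general approach rather than the package's exact criterion. All object names are hypothetical.

```r
set.seed(2)
n <- 150
x <- cbind(1, runif(n))                    # design matrix with an intercept column
beta_true <- cbind(c(-3, 3), c(3, -3))     # hypothetical p x C coefficient matrix
z <- sample(1:2, n, replace = TRUE)        # hypothetical component labels
y <- rowSums(x * t(beta_true[, z])) + rnorm(n, sd = 0.5)

post <- cbind(z == 1, z == 2) * 1          # weights held fixed (hard labels here)
beta0 <- beta_true + 0.3                   # perturbed current estimate
res0 <- y - x %*% beta0                    # residuals at the current estimate
h <- bw.SJ(res0[cbind(1:n, z)])            # Sheather-Jones bandwidth

# Kernel estimate of the error density, held fixed during the coefficient update
g_hat <- function(u) {
  sapply(u, function(ui) sum(post * dnorm((ui - res0) / h)) / (n * h))
}

# Negative weighted log-likelihood as a function of the stacked coefficients
neg_loglik <- function(b) {
  res <- y - x %*% matrix(b, nrow = ncol(x))
  -sum(post * log(g_hat(res) + 1e-300))    # small constant guards against log(0)
}

start_value <- neg_loglik(as.vector(beta0))
fit <- optim(as.vector(beta0), neg_loglik, method = "BFGS")
beta1 <- matrix(fit$par, nrow = ncol(x))   # updated p x C coefficient matrix
```

In a full EM-type iteration this update would alternate with recomputing the posterior probabilities and the kernel density estimate until convergence or until maxiter is reached.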

References

Hunter, D. R., & Young, D. S. (2012). Semiparametric mixtures of regressions. Journal of Nonparametric Statistics, 24(1), 19-38.

Ma, Y., Wang, S., Xu, L., & Yao, W. (2021). Semiparametric mixture regression with unspecified error distributions. Test, 30, 429-444.

See Also

kdeem, kdeem.lse, bw.SJ for bandwidth calculation, and ucminf for beta calculation.

Examples

# See examples for the 'kdeem' function.