Learn R Programming

KSD (version 1.0.1)

KSD: Estimate Kernelized Stein Discrepancy (KSD)

Description

Estimate kernelized Stein discrepancy (KSD) using U-statistics, and use bootstrap to test H0: \(x_i\) is drawn from \(p(X)\) (via KSD=0).

Usage

KSD(x, score_function, kernel = "rbf", width = -1, nboot = 1000)

Arguments

x

Sample of size Num_Instance x Num_Dimension

score_function

(\(\nabla_x \log p(x)\)) Score funtion : takes x as input and output a column vector of size Num_Instance X Dimension. User may use pryr package to pass in a function that only takes in dataset as parameter, or user may also pass in computed score for a given dataset.

kernel

Type of kernel (default = 'rbf')

width

Bandwidth of the kernel (when width = -1 or 'median', set it to be the median distance between data points)

nboot

Bootstrap sample size

Value

A list which includes the following variables :

  • "ksd" : Estimated Kernelized Stein Discrepancy (KSD)

  • "p" : p-Value for rejecting the null hypothesis that ksd = 0

  • "bootstrapSamples" : the bootstrap sample

  • "info": other information, including : bandwidth, M, nboot, ksd_V

Examples

Run this code
# NOT RUN {
# Pass in a dataset generated by Gaussian distribution,
# use pryr package to pass in score function
model <- gmm()
X <- rgmm(model, n=100)
score_function = pryr::partial(scorefunctiongmm, model=model)
result <- KSD(X,score_function=score_function)

# Pass in a dataset generated by Gaussian distribution,
# pass in computed score rather than score function
model <- gmm()
X <- rgmm(model, n=100)
score_function = scorefunctiongmm(model=model, X=X)
result <- KSD(X,score_function=score_function)

# Pass in a dataset generated by Gaussian distribution,
# pass in computed score rather than score function
# Use median_heuristic by specifying width to be -2.0
model <- gmm()
X <- rgmm(model, n=100)
score_function = pryr::partial(scorefunctiongmm, model=model)
result <- KSD(X,score_function=score_function, 'rbf',-2.0)

# Pass in a dataset generated by specific Gaussian distribution,
# pass in computed score rather than score function
# Use median_heuristic by specifying width to be -2.0
model <- gmm()
X <- rgmm(model, n=100)
score_function = pryr::partial(scorefunctiongmm, model=model)
result <- KSD(X,score_function=score_function, 'rbf',-2.0)
# }

Run the code above in your browser using DataLab