TopKLists (version 1.0.2)

compute.stream: Calculates point of degeneration j0 into noise of the Idata, applying moderate deviation-based inference

Description

The estimation of $\hat{j}_0$ is achieved via a moderate deviation-based approach. Consider the probability that an estimator, computed from a pilot sample of size $\nu$, exceeds a value $z$. The deviation above $z$ is said to be a moderate deviation if its associated probability is polynomially small as a function of $\nu$, and a large deviation if the probability is exponentially small in $\nu$. The values of $z=z_\nu$ that are associated with moderate deviations are $z_\nu\equiv\bigl(C\,\nu^{-1}\,\log\nu\bigr)^{1/2}$, where $C>\frac{1}{4}$. The null hypothesis that $p_k=\frac{1}{2}$ for $\nu$ consecutive values of $k$, versus the alternative hypothesis that $p_k>\frac{1}{2}$ for at least one of the values of $k$, is rejected when $\hat{p}_j^\pm-\frac{1}{2}>z_\nu$. The probabilities $\hat{p}_j^+$ and $\hat{p}_j^-$ are estimates of $p_j$ computed from the $\nu$ data pairs $I_\ell$ for which $\ell$ lies immediately to the right of $j$, or immediately to the left of $j$, respectively. The iterative algorithm consists of an ordered sequence of "test stages" $s_1, s_2,\ldots$. In stage $s_k$ an integer $J_{s_k}$ is estimated, which is a potential lower bound to $j_0$ (when $k$ is odd), or a potential upper bound to $j_0$ (when $k$ is even).
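
To make the rejection rule concrete, the following is a minimal sketch of the threshold $z_\nu$ and of a single right-hand test at a position $j$. It is not the package's implementation: the function names moderate_deviation_threshold and exceeds_threshold_right are hypothetical, and the window is assumed to cover indices $j+1,\ldots,j+\nu$.

```r
# Moderate deviation threshold z_nu = sqrt(C * log(nu) / nu), with C > 1/4.
# (Hypothetical helper, not part of TopKLists.)
moderate_deviation_threshold <- function(nu, C = 0.251) {
  stopifnot(nu > 1, C > 0.25)
  sqrt(C * log(nu) / nu)
}

# Hypothetical helper: estimate p_j from the nu values to the right of j
# (assumed window: indices j + 1, ..., j + nu) and compare the excess
# over 1/2 with the threshold.
exceeds_threshold_right <- function(idata, j, nu, C = 0.251) {
  stopifnot(j >= 0, j + nu <= length(idata))
  window <- idata[(j + 1):(j + nu)]
  p_hat_plus <- mean(window)
  (p_hat_plus - 0.5) > moderate_deviation_threshold(nu, C)
}

# Toy data: ten 1s followed by an alternating 0/1 pattern.
idata <- c(rep(1, 10), rep(c(0, 1), 5))

z10 <- moderate_deviation_threshold(10)
reject_head <- exceeds_threshold_right(idata, j = 0, nu = 10)   # window of 1s
reject_tail <- exceeds_threshold_right(idata, j = 10, nu = 10)  # mean is 0.5
print(c(z10 = z10, reject_head = reject_head, reject_tail = reject_tail))
```

With $\nu=10$ and $C=0.251$ the threshold is roughly 0.24, so a window consisting only of 1s rejects the null hypothesis, while a window with mean exactly $\frac{1}{2}$ does not.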

Usage

compute.stream(Idata, const=0.251, v, r=1.2)

Arguments

Idata
Input data is a vector of 0s and 1s (see prepare.idata)
const
Denotes the constant C of the moderate deviation bound; it must be larger than 0.25 (default is 0.251)
v
Denotes the pilot sample size $\nu$ related to the degree of randomness in the assignments. In each step the noise is estimated from the Idata as probability of 1 within the interval of size $\nu$, moving from $J_{s_{k-1}} -r \nu$ if $k$ is odd or $J_{s_{
r
Denotes a technical constant determining the starting point from which the probability for $I=1$ is estimated in a window of size v (see v, default is 1.2)

Value

A named list containing:

  • j0_est: The estimated index at which the Idata degenerate into noise
  • reason.break: The reason why the computation has ended (convergence or break condition)
  • js: The sequence of estimated $j_0$ in each iteration run, also showing the convergence behaviour
  • v: The preselected value of the parameter $\nu$

See Also

prepare.idata

Examples

set.seed(465)
myhead <- rbinom(20, 1, 0.8)
mytail <- rbinom(20, 1, 0.5)
mydata <- c(myhead, mytail)
compute.stream(mydata, v=10)
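
As a hedged follow-up, the returned list can be inspected through the element names given under Value. The sketch below assumes the TopKLists package is installed and simply skips the call otherwise; the variable name res is illustrative.

```r
set.seed(465)
# First 20 values carry signal (p = 0.8), the last 20 are noise (p = 0.5).
mydata <- c(rbinom(20, 1, 0.8), rbinom(20, 1, 0.5))

if (requireNamespace("TopKLists", quietly = TRUE)) {
  res <- TopKLists::compute.stream(mydata, v = 10)
  print(res$j0_est)        # estimated index of degeneration into noise
  print(res$reason.break)  # why the iteration stopped
  print(res$js)            # sequence of estimates across iterations
}
```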
