localScore (version 1.0.6)

karlinMonteCarlo: Monte Carlo - Karlin [p-value]

Description

Estimates the p-value of an integer-valued local score from a Monte Carlo estimation of the Gumbel parameters, obtained by simulating shorter sequences with the same distribution. Appropriate for long sequences (length > 10^3), for i.i.d. and Markovian sequence models.
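The idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper names `local_score_of` and `sketch_karlin_mc`, the regression-based fit of the parameters `K` and `lambda`, and the convention that the simulator receives all of its arguments (including its own length argument `n`) through `...` are all hypothetical.

```r
# Local score of an integer sequence: the largest segment sum, floored at 0,
# computed with a single pass (Lindley recursion).
local_score_of <- function(scores) {
  running <- 0
  best <- 0
  for (s in scores) {
    running <- max(0, running + s)
    best <- max(best, running)
  }
  best
}

# Hypothetical helper (not part of localScore): fit Gumbel-type parameters on
# simulated short sequences, then extrapolate to the full sequence length.
sketch_karlin_mc <- function(local_score, sequence_length,
                             simulated_sequence_length, simulate, ...,
                             numSim = 1000) {
  # Assumption: `simulate(...)` returns one sequence of length
  # `simulated_sequence_length`.
  sims <- replicate(numSim, local_score_of(simulate(...)))

  # Empirical CDF F(x) = P(H <= x) at each observed local score value.
  xs <- sort(unique(sims))
  cdf <- ecdf(sims)(xs)
  keep <- cdf > 0 & cdf < 1          # log(-log(F)) is finite only here
  stopifnot(sum(keep) >= 2)

  # Assumed model: F(x) = exp(-K * n_sim * exp(-lambda * x)), hence
  # log(-log F(x)) = log(K * n_sim) - lambda * x, fitted by least squares.
  y <- log(-log(cdf[keep]))
  x <- xs[keep]
  fit <- lm(y ~ x)
  lambda <- -coef(fit)[["x"]]
  K <- exp(coef(fit)[["(Intercept)"]]) / simulated_sequence_length

  # Integer scores: P(H >= s) = 1 - F(s - 1), with n = sequence_length.
  p_value <- 1 - exp(-K * sequence_length * exp(-lambda * (local_score - 1)))
  list(p_value = p_value, K = K, lambda = lambda)
}

# Usage: i.i.d. scores uniform on -7:6, simulated at length 1000 and
# extrapolated to a sequence of length 10000.
set.seed(1)
score_values <- -7:6
res <- sketch_karlin_mc(
  local_score = 66, sequence_length = 10000,
  simulated_sequence_length = 1000,
  simulate = function(n) sample(score_values, size = n, replace = TRUE),
  n = 1000, numSim = 500
)
res$p_value
```

Fitting on short sequences keeps each simulation cheap, while the factor `sequence_length` in the final formula carries the estimate over to the long sequence.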

Usage

karlinMonteCarlo(
  local_score,
  sequence_length,
  simulated_sequence_length,
  FUN,
  ...,
  numSim = 1000,
  plot = TRUE
)

Arguments

local_score

local score observed in a segment.

sequence_length

length of the sequence

simulated_sequence_length

length of simulated sequences produced by FUN

FUN

function to simulate similar sequences with.

...

parameters for FUN

numSim

number of sequences to create for estimation

plot

logical; whether to display plots of the cumulative distribution function and the density

Value

Floating-point value giving the probability of obtaining a local score greater than or equal to the argument local_score

Details

The length of the simulated sequences is an argument specific to the simulation function passed as FUN. It must therefore also be given in the argument simulated_sequence_length of karlinMonteCarlo. This is a crucial detail, as it influences both the precision of the result and the computation time. Note that, to get an appropriate estimation, the average score of the sequence model must be non-positive.
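As a minimal sketch of these two requirements, the base R code below defines a simulator whose length argument is its own, and checks the average score before any estimation is attempted. The function `simulate_iid` and its argument names are hypothetical and only serve the illustration.

```r
# Hypothetical i.i.d. simulator: `n` is the simulator's own length argument.
simulate_iid <- function(score_values, probabilities, n) {
  sample(score_values, size = n, replace = TRUE, prob = probabilities)
}

# Score model: uniform distribution on the integers -7..6.
score_values <- -7:6
probabilities <- rep(1 / length(score_values), length(score_values))

# The average score must be non-positive for the estimation to be appropriate.
expected_score <- sum(score_values * probabilities)
stopifnot(expected_score <= 0)

# The same length would be passed both to the simulator (here as `n`) and to
# karlinMonteCarlo as simulated_sequence_length.
simulated_sequence_length <- 1000
set.seed(42)
simulated <- simulate_iid(score_values, probabilities,
                          n = simulated_sequence_length)
length(simulated)
```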

Examples

# NOT RUN {
# Integer scores drawn uniformly from -7:6 (average score -0.5)
new <- sample(-7:6, replace = TRUE, size = 1000)
# Monte Carlo: each simulated sequence is a random sample from the input sequence itself
karlinMonteCarlo(local_score = 66, sequence_length = 1000,
                 FUN = function(x, simulated_sequence_length) {
                   sample(x = x, size = simulated_sequence_length, replace = TRUE)
                 },
                 x = new, simulated_sequence_length = 1000, numSim = 1000)
# }