Estimates the p-value of a local score, for integer scores, by Monte Carlo estimation of Gumbel parameters from simulations of shorter sequences with the same distribution. Appropriate for long sequences (length > 10^3), under i.i.d. and Markovian sequence models.
karlinMonteCarlo(
local_score,
sequence_length,
simulated_sequence_length,
FUN,
...,
numSim = 1000,
plot = TRUE
)
Floating-point value: the probability of obtaining a local score greater than or equal to local_score.
local_score: local score observed in a segment.
sequence_length: length of the sequence.
simulated_sequence_length: length of the simulated sequences produced by FUN.
FUN: function used to simulate similar sequences.
...: parameters passed to FUN.
numSim: number of sequences to simulate for the estimation.
plot: logical; whether to display plots of the cumulative distribution function and the density.
The length of the simulated sequences is an argument of the simulation function FUN. It must therefore also be supplied to karlinMonteCarlo through the simulated_sequence_length parameter. This detail is crucial, as it affects both the precision of the result and the computation time. Note that for the estimate to be appropriate, the average score must be non-positive.
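As an illustration of the two points above, the following sketch defines a simulation function whose length argument is named simulated_sequence_length and checks that the average score is non-positive before any estimation is attempted. The names simulate_iid, score_values and score_probs are hypothetical and chosen only for this example; the commented-out call mirrors the argument pattern of the example on this page and assumes the localScore package is installed.

```r
# Hypothetical score distribution (an assumption made for illustration only)
score_values <- -2:2
score_probs <- c(0.3, 0.3, 0.2, 0.1, 0.1)

# Hypothetical simulator: draws i.i.d. integer scores.
# Its length argument carries the same name as the karlinMonteCarlo parameter.
simulate_iid <- function(simulated_sequence_length, values, probs) {
  sample(values, size = simulated_sequence_length, replace = TRUE, prob = probs)
}

# The average score must be non-positive for the estimate to be appropriate
mean_score <- sum(score_values * score_probs)
stopifnot(mean_score <= 0)

# Check that the simulator returns sequences of the requested length
set.seed(42)
sim <- simulate_iid(simulated_sequence_length = 500,
                    values = score_values, probs = score_probs)
stopifnot(length(sim) == 500)

# Possible call, following the pattern of the example on this page:
# karlinMonteCarlo(local_score = 30, sequence_length = 5000,
#                  simulated_sequence_length = 500, FUN = simulate_iid,
#                  values = score_values, probs = score_probs, numSim = 1000)
```

Shorter simulated sequences reduce computation time, at the likely cost of a less precise estimate, so the choice of simulated_sequence_length is a trade-off.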
library(localScore)
# Integer scores drawn from a distribution with negative mean (-0.5)
new <- sample(-7:6, replace = TRUE, size = 1000)
# Monte Carlo: FUN resamples from the input sequence itself
karlinMonteCarlo(local_score = 66, sequence_length = 1000,
  FUN = function(x, simulated_sequence_length) {
    sample(x = x, size = simulated_sequence_length, replace = TRUE)
  },
  x = new, simulated_sequence_length = 1000, numSim = 1000)
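To make the idea in the description more concrete, here is a minimal base R sketch of how Gumbel-type parameters might be estimated from short simulated sequences and then extrapolated to a longer one. It is not the package's implementation: the helper local_score_of, the log-log regression fit and the names lambda_hat and K_hat are assumptions made for illustration. It relies on the Karlin-type approximation P(H_m <= x) ≈ exp(-K m exp(-lambda x)) for the local score H_m of a sequence of length m.

```r
# Local score of an integer sequence via the Lindley recursion:
# W_0 = 0, W_k = max(0, W_{k-1} + x_k); the local score is max_k W_k.
local_score_of <- function(x) {
  w <- 0
  best <- 0
  for (xi in x) {
    w <- max(0, w + xi)
    if (w > best) best <- w
  }
  best
}

set.seed(1)
m <- 200        # length of the simulated (shorter) sequences
n <- 1000       # length of the sequence of interest
num_sim <- 500  # number of simulated sequences

# Local scores of i.i.d. sequences with negative mean score
scores <- replicate(num_sim,
                    local_score_of(sample(-7:6, size = m, replace = TRUE)))

# Empirical CDF at the observed score values; drop points where it equals 1
xs <- sort(unique(scores))
Fm <- ecdf(scores)(xs)
keep <- Fm > 0 & Fm < 1

# log(-log F_m(x)) = log(K * m) - lambda * x, fitted by least squares
fit <- lm(log(-log(Fm[keep])) ~ xs[keep])
lambda_hat <- -unname(coef(fit)[2])
K_hat <- exp(unname(coef(fit)[1])) / m

# Extrapolate to length n: P(H_n >= a) = 1 - P(H_n <= a - 1) for integer scores
p_value <- function(a) 1 - exp(-K_hat * n * exp(-lambda_hat * (a - 1)))
p_value(66)
```

The point of simulating at length m rather than n is that K and lambda do not depend on the sequence length in this approximation, so they can be estimated cheaply on short sequences and reused for the long one.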