splitSample(env, chunk = 10, take, nchunk,
fill = c("head", "tail", "random"),
maxit = 1000)
nchunk: a vector giving the number of samples to select from each
chunk. It must have one element per chunk, and sum(nchunk) must equal
take. Can be missing (the default), in which case some simple
heuristics are used to determine the number of samples per chunk.

maxit: the maximum number of iterations in which to try to select
take observations. Basically here to stop the loop going on forever.

The returned object carries an attribute "lengths", which indicates
how many samples were actually chosen from each chunk.

The gradient is split into chunk sections and samples are selected
from each chunk to result in a sample of length take. If take is
divisible by chunk without remainder then there will be an equal
number of samples selected from each chunk. Where take is not a
multiple of chunk and nchunk is not specified, extra samples have to
be allocated to some of the chunks to reach the required number of
samples.

An additional complication is that some chunks of the gradient may
hold fewer samples than their nchunk allocation, and therefore more
samples need to be selected from the remaining chunks until take
samples are chosen.
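The splitting step can be pictured with a few lines of base R. This is only a sketch under stated assumptions: the gradient values are made up, and equal-width sections from cut() are assumed here for illustration; the text above does not say how splitSample() defines its sections internally.

```r
# Hypothetical gradient values (not the swappH data): most samples sit
# at the low end, so the upper sections hold few samples.
set.seed(1)
env <- c(runif(40, 4.5, 5.5), runif(10, 5.5, 7.5))

chunk <- 5    # number of sections to split the gradient into
take  <- 20   # total number of samples wanted

# Assumption: equal-width sections, as produced by cut().
section   <- cut(env, breaks = chunk)
available <- as.vector(table(section))   # samples held by each section

# Even share per section; exact when take is a multiple of chunk.
base <- floor(take / chunk)

# Sections that cannot supply the even share on their own.
short <- which(available < base)

print(available)
print(base)
print(short)
```

Any section listed in short is the source of the complication described above: its shortfall has to be made up from the other sections.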
If nchunk is supplied, it must be a vector stating exactly how many
samples to select from each chunk. If nchunk is not supplied, then the
number of samples per chunk is determined as follows:

1. An initial allocation of floor(take / chunk) is assigned to each
   chunk.
2. Where a chunk holds fewer samples than this initial allocation, the
   corresponding elements of nchunk are reset to the number of samples
   in those chunks.
3. An extra sample is allocated in turn to each chunk that can still
   supply one, until take samples are selected.

Argument fill controls the order in which the chunks are filled.
fill = "head" fills from the low to the high end of the gradient,
whilst fill = "tail" fills in the opposite direction. Chunks are
filled in random order if fill = "random". In all cases no chunk is
filled by more than one extra sample until all chunks that can supply
one extra sample are filled. In the case of fill = "head" or
fill = "tail" this entails moving along the gradient from one end to
the other, allocating an extra sample to available chunks before
starting along the gradient again. For fill = "random", a random order
of chunks to fill is determined; if an extra sample has been allocated
to each chunk in that order and take samples are still not selected,
filling begins again using the same random ordering. In other words,
the random order of chunks to fill is chosen only once.
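The allocation rules and the effect of fill can be sketched as a small standalone function. The name allocate() and its available argument are hypothetical, chosen for illustration; this follows the description above and is not the code used inside splitSample().

```r
# Hypothetical helper, not part of any package: given how many samples
# each chunk holds, work out how many to take from each chunk.
allocate <- function(available, take,
                     fill = c("head", "tail", "random"),
                     maxit = 1000) {
  fill  <- match.arg(fill)
  chunk <- length(available)
  if (take > sum(available)) {
    stop("'take' exceeds the number of samples available")
  }
  # Even share per chunk, capped at what each chunk actually holds.
  nchunk <- pmin(floor(take / chunk), available)
  # Order in which chunks receive extras; drawn only once for "random".
  ord <- switch(fill,
                head   = seq_len(chunk),
                tail   = rev(seq_len(chunk)),
                random = sample.int(chunk))
  # Repeated passes along the gradient, one extra sample per chunk per
  # pass; maxit guards against a loop that never finishes.
  it <- 0
  while (sum(nchunk) < take && it < maxit) {
    it <- it + 1
    for (i in ord) {
      if (sum(nchunk) >= take) break
      if (nchunk[i] < available[i]) nchunk[i] <- nchunk[i] + 1
    }
  }
  nchunk
}

available <- c(12, 9, 1, 0, 3)                  # made-up chunk sizes
allocate(available, take = 11, fill = "head")   # 4 3 1 0 3
allocate(available, take = 11, fill = "tail")   # 3 4 1 0 3
```

With these made-up counts the two directions differ only in which chunk receives the final extra sample: the first chunk for "head", the second for "tail", because the fifth chunk is already exhausted when the second pass begins.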
## splitSample() and the swappH data come from the analogue package
library(analogue)
data(swappH)
## take a test set of 20 samples along the pH gradient
test1 <- splitSample(swappH, chunk = 10, take = 20)
test1
swappH[test1]
## take a larger sample where some chunks don't have many samples
## do random filling
set.seed(3)
test2 <- splitSample(swappH, chunk = 10, take = 70, fill = "random")
test2
swappH[test2]