Slice operations behave as in dplyr, except that the history graph of the
tracked dataframe can be updated with the size of the dataframe before and after the operation.
See dplyr::slice(), dplyr::slice_head(), dplyr::slice_tail(), dplyr::slice_min(), dplyr::slice_max() and dplyr::slice_sample() for more details on the underlying functions.
# S3 method for trackr_df
slice_sample(
  .data,
  ...,
  .messages = c("{.count.in} before", "{.count.out} after"),
  .headline = "slice data"
)
The sliced dataframe, with its history graph updated.
.data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
...
For slice(): <data-masking> Integer row values.
Provide either positive values to keep, or negative values to drop. The values provided must be either all positive or all negative. Indices beyond the number of rows in the input are silently ignored.
For slice_*(), these arguments are passed on to methods.
Named arguments passed on to dplyr::slice_sample
.by, by
<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.
.preserve
Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.
n, prop
Provide either n, the number of rows, or prop, the proportion of rows to select. If neither are supplied, n = 1 will be used. If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. prop will be rounded towards zero to generate an integer number of rows.
A negative value of n or prop will be subtracted from the group size. For example, n = -2 with a group of 5 rows will select 5 - 2 = 3 rows; prop = -0.25 with 8 rows will select 8 * (1 - 0.25) = 6 rows.
order_by
<data-masking> Variable or function of variables to order by. To order by multiple variables, wrap them in a data frame or tibble.
with_ties
Should ties be kept together? The default, TRUE, may return more rows than you request. Use FALSE to ignore ties, and return the first n rows.
na_rm
Should missing values in order_by be removed from the result? If FALSE, NA values are sorted to the end (like in arrange()), so they will only be included if there are insufficient non-missing values to reach n/prop.
weight_by
<data-masking> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.
replace
Should sampling be performed with (TRUE) or without (FALSE, the default) replacement.
.messages
A set of glue specs. The glue code can use any global variable, {.count.in} and {.count.out} for the sizes of the input and output dataframes respectively, and {.excluded} for the difference.
.headline
A glue spec. The glue code can use any global variable, and {.count.in} and {.count.out} for the sizes of the input and output dataframes respectively.
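The sizing rules for n, prop, replace and weight_by described above can be checked with a short sketch. It calls dplyr::slice_sample() directly on untracked tibbles, so it illustrates the underlying dplyr behaviour rather than anything specific to dtrackr. The data frames five and eight are hypothetical toy data, and the negative n and prop cases assume a dplyr version that supports them (1.1.0 or later, as far as I know). The oversampling case with replace = TRUE is not spelled out in the argument text; it is the same behaviour that the n = 100 call on 50-row groups relies on.

```r
library(dplyr)

# Hypothetical toy data, not taken from the package documentation.
five  <- tibble(x = 1:5, w = c(1, 1, 0, 0, 0))
eight <- tibble(x = 1:8)

# Neither n nor prop supplied: n = 1 is used.
n_default <- nrow(slice_sample(five))

# n above the number of rows is truncated when sampling without replacement...
n_truncated <- nrow(slice_sample(five, n = 10))

# ...but with replace = TRUE rows can repeat, so all 10 draws are returned.
n_oversampled <- nrow(slice_sample(five, n = 10, replace = TRUE))

# Negative values are subtracted from the number of rows.
n_negative    <- nrow(slice_sample(five, n = -2))         # 5 - 2 = 3
prop_negative <- nrow(slice_sample(eight, prop = -0.25))  # 8 * (1 - 0.25) = 6

# Rows with a weight of zero are never drawn, so only x = 1 and x = 2 can appear.
weighted <- slice_sample(five, n = 2, weight_by = w)

print(c(n_default, n_truncated, n_oversampled, n_negative, prop_negative))
print(sort(weighted$x))
```

On a tracked dataframe the same row counts are what {.count.in} and {.count.out} would report in .messages.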
dplyr::slice_sample()
library(dplyr)
library(dtrackr)

# In this example the iris dataframe is resampled 100 times with replacement
# within each group.
iris %>%
  track() %>%
  group_by(Species) %>%
  slice_sample(n=100, replace=TRUE,
    .messages="{.count.out} / {.count.in} = {n}",
    .headline="100 {Species}") %>%
  history()
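
For comparison, a minimal sketch of the same call on an ungrouped dataframe, relying on the default .messages and .headline shown in the usage section. The object name tracked is illustrative only, and the sketch assumes that dtrackr and dplyr are both installed; it is not an additional official example.

```r
library(dplyr)
library(dtrackr)

# Sample 10 rows without replacement from the tracked, ungrouped iris data.
# .messages and .headline are left at their defaults, so the history records
# the row counts before and after the slice.
tracked <- iris %>%
  track() %>%
  slice_sample(n = 10)

# The result is still a dataframe with 10 rows.
print(nrow(tracked))

# Display the recorded history.
tracked %>% history()
```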