# alike

## Performance Considerations

### Sample Timings

We have gone to great lengths to make alike fast so that it can be included in other functions without concern for overhead:

    type_and_len <- function(a, b)
      typeof(a) == typeof(b) && length(a) == length(b)  # for reference

    bench_mark(times=1e4,
      identical(rivers, rivers),
      alike(rivers, rivers),
      type_and_len(rivers, rivers)
    )

alike is slower than both identical and the bare-bones type_and_len function, but it remains competitive with the latter even though that function checks only type and length. As objects grow more complex, identical will pull ahead, though alike should be sufficiently fast for most applications:

    bench_mark(times=1e4,
      identical(mtcars, mtcars),
      alike(mtcars, mtcars)
    )

In the above example, we are comparing the data frames, their attributes, and the 11 columns individually.
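The benchmarks above rely on the bench_mark helper. For readers who want a rough cross-check without it, the following is a minimal base R sketch. The time_per_call helper is hypothetical (it is not part of alike), and wrapping each expression in a closure adds a small constant cost to every measurement, so treat the numbers as relative rather than absolute.

```r
# Hypothetical helper (not part of alike): average elapsed seconds per call.
time_per_call <- function(f, times = 1e4) {
  elapsed <- system.time(for (i in seq_len(times)) f())[["elapsed"]]
  elapsed / times
}

# Same reference function as in the benchmark above.
type_and_len <- function(a, b) {
  typeof(a) == typeof(b) && length(a) == length(b)
}

timings <- c(
  identical    = time_per_call(function() identical(mtcars, mtcars)),
  type_and_len = time_per_call(function() type_and_len(mtcars, mtcars))
)
timings
```

With the package attached, an alike(mtcars, mtcars) entry can be added to the vector in the same way.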

Keep in mind that the complexity of the alike comparison is driven by the complexity of the template, not the object we are checking, so we can always manage the expense of the alike evaluation.

Comparisons that succeed will be substantially faster than comparisons that fail as the construction of error messages is non-trivial and we have prioritized optimization in the success case.

Language object comparison is relatively slow. We intend to optimize this some day.

Templates with large numbers of attributes (e.g. > 25) may scale non-linearly. We intend to optimize this some day, though in our experience objects with that many attributes are rare (note that having multiple objects, each with a handful of attributes, nested in recursive structures is not a problem).
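To judge whether a template is anywhere near that range, you can count its top-level attributes with base R. This is only a sketch: n_attrs is a hypothetical helper name, and the threshold of 25 is the rough figure quoted above rather than a hard limit.

```r
# Hypothetical helper: number of attributes attached directly to an object.
n_attrs <- function(x) length(attributes(x))

n_attrs(mtcars)  # a data frame carries names, row.names and class
n_attrs(1:10)    # a plain vector carries no attributes

# Flag templates that might hit the non-linear case described above.
many_attrs <- function(x, threshold = 25) n_attrs(x) > threshold
many_attrs(mtcars)
```

Only the attributes of a single object are counted here, which matches the note that attributes spread across nested objects are not a concern.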

Large objects will be slower to evaluate. Let us revisit the lm example, though this time we compare our template to itself to ensure that the comparisons succeed for alike, all.equal, and identical:

    mdl.tpl <- abstract(lm(y ~ x + z, data.frame(x=runif(3), y=runif(3), z=runif(3))))

    # compare mdl.tpl to itself to ensure success in all three scenarios
    bench_mark(
      alike(mdl.tpl, mdl.tpl),
      all.equal(mdl.tpl, mdl.tpl),  # for reference
      identical(mdl.tpl, mdl.tpl)
    )

Even with a template as large as an lm result (check str(mdl.tpl)), we can evaluate alike thousands of times before the overhead becomes noticeable.

### Pre-defining Templates

Some fairly innocuous R expressions carry substantial overhead. Consider:

    df.tpl <- data.frame(a=integer(), b=numeric())
    df.cur <- data.frame(a=1:10, b=1:10 + .1)

    bench_mark(
      alike(df.tpl, df.cur),
      # same template as df.tpl, but constructed inside the call
      alike(data.frame(a=integer(), b=numeric()), df.cur)
    )

data.frame is a particularly slow constructor, but in general you are best served by defining your templates (including calls to abstract) outside of your function so they are created on package load rather than every time your function is called.
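The pattern can be sketched as follows. So that the example runs without the package, same_shape is a hypothetical stand-in for alike that only compares column names and column types; check_predefined and check_inline are likewise illustrative names.

```r
# Hypothetical stand-in for alike(): compares column names and column types.
same_shape <- function(tpl, x) {
  identical(names(tpl), names(x)) &&
    identical(vapply(tpl, typeof, ""), vapply(x, typeof, ""))
}

# Template built once at top level; in a package this would sit in an R/ file.
df.tpl <- data.frame(a=integer(), b=numeric())

# Preferred: the function reuses the pre-built template.
check_predefined <- function(x) same_shape(df.tpl, x)

# Avoid: the template is rebuilt, and data.frame() is paid for, on every call.
check_inline <- function(x) {
  same_shape(data.frame(a=integer(), b=numeric()), x)
}

df.cur <- data.frame(a=1:10, b=1:10 + .1)
check_predefined(df.cur)
check_inline(df.cur)
```

Both functions return the same answer; the difference is only in how often the template constructor runs.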

## Miscellaneous

### alike as an S3 generic

alike is not currently an S3 generic, but will likely become one in the future, provided we can create an implementation with an acceptable performance profile.
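For readers unfamiliar with the mechanism, the sketch below shows what S3 dispatch on the template could look like in general terms. It is purely illustrative and is not the planned implementation: alike_s3 and its methods are hypothetical names, and the checks are deliberately simplistic.

```r
# Hypothetical generic; "alike_s3" is an illustrative name, not the package API.
alike_s3 <- function(target, current, ...) UseMethod("alike_s3")

# Fallback method: compare the underlying type only.
alike_s3.default <- function(target, current, ...) {
  typeof(target) == typeof(current)
}

# Data frame method: additionally require matching column names.
alike_s3.data.frame <- function(target, current, ...) {
  is.data.frame(current) && identical(names(target), names(current))
}

alike_s3(integer(), 1:3)
alike_s3(data.frame(a=integer()), data.frame(a=1:3))
```

Each call goes through UseMethod before any comparison happens, which is the kind of per-call cost that has to be weighed against the performance goals described earlier.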