# score

##### Score a New Sequence Given an EMM

Calculates a score of how likely it is that a new sequence was generated by the same process as the sequences used to build the EMM.

- Keywords
- models

##### Usage

```
## S4 method for signature 'EMM,matrix'
score(x, newdata, method = c("product", "log_sum", "sum",
    "log_odds", "supported_transitions", "supported_states",
    "sum_transitions", "log_loss", "likelihood", "log_likelihood", "AIC"),
  match_cluster = "exact", prior = TRUE, normalize = TRUE,
  initial_transition = FALSE, threshold = NA)

## S4 method for signature 'EMM,EMM'
score(x, newdata, method = c("product", "log_sum", "sum",
    "supported_transitions"), match_cluster = "exact", prior = TRUE,
  initial_transition = FALSE)
```

##### Arguments

- x
- an `EMM` object.
- newdata
- sequence or another `EMM` object to score.
- method
- method to calculate the score (see details)
- match_cluster
- do the new observations have to fall within the threshold of the cluster (`"exact"`), or is nearest neighbor (`"nn"`) or weighted nearest neighbor (`"weighted"`) used?
- prior
- add one to each transition count. This is equivalent to starting with a count of one for each transition, i.e., initially all transitions are equally likely. It prevents the product of probabilities from being zero.
- normalize
- normalize the score by the length of the sequence.
- initial_transition
- include the initial transition in the computation?
- threshold
- minimum count threshold used by supported transitions and supported states.
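The effect of `prior` can be illustrated with a small base-R sketch. The count matrix `counts` and the helper `transition_probs` are hypothetical and only mimic the add-one behavior described above; they are not rEMM's internal code.

```r
# Hypothetical transition counts between three states (rows: from, columns: to).
counts <- matrix(c(4, 0, 1,
                   2, 3, 0,
                   0, 1, 5),
                 nrow = 3, byrow = TRUE)

# Row-normalize counts into transition probabilities.
# With prior = TRUE every transition starts with a count of one.
transition_probs <- function(counts, prior = TRUE) {
  if (prior) counts <- counts + 1
  counts / rowSums(counts)
}

p_raw   <- transition_probs(counts, prior = FALSE)
p_prior <- transition_probs(counts, prior = TRUE)

# The path 1 -> 2 -> 3 uses two transitions that were never observed.
path <- c(1, 2, 3)
idx  <- cbind(head(path, -1), tail(path, -1))

prod(p_raw[idx])    # zero, because the transition 1 -> 2 has count 0
prod(p_prior[idx])  # small but positive
```

Without the prior, a single unobserved transition forces the product to zero regardless of how well the rest of the sequence fits.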

##### Details

The scores for a new sequence $x$ of length $l$ can be computed
by the following methods. For `match_cluster="exact"` or `"nn"`:

[formulas for the individual methods are missing from this copy]

where
$x_i$ represents the $i$-th data point in the new sequence,
$a(i,j)$ is the transition probability from state $i$
to state $j$ in the model,
$s(i)$ is the state the $i$-th data point ($x_i$) in
the new sequence is assigned to, and
$\mathrm{I}(v)$ is an indicator function which is 0 for $v=0$ and 1 otherwise.

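To make the notation concrete, here is a minimal base-R sketch of how four of the scores could be computed from the quantities just defined. The matrix `a`, the state sequence `s`, and the function `sketch_score` are hypothetical. The normalization (averaging over the $l-1$ transitions, with a geometric mean for the product) is an assumption based on the description of `normalize`, not a copy of the package's formulas.

```r
# Hypothetical transition probability matrix a(i, j) for three states.
a <- matrix(c(0.5, 0.5, 0.0,
              0.0, 0.5, 0.5,
              0.5, 0.0, 0.5),
            nrow = 3, byrow = TRUE)

# Hypothetical state assignments s(i) for a new sequence of length l = 5.
s <- c(1, 2, 3, 1, 1)

# Sketch only: the normalized forms are assumed, not taken from rEMM.
sketch_score <- function(a, s,
                         method = c("product", "sum", "log_sum",
                                    "supported_transitions"),
                         normalize = TRUE) {
  method <- match.arg(method)
  n_trans <- length(s) - 1
  # a(s(i), s(i+1)) for i = 1, ..., l - 1
  p <- a[cbind(s[-length(s)], s[-1])]
  switch(method,
    product = if (normalize) prod(p)^(1 / n_trans) else prod(p),
    sum = if (normalize) mean(p) else sum(p),
    log_sum = if (normalize) mean(log(p)) else sum(log(p)),
    # I(v) is 1 for every transition with non-zero probability
    supported_transitions = if (normalize) mean(p > 0) else sum(p > 0)
  )
}

sapply(c("product", "sum", "log_sum", "supported_transitions"),
       function(m) sketch_score(a, s, method = m))
```

In this sketch a sequence that contains an unsupported transition, such as `c(1, 3)`, gets a product score of zero and a `log_sum` score of `-Inf`. That is the situation `prior = TRUE` is meant to avoid.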
For `match_cluster="weighted"`:

[formulas for the individual methods are missing from this copy]

where $\mathrm{simil}(\cdot)$ is a modified and normalized similarity function given by $\mathrm{simil}(x,s) = 1- \frac{1}{1+e^{-\frac{d(x, s)/t - 1.5}{0.2}}}$, where $d$ is the distance measure and $t$ is the threshold that was used to create the model.
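The similarity function above can be written directly in base R. The function name `simil` and its arguments (a precomputed distance `d` and the threshold `t`) are chosen here for illustration and are not an rEMM API.

```r
# d: distance between a data point and a state (cluster center)
# t: threshold that was used to create the model
simil <- function(d, t) {
  1 - 1 / (1 + exp(-(d / t - 1.5) / 0.2))
}

# Similarity for distances of 0, 1, 1.5, 2 and 3 times the threshold.
t <- 0.2
simil(d = c(0, 1, 1.5, 2, 3) * t, t = t)
```

The function is a decreasing logistic curve in $d/t$: it is close to 1 for points at a state's center, equals 0.5 at 1.5 times the threshold, and approaches 0 for points far outside the threshold.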

##### Value

- A scalar score value.

##### See Also

`transition` to access transition probabilities and `find_clusters` for assigning observations to states/clusters.

##### Examples

```
library("rEMM")
data("EMMsim")

emm <- EMM(threshold = 0.2)
emm <- build(emm, EMMsim_train)

score(emm, EMMsim_test) # default is method "product"

### create shuffled data (destroy temporal relationship)
### and create noisy data
test_shuffled <- EMMsim_test[sample(1:nrow(EMMsim_test)), ]
test_noise <- jitter(EMMsim_test, amount = 0.3)

### helper for plotting
mybars <- function(...) {
  oldpar <- par(mar = c(5, 10, 4, 2))
  ss <- rbind(...)
  barplot(ss[, ncol(ss):1], xlim = c(-1, 4), beside = TRUE,
    horiz = TRUE, las = 2,
    legend = rownames(ss))
  par(oldpar)
}

### compare various scores
methods <- c("product",
  "sum",
  "log_sum",
  "supported_states",
  "supported_transitions",
  "sum_transitions",
  "log_loss",
  "likelihood")

### default is exact matching
clean <- sapply(methods, FUN = function(m) score(emm, EMMsim_test, method = m))
shuffled <- sapply(methods, FUN = function(m) score(emm, test_shuffled, method = m))
noise <- sapply(methods, FUN = function(m) score(emm, test_noise, method = m))
mybars(shuffled, noise, clean)

### weighted matching is better for noisy data
clean <- sapply(methods, FUN = function(m) score(emm, EMMsim_test, method = m,
  match_cluster = "weighted"))
shuffled <- sapply(methods, FUN = function(m) score(emm, test_shuffled, method = m,
  match_cluster = "weighted"))
noise <- sapply(methods, FUN = function(m) score(emm, test_noise, method = m,
  match_cluster = "weighted"))
mybars(shuffled, noise, clean)
```

*Documentation reproduced from package rEMM, version 1.0-11, License: GPL-2*