stringdot

String Kernel Functions

String kernels.

Keywords
symbolmath
Usage
stringdot(length = 4, lambda = 1.1, type = "spectrum", normalized = TRUE)
Arguments
length

The length of the substrings considered

lambda

The decay factor

type

Type of string kernel. The following kernels are currently supported:

spectrum the kernel considers only matching substrings of exactly length \(n\) (also known as the string kernel). Each such matching substring is given a constant weight. The length parameter in this kernel has to be \(length > 1\).

boundrange this kernel (also known as the fullstring kernel) considers only matching substrings of length less than or equal to a given number \(N\). This type of string kernel requires a length parameter \(length > 1\).

constant the kernel considers all matching substrings and assigns a constant weight (e.g. 1) to each of them. This kernel does not require any additional parameter.

exponential exponential decay kernel, where the substring weight decays as the matching substring gets longer. The kernel requires a decay factor \(\lambda > 1\).

string essentially identical to the spectrum kernel, but computed in a more conventional way.

fullstring essentially identical to the boundrange kernel, but computed in a more conventional way.

normalized

Whether to normalize the string kernel values (default: TRUE)

Details

The kernel generating functions are used to initialize a kernel function which calculates the dot (inner) product between two feature vectors in a Hilbert space. These functions, or the names of the generating functions, can be passed as the kernel argument to almost all functions in kernlab (e.g., ksvm, kpca, etc.).
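As a rough illustration of passing a string kernel to another kernlab function, the sketch below builds a kernel object and computes a Gram matrix with kernelMatrix (listed under See Also). The three toy strings are hypothetical data invented for this example, and supplying the strings as a list is an assumption about the accepted input format.

```r
library(kernlab)

# Hypothetical toy data: three short strings, supplied as a list.
texts <- list("the quick brown fox",
              "the quick brown dog",
              "lorem ipsum dolor")

# Spectrum kernel over substrings of length 3, with normalized values.
sk <- stringdot(type = "spectrum", length = 3, normalized = TRUE)

# Pairwise kernel values between all strings (a 3 x 3 matrix).
K <- kernelMatrix(sk, texts)
dim(K)
```

The first two strings share many substrings of length 3, while the third shares none with the first, so one would expect K[1, 2] to be larger than K[1, 3]. The same kernel object, or the name "stringdot" together with a list of kernel parameters, could be supplied as the kernel argument of functions such as ksvm.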

The string kernels calculate the similarity between two strings (e.g. texts or sequences) by matching their common substrings. Different types of string kernel exist; they are mainly distinguished by how the matching is performed: some count the exact matches of \(n\) characters between the strings (spectrum kernel), others allow gaps (mismatch kernel), and so on.
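To make the idea of counting exact matches of \(n\) characters concrete, here is a minimal base R sketch of a spectrum-style kernel. It is illustrative only and is not kernlab's implementation; the helper names ngrams and spectrum_sketch are hypothetical, and every match is assumed to have weight 1.

```r
# Illustrative only: not kernlab's implementation.
# Return all contiguous substrings of length n in the string s.
ngrams <- function(s, n) {
  len <- nchar(s)
  if (len < n) return(character(0))
  starts <- seq_len(len - n + 1)
  substring(s, starts, starts + n - 1)
}

# Unnormalized spectrum-style kernel: every pair of identical
# length-n substrings (one from x, one from y) contributes 1.
spectrum_sketch <- function(x, y, n) {
  sum(outer(ngrams(x, n), ngrams(y, n), "=="))
}

# Normalized variant: divide by the geometric mean of the self-similarities.
spectrum_sketch_norm <- function(x, y, n) {
  spectrum_sketch(x, y, n) /
    sqrt(spectrum_sketch(x, x, n) * spectrum_sketch(y, y, n))
}

spectrum_sketch("abcab", "cabca", 2)       # 5
spectrum_sketch_norm("abcab", "cabca", 2)  # 5/6
```

In this example "abcab" contains ab twice, bc once and ca once, while "cabca" contains ca twice, ab once and bc once, which gives 2 + 1 + 2 = 5 matching pairs. The pairwise comparison is quadratic in the string lengths, so this form is suitable only for explaining the idea.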

Value

Returns an S4 object of class stringkernel which extends the function class. The resulting function implements the given kernel, calculating the inner (dot) product between two character vectors.

kpar

a list containing the kernel parameters (hyperparameters) used.

The kernel parameters can be accessed by the kpar function.
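A short sketch of reading the parameters back from a kernel object follows. The element names type and length are assumed to mirror the argument names of stringdot.

```r
library(kernlab)

sk <- stringdot(type = "spectrum", length = 5)

# kpar() returns the hyperparameters of the kernel as a list.
params <- kpar(sk)

# Assumed element names, mirroring the arguments of stringdot().
params$type
params$length
```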

Note

The spectrum and boundrange kernels are faster and more efficient implementations of the string and fullstring kernels, which will still be included in kernlab for the next two versions.

See Also

dots, kernelMatrix, kernelMult, kernelPol

Aliases
  • stringdot
Examples
# NOT RUN {
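library(kernlab)

# Create a string kernel of type "string" over substrings of length 5
sk <- stringdot(type = "string", length = 5)

# Print the kernel object
sk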
# }
Documentation reproduced from package kernlab, version 0.9-27, License:
