List distances to nearest neighbors of a given kanji in terms of a reference distance (which is currently only the stroke edit distance) and compare with values in terms of another distance (currently only the component transport distance, a.k.a. kanji distance).
compare_neighborhoods(
  kan,
  refdist = "strokedit",
  refnn = 10,
  compdist = "kanjidist",
  compnn = 0,
  ...
)
A matrix of distances with refnn + compnn columns named by the nearest neighbors of kan (first in terms of the reference distance, then the other distances) and 1 + length(compdist) rows named by the type of distance.
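The layout of the returned matrix can be illustrated with a toy sketch in base R. Everything in it is an assumption made for illustration: the symbols A to E stand in for kanji, the two distance matrices are made up, and nearest() is a hypothetical helper, not a kanjistat function. It only mirrors the row and column arrangement described above and says nothing about how the package computes or orders its results internally.

```r
# Toy sketch in base R; symbols and distance values are made up.
syms <- c("A", "B", "C", "D", "E")
kan <- "A"

# Hypothetical symmetric distance matrices standing in for the two distances.
refmat <- matrix(c(0, 1, 4, 2, 5,
                   1, 0, 3, 6, 2,
                   4, 3, 0, 1, 7,
                   2, 6, 1, 0, 3,
                   5, 2, 7, 3, 0),
                 nrow = 5, byrow = TRUE, dimnames = list(syms, syms))
compmat <- matrix(c(0.0, 0.9, 0.2, 0.5, 0.1,
                    0.9, 0.0, 0.4, 0.8, 0.3,
                    0.2, 0.4, 0.0, 0.6, 0.7,
                    0.5, 0.8, 0.6, 0.0, 0.2,
                    0.1, 0.3, 0.7, 0.2, 0.0),
                  nrow = 5, byrow = TRUE, dimnames = list(syms, syms))

# Hypothetical helper: names of the n nearest neighbors of x under dmat.
nearest <- function(dmat, x, n) {
  d <- dmat[x, colnames(dmat) != x]
  names(sort(d))[seq_len(n)]
}

refnn <- 2
compnn <- 1

# Neighbors by the reference distance first, then by the other distance.
nbrs <- c(nearest(refmat, kan, refnn), nearest(compmat, kan, compnn))

# One row per type of distance, one column per neighbor.
res <- rbind(strokedit = refmat[kan, nbrs],
             kanjidist = compmat[kan, nbrs])
print(res)
```

Here res has refnn + compnn = 3 columns and 1 + length(compdist) = 2 rows. The sketch does not address what happens when the same neighbor is selected by both distances.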
kan: a kanji (currently only as a single UTF-8 character).
refdist: the name of the reference distance (currently only "strokedit").
refnn: the number of nearest neighbors in terms of the reference distance.
compdist: a character vector. The name(s) of one or several other distances to compare with (currently only "kanjidist").
compnn: the number of nearest neighbors in terms of the other distance(s). If this is positive it is assumed that the suggested package kanjistat.data is available.
...: further parameters that are passed to kanjidist().
This is only a first draft of the function, and its interface and details may change considerably in the future.
As there is currently no precomputed kanjidist matrix, there is a huge difference in computation time between setting compnn = 0 (only kanji distances to the refnn nearest neighbors in terms of refdist have to be computed) and setting compnn to any value > 0 (kanji distances to all 2135 other Jouyou kanji have to be computed in order to determine the compnn nearest neighbors; depending on the system and parameter settings this can take (roughly) anywhere between 2 minutes and an hour).
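The difference in workload can be made concrete with a small back-of-the-envelope sketch in base R. The helper n_kanjidist_evals() is hypothetical and not part of kanjistat; it simply counts kanji distance evaluations as described above. The per-evaluation times are derived from the quoted range of 2 minutes to an hour under the assumption that the total time scales linearly with the number of evaluations, which is a simplification.

```r
# Hypothetical helper, not part of kanjistat: number of kanji distance
# evaluations implied by the description above.
n_kanjidist_evals <- function(refnn, compnn, n_other = 2135) {
  if (compnn > 0) n_other else refnn
}

evals_fast <- n_kanjidist_evals(refnn = 10, compnn = 0)  # only the refnn neighbors
evals_full <- n_kanjidist_evals(refnn = 10, compnn = 5)  # all other Jouyou kanji

# Rough seconds per evaluation, assuming the quoted total of 2 minutes to
# 1 hour is spread evenly over all evaluations of the full run.
secs_per_eval <- c(low = 2 * 60, high = 60 * 60) / evals_full

# Implied rough time range (in seconds) for the compnn = 0 case.
print(evals_fast * secs_per_eval)
```

Under these assumptions the compnn = 0 case needs about 1/200 of the evaluations of the full run when refnn = 10, which is why it finishes in seconds rather than minutes.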
# compare_neighborhoods("\u6674", refnn=5, compo_seg_depth=4, approx="pcweighted",
# compnn=0, minor_warnings=FALSE)