This function finds the indices of elements in a character vector that partially match or are similar to a given string. It can be used to find exact or slightly mistyped elements in a string vector.
str_pos(search.string, find.term, maxdist = 2, part.dist.match = 0,
show.pbar = FALSE)
search.string: Character vector with string elements.
find.term: String that should be matched against the elements of search.string.
maxdist: Maximum distance between two strings at which they are still treated as similar or equal. Smaller values mean less tolerance in matching.
part.dist.match: Activates similar matching (close distance strings) for parts (substrings) of search.string. The following values are accepted:
0 for no partial distance matching
1 for one-step matching, which means that only substrings of the same length as find.term are extracted from search.string for matching
2 for two-step matching, which means that substrings of the same length as find.term, as well as strings with a slightly wider range, are extracted from search.string for matching
Default value is 0. See 'Details' for more information.
show.pbar: Logical; if TRUE, the progress bar is displayed when computing the distance matrix. Default is FALSE, hence the bar is hidden.
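To get a feeling for what a maximum distance of 2 tolerates, the following minimal sketch computes edit distances with base R's utils::adist (generalized Levenshtein distance). This is only a stand-in chosen for illustration: the distance measure used internally by str_pos may differ, for instance in how swapped neighbouring characters are counted.

```r
# Base R stand-in for the distance computation (an assumption, not the
# function's actual implementation).
string <- c("Hello", "Helo", "Hole", "Apple", "Ape", "New", "Old", "System", "Systemic")

# edit distance between the search term and each (lower-cased) element
d <- drop(utils::adist("saste", tolower(string)))
names(d) <- string
d

# elements that lie within a tolerance of 2 edits
which(d <= 2)
```

Here only "System" is within two edits of "saste" (one substitution and one insertion), while "Systemic" is too far away when whole strings are compared.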
A numeric vector with the index positions of the elements in search.string that partially match or are similar to find.term. Returns -1 if no match was found.
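Because -1 is also a valid negative index in R, using a "no match" result directly for subsetting silently drops the first element instead of returning nothing. The sketch below shows one way to guard against this; matched_elements is a hypothetical helper written for this illustration and is not part of the package.

```r
# Hypothetical helper (not part of the package): turn an index result that
# uses -1 as its "no match" sentinel into the matching elements.
matched_elements <- function(x, idx) {
  # a single -1 means "nothing found", so return an empty vector
  if (length(idx) == 1 && idx == -1) return(x[0])
  x[idx]
}

string <- c("Hello", "Helo", "Hole")

string[-1]                          # pitfall: drops "Hello", keeps the rest
matched_elements(string, -1)        # empty character vector
matched_elements(string, c(1, 2))   # "Hello" "Helo"
```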
For part.dist.match = 1, substrings with the same number of characters as find.term are extracted from search.string, starting at the first character of search.string and moving on until the end of search.string is reached. Each substring is matched against find.term, and results with a maximum distance of maxdist are considered as "matching". If part.dist.match = 2, the range of the extracted substring is increased by 2, i.e. the extracted substring is two chars longer, and so on.
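The substring extraction described above can be sketched as a sliding window. The helper min_window_dist and its extra argument are hypothetical names introduced for this illustration, and utils::adist is used as a base R stand-in for the distance measure, so this is a rough approximation of the idea and not the function's actual implementation.

```r
# Hypothetical helper, not part of any package: slide a window of
# nchar(term) + extra characters over `text` and return the smallest
# edit distance between `term` and any window.
min_window_dist <- function(text, term, extra = 0) {
  text <- tolower(text)
  term <- tolower(term)
  width <- nchar(term) + extra
  # if the text is not longer than the window, compare against the whole text
  if (nchar(text) <= width) return(drop(utils::adist(term, text)))
  starts <- seq_len(nchar(text) - width + 1)
  windows <- substring(text, starts, starts + width - 1)
  min(utils::adist(term, windows))
}

# distance to the whole string: far above a maxdist of 2
drop(utils::adist("postils", tolower("We are Sex Pistols!")))

# best window of the same length as the term ("pistols"): distance 2
min_window_dist("We are Sex Pistols!", "postils")

# windows that are two characters wider, as in two-step matching
min_window_dist("We are Sex Pistols!", "postils", extra = 2)
```

With whole-string comparison the term is too far from the sentence to count as a match, whereas the window "pistols" differs from "postils" by only two substitutions and therefore falls within a maxdist of 2.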
# NOT RUN {
string <- c("Hello", "Helo", "Hole", "Apple", "Ape", "New", "Old", "System", "Systemic")
str_pos(string, "hel") # partial match
str_pos(string, "stem") # partial match
str_pos(string, "R") # no match
str_pos(string, "saste") # similarity to "System"
# finds two indices, because partial matching now
# also applies to "Systemic"
str_pos(string,
"sytsme",
part.dist.match = 1)
# finds nothing
str_pos("We are Sex Pistols!", "postils")
# finds partial matching of similarity
str_pos("We are Sex Pistols!", "postils", part.dist.match = 1)
# }