This is a parallelized version of the MeCab part-of-speech tagger. The function takes a
character vector of any length and runs a loop inside C++ with Intel TBB to provide faster
processing.
RcppParallel does not support parallelizing over a character vector directly,
so this function makes copies of both the input and the output.
If your data volume is large, this extra memory use can be significant; in that case,
use pos instead, or divide the vector into several sub-vectors and process them one at a time.
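One way to divide a large input into sub-vectors is sketched below. The helper name tag_in_chunks, its chunk_size default, and the tagger argument are hypothetical and not part of RcppMeCab; the sketch only assumes that the tagger returns one list element per input sentence, as described for this function.

```r
# Hypothetical helper (not part of RcppMeCab): tag a large character
# vector chunk by chunk so that only one chunk is duplicated at a time.
tag_in_chunks <- function(sentences, chunk_size = 1000L,
                          tagger = RcppMeCab::posParallel, ...) {
  stopifnot(is.character(sentences), chunk_size >= 1)
  if (length(sentences) == 0L) return(list())

  # Assign each element a chunk index: 1, 1, ..., 2, 2, ...
  chunk_id <- ceiling(seq_along(sentences) / chunk_size)
  chunks <- split(sentences, chunk_id)

  # Tag one chunk at a time; extra arguments (join, sys_dic, user_dic)
  # are passed through to the tagger unchanged.
  results <- lapply(chunks, tagger, ...)

  # Concatenate the per-chunk lists back into one list, in input order.
  do.call(c, unname(results))
}

# Example call (requires RcppMeCab and a MeCab dictionary):
# tagged <- tag_in_chunks(my_sentences, chunk_size = 5000L)
```

A suitable chunk size depends on available memory and sentence length; the default here is an arbitrary placeholder.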
You can add a user dictionary with user_dic. It must be compiled by
mecab-dict-index. You can find an explanation of how to compile a user
dictionary at https://github.com/junhewk/RcppMeCab.
You can also set a system dictionary in sys_dic, which is especially useful if you are
using multiple dictionaries (for example, both the IPA and Juman dictionaries in Japanese).
Using options(mecabSysDic=), you can set your
preferred system dictionary for the current R session.
By default, the function returns a list of character vectors with (morpheme)/(tag) elements.
If you want to get the morphemes only, use join = FALSE; the tag names are then stored as an attribute of each vector.
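The two output shapes can be compared with a short sketch like the following. It assumes that the function documented here is exported as posParallel, that RcppMeCab is installed, and that MeCab has a working system dictionary; the example sentences and the commented-out dictionary path are placeholders.

```r
# Example input; any character vector works.
sentences <- c("今日は良い天気です。", "明日は雨が降るでしょう。")

# Optionally set a preferred system dictionary for this session.
# The path below is a hypothetical placeholder.
# options(mecabSysDic = "/path/to/your/sysdic")

if (requireNamespace("RcppMeCab", quietly = TRUE)) {
  # Default: each element is a character vector of "morpheme/tag" strings.
  joined <- RcppMeCab::posParallel(sentences)
  str(joined)

  # join = FALSE: morphemes only, with the tags kept as an attribute.
  morphemes <- RcppMeCab::posParallel(sentences, join = FALSE)
  str(attributes(morphemes[[1]]))
}
```

The exact tags returned depend on the dictionary in use, so no expected output is shown.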