
⚠️ There's a newer version (0.11.1) of this package.

Chinese Text Segmentation

Cell dictionaries (细胞词库) can be converted with the cidian package: https://github.com/qinwf/cidian/

Installation

Install from CRAN:

install.packages("jiebaR")

The development version can also be installed from GitHub. Compiling with gcc >= 4.9 is recommended, and Windows users need to install Rtools.

library(devtools)
install_github("qinwf/jiebaRD")
install_github("qinwf/jiebaR")
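
After either installation route finishes, a quick check like the sketch below can confirm that the package loads and segments text. It assumes the default worker settings and the dictionaries shipped with the package; the sample sentence and variable names are arbitrary illustrations.

```r
library(jiebaR)

# Create a segmentation worker with the package defaults.
seg <- worker()

# Segment an illustrative sentence; the result is a character vector of words.
words <- segment("这是一段用于测试的中文文本", seg)
print(words)
```

If the call returns a character vector of words, the compiled code and the dictionaries were installed correctly.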

User Guide and Demo

[User guide (updated


Monthly Downloads: 134
Version: 0.10.99
License: MIT + file LICENSE
Maintainer: Qin Wenfeng
Last Published: March 29th, 2025

Functions in jiebaR (0.10.99)

keywords: Keyword extraction
<=.keywords: Keywords symbol
get_idf: Generate IDF dict
<=.segment: Text segmentation symbol
tagging: Part-of-speech tagging
simhash_dist: Compute Hamming distance of Simhash value
<=.qseg: Quick mode symbol
segment: Chinese text segmentation function
new_user_word: Add user word
print.inv: Print worker settings
tobin: Simhash value to binary
vector_tag: Tag a character vector
simhash: Simhash computation
show_dictpath: Show default path of dictionaries
get_qsegmodel: Set quick mode model
worker: Initialize jiebaR worker
<=.simhash: Simhash symbol
<=.tagger: Tagger symbol
apply_list: Apply list input to a worker
edit_dict: Edit default user dictionary
filter_segment: Filter segmentation result
file_coding: Files encoding detection
distance: Hamming distance of words
freq: The frequency of words
get_tuple: Get tuple from the segmentation result
DICTPATH: The path of dictionary
jiebaR: A package for Chinese text segmentation
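
Several of the functions listed above are typically used together through a worker object. The following is a minimal sketch rather than a complete tour. It assumes the default dictionaries shipped with the package; the sample text, the filtered word, and all variable names are illustrative only.

```r
library(jiebaR)

text <- "我爱北京天安门"

# Segmentation: segment() and the `<=` symbol are two ways to call a worker.
seg <- worker()
words <- segment(text, seg)
same_words <- seg <= text

# Count how often each word occurs in the segmentation result.
word_freq <- freq(words)

# Remove unwanted words from the segmentation result.
filtered <- filter_segment(words, c("我"))

# Part-of-speech tagging uses a worker of type "tag".
tagger <- worker("tag")
tagged <- tagging(text, tagger)

# Keyword extraction uses a worker of type "keywords"; topn limits the count.
extractor <- worker("keywords", topn = 2)
top_keywords <- keywords(text, extractor)
```

Each task gets its own worker because the worker type, set when `worker()` is called, determines which engine and dictionaries are loaded.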