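jiebaR: Chinese Text Segmentation

jiebaR is the R version of the "Jieba" (结巴) Chinese word segmenter. It supports several segmentation modes and also provides part-of-speech tagging, keyword extraction, and Simhash-based text similarity comparison. The project is built with Rcpp and CppJieba.

To convert cell dictionaries (细胞词库), use the cidian package: https://github.com/qinwf/cidian/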

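Features

Supports Windows, Linux, and Mac.
Uses Rcpp to load several segmentation engines at the same time, each with its own segmentation mode and dictionary.
Supports multiple segmentation modes, Chinese name recognition, keyword extraction, part-of-speech tagging, and Simhash-based text similarity comparison.
Supports loading custom user dictionaries, including word frequency and part-of-speech settings.
Supports both Simplified and Traditional Chinese.
Detects text encoding automatically.
Faster than the original "Jieba" segmenter, and 5 to 20 times as fast as other R segmentation packages.
Simple to install, with no complicated setup.
Can be called from other languages through Rpy2, jvmr, and similar bridges.
Released under the MIT license.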
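
As a rough illustration of loading several engines side by side, the sketch below creates three independent workers and runs segmentation, part-of-speech tagging, and keyword extraction on the same sentence. It assumes jiebaR is installed with its bundled default dictionaries; the sample sentence and the variable names are made up for this example.

```r
library(jiebaR)

# Illustrative sample text; any UTF-8 Chinese string should work.
text <- "我是一名程序员，我喜欢用R语言做文本分析"

# Each worker() call builds an independent engine, so several can coexist.
seg_worker <- worker()                             # default "mix" segmentation
tag_worker <- worker(type = "tag")                 # part-of-speech tagging
key_worker <- worker(type = "keywords", topn = 3)  # keyword extraction

words    <- segment(text, seg_worker)   # character vector of tokens
tagged   <- tagging(text, tag_worker)   # tokens, with the tags as names
top_keys <- keywords(text, key_worker)  # at most topn keywords

print(words)
print(tagged)
print(top_keys)
```

Keeping one worker per task means each can be configured separately, for example with a different dictionary or a different `topn`.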

Installation

Install from CRAN:

install.packages("jiebaR")
library("jiebaR")

cc = worker()
cc["这是一个测试"] # or segment("这是一个测试", cc)

# [1] "这是" "一个" "测试"

The development version can also be installed from GitHub. Compiling with gcc >= 4.9 is recommended, and Windows requires Rtools:

library(devtools)
install_github("qinwf/jiebaRD")
install_github("qinwf/jiebaR")
library("jiebaR")

Usage Guide and Demos

Usage guide: http://qinwenfeng.com/jiebaR/

Documentation in progress: https://jiebaR.qinwf.com/

Shiny demo: https://qinwf.shinyapps.io/jiebaR-shiny/

Cell dictionary conversion: https://github.com/qinwf/cidian/

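Questions and Issues

If you run into any problem while using the package, you can:

Read the usage guide at http://qinwenfeng.com/jiebaR/ , where you can also leave comments in the documentation.
Send an email to the user mailing list at jiebaR@googlegroups.com
Visit https://groups.google.com/d/forum/jiebaR
Open an issue on GitHub.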

jiebaR

This is a package for Chinese text segmentation, keyword extraction, and part-of-speech tagging. jiebaR supports four segmentation modes: Maximum Probability, Hidden Markov Model, Query Segment, and Mix Segment.
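
To compare the four modes on one input, the following sketch builds a worker for each and segments the same sentence. It assumes the `type` codes `"mp"`, `"hmm"`, `"query"`, and `"mix"` of `worker()` correspond to the modes named above; the sample sentence and the `mode_names` lookup are illustrative only.

```r
library(jiebaR)

# Illustrative sample text.
text <- "江州市长江大桥参加了长江大桥的通车仪式"

# worker() type codes, labelled with the mode each is assumed to select.
mode_names <- c(
  mp    = "Maximum Probability",
  hmm   = "Hidden Markov Model",
  query = "Query Segment",
  mix   = "Mix Segment"
)

# Build one worker per mode and segment the same sentence with each.
results <- lapply(names(mode_names), function(type) {
  segment(text, worker(type = type))
})
names(results) <- names(mode_names)

# Print the tokens produced by each mode for a side-by-side comparison.
for (type in names(results)) {
  cat(mode_names[[type]], ":", paste(results[[type]], collapse = " / "), "\n")
}
```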

Features

Supports Windows, Linux, and Mac.
Uses Rcpp to load several segmentation workers at the same time.
Supports Chinese text segmentation, keyword extraction, part-of-speech tagging, and Simhash computation.
Custom dictionary path.
Supports Simplified Chinese and Traditional Chinese.
New word identification.
Auto encoding detection.
Fast text segmentation.
Easy installation.
MIT license.
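
The Simhash feature can be sketched as follows, assuming a worker of type `"simhash"` together with the `simhash()` and `distance()` functions; the two sample sentences, the `topn` value, and the variable names are chosen only for illustration.

```r
library(jiebaR)

# Illustrative sample texts.
text_a <- "江州市长江大桥参加了长江大桥的通车仪式"
text_b <- "江州市长参加了长江大桥的开工仪式"

# A simhash worker fingerprints a text from its top keywords.
sim_worker <- worker(type = "simhash", topn = 2)

# simhash() returns a list holding the fingerprint and the keywords used.
hash_a <- simhash(text_a, sim_worker)

# distance() compares the fingerprints of two texts (Hamming distance).
cmp <- distance(text_a, text_b, sim_worker)

print(hash_a$simhash)
print(cmp$distance)
```

A smaller distance suggests that the two texts share more of their highest-weighted keywords.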

Installation

Install the latest development version from GitHub:

devtools::install_github("qinwf/jiebaR")

Install from CRAN:

install.packages("jiebaR")
