# Marek Gagolewski

#### 8 packages on CRAN

Allows for fast, correct, consistent, portable, as well as convenient character string/text processing in every locale and any native encoding. Owing to the use of the 'ICU' library, the package provides 'R' users with platform-independent functions known to 'Java', 'Perl', 'Python', 'PHP', and 'Ruby' programmers. Available features include: pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, Unicode normalization, date-time formatting and parsing, and many more.

S4 classes and methods to deal with fuzzy numbers. They allow for computing any arithmetic operations (e.g., by using the Zadeh extension principle), performing approximation of arbitrary fuzzy numbers by trapezoidal and piecewise linear ones, preparing plots for publications, computing possibility and necessity values for comparisons, etc.

Tools supporting multi-criteria decision making, including variable number of criteria, by means of aggregation operators and preordered sets. Possible applications include, but are not limited to, scientometrics and bibliometrics.

Supports quantitative research in scientometrics and bibliometrics. Provides various tools for preprocessing bibliographic data retrieved, e.g., from Elsevier's SciVerse Scopus, computing bibliometric impact of individuals, or modeling many phenomena encountered in the social sciences.

A new hierarchical clustering linkage criterion: the Genie algorithm links two clusters in such a way that a chosen economic inequity measure (e.g., the Gini index) of the cluster sizes does not increase drastically above a given threshold. Benchmarks indicate a high practical usefulness of the introduced method: it most often outperforms the Ward or average linkage in terms of the clustering quality while retaining the single linkage speed, see (Gagolewski et al. 2016a <DOI:10.1016/j.ins.2016.05.003>, 2016b <DOI:10.1007/978-3-319-45656-0_16>) for more details.

An implementation of turtle graphics <http://en.wikipedia.org/wiki/Turtle_graphics>. Turtle graphics comes from Papert's language Logo and has been used to teach concepts of computer programming.

RE2 <https://github.com/google/re2> is a primarily deterministic finite automaton based regular expression engine from Google that is very fast at matching large amounts of text.

An Implementation of a novel method to determine similarity of R functions based on program dependence graphs, see Bartoszuk, Gagolewski (2017) <doi:10.1109/FUZZ-IEEE.2017.8015582>. Possible use cases include plagiarism detection among students' homework assignments.