stringi (version 1.0-1)

stri_stats_general: General Statistics for a Character Vector

Description

This function gives general statistics for a character vector, e.g., one obtained by loading a text file with the readLines or stri_read_lines function, where each text line is represented by a separate string.

Usage

stri_stats_general(str)

Arguments

str
character vector to be aggregated

Value

  • Returns an integer vector with the following named elements:
    1. Lines - number of lines (number of non-missing strings in the vector);
    2. LinesNEmpty - number of lines with at least one non-WHITE_SPACE character;
    3. Chars - total number of Unicode code points detected;
    4. CharsNWhite - number of Unicode code points that are not WHITE_SPACEs;
    5. ... (Other stuff that may appear in future releases of stringi).
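
As a minimal sketch of how the returned vector might be consumed, the snippet below looks elements up by name. The input vector x and the derived blank_lines value are illustrative assumptions, not part of the package.

```r
library(stringi)

# Hypothetical sample input: one ordinary line, one empty string,
# one whitespace-only string, and one missing value.
x <- c("Lorem ipsum", "", "  ", NA)

stats <- stri_stats_general(x)

# Look elements up by name rather than by position,
# since further elements may be added in future releases.
n_lines    <- stats[["Lines"]]        # non-missing strings
n_nonempty <- stats[["LinesNEmpty"]]  # lines with a non-white-space character

# Derived value (not returned by the function): empty or white-space-only lines.
blank_lines <- n_lines - n_nonempty
```

Indexing by name keeps such code working even if the order or number of elements changes in a later release.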

Details

No string may contain \r or \n characters; otherwise you will get an error.

By `white space` we mean the Unicode binary property WHITE_SPACE; see stringi-search-charclass.
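
When text arrives as a single string with embedded line breaks, one possible approach is to split it into lines first. The sketch below assumes stri_split_lines1 from the same package is a suitable splitter; the variable names and sample text are illustrative.

```r
library(stringi)

# Hypothetical input: a single string with embedded line breaks.
raw <- "first line\nsecond line\r\nthird"

# Passing it directly is expected to fail, as described above;
# the error message is captured instead of stopping the script.
err <- tryCatch(stri_stats_general(raw),
                error = function(e) conditionMessage(e))

# Split into one string per text line, then aggregate.
lines <- stri_split_lines1(raw)
stats <- stri_stats_general(lines)
```

Text read with readLines or stri_read_lines is already split this way, so this step should only be needed for strings built or received in one piece.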

See Also

Other stats: stri_stats_latex

Examples

s <- c("Lorem ipsum dolor sit amet, consectetur adipisicing elit.",
       "nibh augue, suscipit a, scelerisque sed, lacinia in, mi.",
       "Cras vel lorem. Etiam pellentesque aliquet tellus.",
       "")
stri_stats_general(s)
