# Strings


First let's load the library:

`library(filesstrings)`

## The *n*^th^ number in a string

I often want to get the first, last or *n*^th^ number in a string.

```
pop <- "A population of 1000 comprised of 488 dogs and 512 cats."
first_number(pop)
nth_number(pop, 2)
last_number(pop)
```
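If you're curious how this kind of lookup might work, here's a rough base R sketch. `nth_int_base()` is a hypothetical helper made up for illustration; it is not part of `filesstrings`, it only handles runs of digits (no decimals or signs), and it takes a single string.

```r
# Hypothetical helper (not part of filesstrings): n-th run of digits in a string.
nth_int_base <- function(string, n) {
  # gregexpr() locates every run of digits; regmatches() extracts them
  digits <- regmatches(string, gregexpr("[0-9]+", string))[[1]]
  # a negative n counts back from the end, so -1 is the last number
  if (n < 0) n <- length(digits) + n + 1
  as.integer(digits[n])
}

pop <- "A population of 1000 comprised of 488 dogs and 512 cats."
nth_int_base(pop, 1)   # first number
nth_int_base(pop, 2)   # second number
nth_int_base(pop, -1)  # last number
```

The package functions are the better choice in practice; the sketch is only meant to show the idea.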

## All the numbers in a string

`extract_numbers(pop)`

## All the non-numbers in a string

`extract_non_numerics(pop)`

## Split strings by numbers

`str_split_by_nums(pop)`
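For comparison, base R's `strsplit()` can also split on runs of digits, but it throws the numbers away and keeps only the text between them. This is a sketch of that alternative, not a description of what `str_split_by_nums()` returns.

```r
pop <- "A population of 1000 comprised of 488 dogs and 512 cats."

# Split wherever a run of digits occurs; the digits themselves are dropped.
pieces <- strsplit(pop, "[0-9]+")[[1]]
pieces
```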

## Could that be interpreted as numeric?

Sometimes we don't want to know if something *is* numeric; we want to know if it could be considered numeric (or could be coerced to numeric). For this, there's `can_be_numeric()`.

```
is.numeric(23)
is.numeric("23")
can_be_numeric(23)
can_be_numeric("23")
can_be_numeric("23a")
```
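One way to think about "could be coerced to numeric" is to try the coercion and see whether it fails. The sketch below does that in base R. `could_be_numeric()` is a hypothetical name, and I'm not claiming this is how `can_be_numeric()` is implemented.

```r
# Hypothetical helper (not part of filesstrings).
could_be_numeric <- function(x) {
  # as.numeric() yields NA (with a warning) when coercion fails,
  # so a non-NA result means the coercion worked
  !is.na(suppressWarnings(as.numeric(x)))
}

could_be_numeric(23)
could_be_numeric("23")
could_be_numeric("23a")
```

Note that under this definition an `NA` input counts as not numeric, which may or may not be what you want.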

## Get the *n*^th^ element of a string

```
str_elem("abc", 2)
str_elem("abcdefz", -1) # last element
```
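The negative index counts back from the end of the string. A minimal base R sketch of that idea, using a hypothetical helper `elem_base()` that works on one string at a time, could look like this:

```r
# Hypothetical helper (not part of filesstrings).
elem_base <- function(string, index) {
  # turn a negative index into a position counted from the end
  if (index < 0) index <- nchar(string) + index + 1
  substr(string, index, index)
}

elem_base("abc", 2)
elem_base("abcdefz", -1)  # last element
```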

## Trim anything (not just whitespace)

`stringr`'s `str_trim` just trims whitespace. What if you want to trim something else? Now you can `trim_anything()`.

`trim_anything("__rmarkdown_", "_")`
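To see what trimming amounts to, here's a base R sketch that strips repeats of a literal string from both ends. `trim_literal()` is a hypothetical helper, and it assumes the thing being trimmed contains no regular expression metacharacters (so `"_"` is fine but `"."` would need escaping).

```r
# Hypothetical helper (not part of filesstrings).
trim_literal <- function(string, char) {
  # assumption: `char` contains no regex metacharacters
  group <- paste0("(", char, ")+")
  string <- sub(paste0("^", group), "", string)  # strip leading repeats
  sub(paste0(group, "$"), "", string)            # strip trailing repeats
}

trim_literal("__rmarkdown_", "_")
```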

## Count the number of matches of a pattern in a string

```
count_matches(pop, " ") # count the spaces in pop
count_matches("Bob and Joe went to see Bob's mother.", "Bob")
```
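Counting matches can be approximated in base R by locating every match and counting them. The sketch below uses a hypothetical helper `count_fixed()` that treats the pattern as literal text; I'm assuming nothing about how `count_matches()` itself interprets patterns.

```r
# Hypothetical helper (not part of filesstrings).
count_fixed <- function(string, pattern) {
  # fixed = TRUE treats `pattern` as literal text, not a regular expression
  hits <- gregexpr(pattern, string, fixed = TRUE)
  # regmatches() returns one vector of matches per input string
  lengths(regmatches(string, hits))
}

count_fixed("Bob and Joe went to see Bob's mother.", "Bob")
count_fixed("Bob and Joe went to see Bob's mother.", "Alice")
```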

## Turn duplicates of a pattern into singles

Suppose we want to remove double spacing:

```
double__spaced <- "Hello  world,  pretend  it's  Saturday  :-)"
count_matches(double__spaced, " ") # count the spaces
single_spaced <- singleize(double__spaced, pattern = " ")
single_spaced
count_matches(single_spaced, " ") # half the spaces are gone
```
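The same effect can be sketched in base R with a single `gsub()` that replaces any run of the pattern with one copy of it. `squeeze_literal()` is a hypothetical helper and assumes the pattern has no regular expression metacharacters.

```r
# Hypothetical helper (not part of filesstrings).
squeeze_literal <- function(string, pattern) {
  # assumption: `pattern` has no regex metacharacters
  # "(pattern)+" matches one or more consecutive copies of the pattern
  gsub(paste0("(", pattern, ")+"), pattern, string)
}

squeeze_literal("a  b   c", " ")
```

Unlike halving, this collapses runs of any length, so triple spaces become single spaces too.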

## The bit of a string after the *n*^th^ appearance of a pattern

Suppose we have sentences telling us about a couple of boxes:

```
box_infos <- c("Box 1 has weight 23kg and volume 0.3 cubic metres.",
               "Box 2 has weight 20kg and volume 0.33 cubic metres.")
```

We can get (for example) the weights of the boxes by taking the first number that appears after the word "weight".

```
library(magrittr)
str_after_nth(box_infos, "weight", 1) # the bit of the string after 1st "weight"
str_after_nth(box_infos, "weight", 1) %>% nth_number(1) # 1st number after 1st "weight"
```
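To make "the bit after the *n*^th^ appearance" concrete, here's a base R sketch that finds the *n*^th^ literal match and keeps everything after it. `after_nth_base()` is a hypothetical helper that handles one string at a time and returns `NA` when there are fewer than `n` matches; the package function may behave differently in those edge cases.

```r
# Hypothetical helper (not part of filesstrings).
after_nth_base <- function(string, pattern, n) {
  # start positions of every literal match (-1 if there are none)
  hits <- gregexpr(pattern, string, fixed = TRUE)[[1]]
  if (hits[1] == -1 || n > length(hits)) return(NA_character_)
  # first character after the n-th match
  start <- hits[n] + attr(hits, "match.length")[n]
  substr(string, start, nchar(string))
}

box <- "Box 1 has weight 23kg and volume 0.3 cubic metres."
after_nth_base(box, "weight", 1)
after_nth_base(box, "weight", 2)  # only one "weight", so NA
```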

We'd like to put all of the box information into a nice data frame. Here's how.

```
tibble::tibble(
  box = nth_number(box_infos, 1),
  weight = str_after_nth(box_infos, "weight", 1) %>%
    nth_number(1, decimals = TRUE),
  volume = str_after_nth(box_infos, "volume", 1) %>%
    nth_number(1, decimals = TRUE)
)
```

## Split camel case

Sometimes people use camel case (CamelCase) to avoid using spaces. What if we want to put the spaces back in?

```
camel_names <- c("JoeBloggs", "JaneyMac")
str_split_camel_case(camel_names)
```
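If all you want is the spaces back, a base R sketch is to insert a space wherever a lowercase letter is immediately followed by an uppercase one. `space_camel()` is a hypothetical helper; it returns strings with spaces inserted, which is not necessarily the same shape of output as `str_split_camel_case()`, and it only considers ASCII letters.

```r
# Hypothetical helper (not part of filesstrings).
space_camel <- function(string) {
  # perl = TRUE so that [a-z] and [A-Z] are plain ASCII ranges
  gsub("([a-z])([A-Z])", "\\1 \\2", string, perl = TRUE)
}

camel_names <- c("JoeBloggs", "JaneyMac")
space_camel(camel_names)
```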