rm_time: Remove/Replace/Extract Time

Description

rm_time - Remove/replace/extract time from a string.

rm_transcript_time - Remove/replace/extract transcript-specific time stamps from a string.

as_time - Convert a time stamp extracted by rm_time or rm_transcript_time to a standard time format (HH:MM:SS.OS) and optionally convert to as.POSIXlt.

as_time2 - A convenience function for as_time that unlists and returns a vector rather than a list.

Usage

rm_time(text.var, trim = !extract, clean = TRUE, pattern = "@rm_time",
  replacement = "", extract = FALSE,
  dictionary = getOption("regex.library"), ...)

rm_transcript_time(text.var, trim = !extract, clean = TRUE,
  pattern = "@rm_transcript_time", replacement = "", extract = FALSE,
  dictionary = getOption("regex.library"), ...)

as_time(x, as.POSIXlt = FALSE, millisecond = TRUE)

as_time2(x, ...)

Arguments

text.var

The text variable.

trim

logical. If TRUE removes leading and trailing white spaces.

clean

logical. If TRUE extra white spaces and escaped characters will be removed.

pattern

A character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector (see Details for additional information). By default, "@rm_time" uses the rm_time regex from the regular expression dictionary supplied via the dictionary argument.

replacement

Replacement for matched pattern.

extract

logical.  If TRUE the times are extracted into a
list of vectors.

dictionary

A dictionary of canned regular expressions to search within
if pattern begins with "@rm_".

x

A list with extracted time stamps.

as.POSIXlt

logical.  If TRUE the output will be converted to
as.POSIXlt.

millisecond

logical.  If TRUE milliseconds are retained.  If
FALSE they are rounded and added to seconds.

...

Other arguments passed to gsub.
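
The millisecond argument and the HH:MM:SS.OS target format can be pictured with a small base R sketch. The helper to_hms below is hypothetical and is not the package's implementation; it assumes that a two-part stamp means minutes:seconds, as in the transcript examples, and it does not carry rounded seconds over into minutes.

```r
# Hypothetical helper (not part of the package): normalise "MM:SS(.s)" or
# "HH:MM:SS(.s)" stamps to a zero-padded HH:MM:SS form.
to_hms <- function(stamp, millisecond = TRUE) {
  parts <- as.numeric(strsplit(stamp, ":", fixed = TRUE)[[1]])
  # Left-pad with zeros so a two-part stamp is read as minutes:seconds.
  parts <- c(rep(0, 3 - length(parts)), parts)
  if (millisecond) {
    # Keep fractional seconds (three decimal places in this sketch).
    sprintf("%02d:%02d:%06.3f",
            as.integer(parts[1]), as.integer(parts[2]), parts[3])
  } else {
    # Round fractional seconds into whole seconds (no carry into minutes).
    sprintf("%02d:%02d:%02d",
            as.integer(parts[1]), as.integer(parts[2]),
            as.integer(round(parts[3])))
  }
}

to_hms("08:15")                          # "00:08:15.000"
to_hms("01:22:30")                       # "01:22:30.000"
to_hms("09:33.75", millisecond = FALSE)  # "00:09:34"
```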

Value

Returns a character string with time removed or replaced. If extract = TRUE, the times are instead returned as a list of vectors.
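
The two return shapes can be illustrated with base R alone. This is only a sketch: time_rx is a simplified, hypothetical pattern rather than the regex stored under "@rm_time", and base R's handling of elements with no match (an empty vector) may differ from the package's.

```r
# Simplified, hypothetical time pattern: H:MM or HH:MM, optionally :SS.
# It is NOT the regex stored under "@rm_time".
time_rx <- "\\d{1,2}:\\d{2}(:\\d{2})?"

x <- c("At 3:00 we'll meet up and leave by 4:30:20", "no time here")

# Removal: one character string per input element.
removed <- gsub(time_rx, "", x, perl = TRUE)
# Collapse repeated spaces and trim the ends, roughly what clean/trim describe.
removed <- trimws(gsub("\\s+", " ", removed))
removed
# "At we'll meet up and leave by" "no time here"

# Extraction: a list with one character vector per input element.
extracted <- regmatches(x, gregexpr(time_rx, x, perl = TRUE))
extracted[[1]]
# "3:00"    "4:30:20"
```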

Details

The default regular expression used by rm_time finds
times with no AM/PM.  This behavior can be altered by using a
secondary regular expression from the regex_usa
data (or other dictionary) via pattern = "@rm_time2". See
Examples for example usage.
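
The difference between a pattern without AM/PM and one that includes it can be sketched in base R. Both patterns below are hypothetical stand-ins written for this illustration; they are not the dictionary entries behind "@rm_time" and "@rm_time2".

```r
# Hypothetical patterns for illustration only.
time_rx      <- "\\d{1,2}:\\d{2}"
time_ampm_rx <- "\\d{1,2}:\\d{2}\\s?[AaPp]\\.?[Mm]\\.?"

x <- c("I'm getting 3:04 AM just fine",
       "Some time has 12:04 a.m. or the form 1:22 pm")

# Without the AM/PM suffix only the clock digits are captured.
plain <- regmatches(x, gregexpr(time_rx, x, perl = TRUE))
plain[[2]]
# "12:04" "1:22"

# With the suffix in the pattern the AM/PM marker is kept with each time.
ampm <- regmatches(x, gregexpr(time_ampm_rx, x, perl = TRUE))
ampm[[2]]
# "12:04 a.m." "1:22 pm"
```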

References

The time regular expression was taken from:
http://stackoverflow.com/a/25111133/1000343

See Also

gsub,
stri_extract_all_regex

Other rm_.functions: as_numeric,
  as_numeric2, rm_number;
  rm_abbreviation; rm_angle,
  rm_bracket,
  rm_bracket_multiple,
  rm_curly, rm_round,
  rm_square; rm_between,
  rm_between_multiple;
  rm_caps_phrase; rm_caps;
  rm_citation_tex; rm_citation;
  rm_city_state_zip;
  rm_city_state; rm_date;
  rm_default; rm_dollar;
  rm_email; rm_emoticon;
  rm_endmark; rm_hash;
  rm_nchar_words; rm_non_ascii;
  rm_non_words; rm_percent;
  rm_phone; rm_postal_code;
  rm_repeated_characters;
  rm_repeated_phrases;
  rm_repeated_words; rm_tag;
  rm_title_name;
  rm_twitter_url, rm_url;
  rm_white, rm_white_bracket,
  rm_white_colon,
  rm_white_comma,
  rm_white_endmark,
  rm_white_lead,
  rm_white_lead_trail,
  rm_white_multiple,
  rm_white_punctuation,
  rm_white_trail; rm_zip

Examples

x <- c("R uses 1:5 for 1, 2, 3, 4, 5.",
    "At 3:00 we'll meet up and leave by 4:30:20",
    "We'll meet at 6:33.", "He ran it in :22.34")

rm_time(x)
rm_time(x, extract=TRUE)

## With AM/PM
x <- c(
    "I'm getting 3:04 AM just fine, but...",
    "for 10:47 AM I'm getting 0:47 AM instead.",
    "no time here",
    "Some time has 12:04 with no AM/PM after it",
    "Some time has 12:04 a.m. or the form 1:22 pm"
)

rm_time(x, extract=TRUE)
rm_time(x, pattern="@rm_time2", extract=TRUE)
rm_time(x, pattern="@rm_time2")
rm_time(x, pattern=pastex("@rm_time2", "@rm_time"), extract=TRUE)

# Convert to standard format
as_time(rm_time(x, extract=TRUE))
as_time(rm_time(x, extract=TRUE), as.POSIXlt = TRUE)
as_time(rm_time(x, extract=TRUE), as.POSIXlt = FALSE, millisecond = FALSE)

# Transcript specific time stamps
x2 <- c(
    '08:15 8 minutes and 15 seconds	00:08:15.0',
    '3:15 3 minutes and 15 seconds	not 1:03:15.0',
    '01:22:30 1 hour 22 minutes and 30 seconds	01:22:30.0',
    '#00:09:33-5# 9 minutes and 33.5 seconds	00:09:33.5',
    '00:09.33,75 9 minutes and 33.75 seconds	00:09:33.75'
)

rm_transcript_time(x2)
(out <- rm_transcript_time(x2, extract=TRUE))

as_time(out)
as_time(out, as.POSIXlt = TRUE)
as_time(out, millisecond = FALSE)

if (!require("pacman")) install.packages("pacman")
pacman::p_load(chron)
lapply(as_time(out), chron::times)
lapply(as_time(out, millisecond = FALSE), chron::times)