Outputs a data frame containing track and field data
tf_parse(
  file,
  avoid = avoid_default,
  typo = typo_default,
  replacement = replacement_default,
  relay_athletes = FALSE,
  rounds = FALSE,
  round_attempts = FALSE,
  split_attempts = FALSE,
  splits = FALSE,
  split_length = 1
)
a .pdf or .html file (could be a URL) containing track and field results. Must be formatted in a "normal" fashion; see the vignette.
a list of strings. Rows in file containing these strings will not be included. For example "Record:", often used to label records, could be passed to avoid. The default is avoid_default, which contains many strings similar to "Record:". Users can supply their own lists to avoid.
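As a rough illustration of the row screening that avoid describes, the base R sketch below drops every line containing one of a set of strings. It is not JumpeR's implementation; the sample lines and the names raw_lines, avoid_strings and kept_lines are hypothetical.

```r
# Hypothetical raw result lines (made up for illustration, not real meet data)
raw_lines <- c(
  "Record: 10.50 Example Athlete",
  "1 Runner One Central High School 11.02",
  "2 Runner Two East High School 11.15"
)

# Strings to screen out, in the spirit of the avoid argument
avoid_strings <- c("Record:")

# For each line, check whether it contains any of the avoid strings.
# fixed = TRUE treats each string literally rather than as a regular expression.
has_avoid <- vapply(
  raw_lines,
  function(line) {
    any(vapply(avoid_strings, grepl, logical(1), x = line, fixed = TRUE))
  },
  logical(1),
  USE.NAMES = FALSE
)

# Keep only the lines that contain none of the avoid strings
kept_lines <- raw_lines[!has_avoid]
```

Matching literally rather than by regular expression is a choice made for this sketch only; how tf_parse matches the strings internally is not stated here.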
a list of strings that are typos in the original results. tf_parse is particularly sensitive to accidental double spaces, so "Central  High School", with two spaces between "Central" and "High", is a problem, which can be fixed. Pass "Central  High School" (two spaces) to typo.
a list of fixes for the strings in typo. Here one could pass "Central High School" (one space between "Central" and "High") to fix the issue described in typo.
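The pairing of typo and replacement can be pictured as position-wise string substitution. The base R sketch below shows that idea on made-up lines; it assumes the i-th entry of typo is swapped for the i-th entry of replacement, which is an assumption for illustration and not a statement of how tf_parse works internally.

```r
# Build the double-spaced string explicitly so the two spaces are unambiguous
typo <- paste0("Central", strrep(" ", 2), "High School")  # two spaces
replacement <- "Central High School"                      # one space

# Hypothetical raw result lines, the first containing the typo
raw_lines <- c(
  paste("1 Runner One", typo, "11.02"),
  "2 Runner Two East High School 11.15"
)

# Apply each typo/replacement pair in turn, matching literally
fixed_lines <- raw_lines
for (i in seq_along(typo)) {
  fixed_lines <- gsub(typo[i], replacement[i], fixed_lines, fixed = TRUE)
}
```

With tf_parse itself, the same two strings would be supplied through the typo and replacement arguments.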
should tf_parse try to include the names of relay athletes for relay events? Names will be listed in new columns "Relay_Athlete_1", "Relay_Athlete_2" etc. Defaults to FALSE.
should tf_parse try to include rounds for jumping/throwing events? Please note this will add a significant number of columns to the resulting data frame. Defaults to FALSE.
should tf_parse try to include round attempt results (e.g. "PASS", "X", "O") for high jump and pole vault events? Please note this will add a significant number of columns to the resulting data frame. Defaults to FALSE.
should tf_parse split attempts from each round into separate columns? For example "XXO" would result in three columns, one for the first "X", another for the second "X" and a third for "O". There will be a lot of columns. Defaults to FALSE.
either TRUE or the default, FALSE. Should tf_parse attempt to include splits?
either the distance at which splits are recorded (must be a constant distance) or the default, 1. Not all results are internally consistent on this issue. If in doubt, use the default, 1.
a data frame of track and field results
tf_parse is meant to be preceded by read_results.
# NOT RUN {
tf_parse(
  read_results("https://www.flashresults.com/2018_Meets/Outdoor/05-05_A10/015-1.pdf"),
  rounds = TRUE,
  round_attempts = TRUE,
  split_attempts = TRUE
)
# }
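To give a feel for what split_attempts = TRUE implies for the shape of the output, the base R sketch below breaks attempt strings such as "XXO" into one column per attempt. It is only an illustration of the idea: the column names (Attempt_1, ...) and the NA padding are assumptions made for this sketch, not the names or behavior of tf_parse.

```r
# Hypothetical attempt strings for one height, one per athlete
round_attempts <- c("XXO", "O", "XO")

# Split each string into its individual attempt characters
attempt_chars <- strsplit(round_attempts, "")

# Pad shorter sequences with NA so every row has the same number of columns
max_n <- max(lengths(attempt_chars))
padded <- lapply(
  attempt_chars,
  function(x) c(x, rep(NA_character_, max_n - length(x)))
)

# Bind rows into a data frame with one column per attempt
attempt_cols <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(attempt_cols) <- paste0("Attempt_", seq_len(max_n))  # hypothetical names
```

Because every round contributes up to three such columns, the column count grows quickly, which is why the argument descriptions warn about wide results.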