read_results into a data frame

Takes the output of read_results and cleans it, yielding a data frame of swimming (and diving) results.

swim_parse_omega(
  file_omega,
  avoid_omega = avoid,
  typo_omega = typo,
  replacement_omega = replacement,
  format_results = TRUE,
  splits = FALSE,
  split_length_omega = split_length,
  relay_swimmers_omega = relay_swimmers
)

Returns a data frame with columns Name, Place, Age, Team, Prelims, Finals, Points, Event & DQ. Note that all swims will have a Finals time, even if that time was actually swum in the prelims (i.e. a swimmer did not qualify for finals). This is so that final results for an event can be generated from just one column.
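
The Finals behaviour described above can be illustrated with a small base R sketch. This is not the package's implementation; the toy data frame and swimmer names are invented for illustration, and only the column names Name, Prelims and Finals come from the description.

```r
# Toy results: Swimmer B did not qualify for finals, so has no finals time
results <- data.frame(
  Name    = c("Swimmer A", "Swimmer B"),
  Prelims = c("51.20", "52.85"),
  Finals  = c("50.95", NA),
  stringsAsFactors = FALSE
)

# Fall back to the prelims time wherever no finals time exists
results$Finals <- ifelse(is.na(results$Finals), results$Prelims, results$Finals)

# Every swim now has a Finals value, so one column is enough to order the event
ranked <- results[order(as.numeric(results$Finals)), ]
```

With every row carrying a Finals value, ranking an event does not require checking Prelims separately.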

file_omega: output from read_results

avoid_omega: a list of strings. Rows in file_omega containing these strings will not be included. For example "Pool:", often used to label pool records, could be passed to avoid_omega. The default is avoid_default, which contains many strings similar to "Pool:", such as "STATE:" and "Qual:". Users can supply their own lists to avoid_omega. avoid_omega is handled before typo_omega and replacement_omega.
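
As a rough illustration of what dropping rows that contain avoid strings looks like, here is a base R sketch. It is not the package's code: the raw lines are invented, and matching the strings literally (rather than as regular expressions) is an assumption.

```r
# Invented raw lines standing in for the output of read_results
raw_lines <- c(
  "Pool: 1:45.00 Some Swimmer",
  "STATE: 1:44.10 Another Swimmer",
  "1 Swimmer A 17 Central High School 1:46.32",
  "2 Swimmer B 16 Texas 1:47.05"
)

# Strings marking rows to discard (examples taken from the description above)
avoid <- c("Pool:", "STATE:", "Qual:")

# TRUE for any line containing at least one avoid string (literal match)
has_avoid <- Reduce(`|`, lapply(avoid, function(s) grepl(s, raw_lines, fixed = TRUE)))

# Keep only the lines that look like actual results
kept <- raw_lines[!has_avoid]
```

Only the two result rows remain in kept; the record lines labelled "Pool:" and "STATE:" are gone.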

typo_omega: a list of strings that are typos in the original results. swim_parse is particularly sensitive to accidental double spaces, so "Central  High School", with two spaces between "Central" and "High", is a problem that can be fixed by passing "Central  High School" (two spaces) to typo_omega. Unexpected commas are also an issue; for example "Texas, University of" should be fixed using typo_omega and replacement_omega.

replacement_omega: a list of fixes for the strings in typo_omega. Here one could pass "Central High School" (one space between "Central" and "High") and "Texas" to replacement_omega to fix the issues described in typo_omega.
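
The pairing of typos with their fixes can be sketched in base R as follows. This is an illustration only: it assumes each typo is matched with the replacement at the same position and that matching is literal, neither of which is confirmed here. The result lines are invented.

```r
# Build the double-space typo explicitly so the two spaces cannot be lost
typo <- c(
  paste0("Central", strrep(" ", 2), "High School"),
  "Texas, University of"
)
replacement <- c("Central High School", "Texas")

# Invented result lines containing both problems
result_lines <- c(
  paste0("1 Swimmer A 17 ", typo[1], " 1:46.32"),
  "2 Swimmer B 16 Texas, University of 1:47.05"
)

# Assumption: typo and replacement are parallel vectors of equal length
stopifnot(length(typo) == length(replacement))

# Apply each fix in turn as a literal (non-regex) substitution
for (i in seq_along(typo)) {
  result_lines <- gsub(typo[i], replacement[i], result_lines, fixed = TRUE)
}
```

After the loop, the first line contains "Central High School" with single spaces and the second contains "Texas" with no comma.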

format_results: should the results be formatted for analysis (special strings like "DQ" replaced with NA, Finals as definitive column)? Default is TRUE.
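
A minimal base R sketch of replacing special strings with NA is shown below. It is not the package's implementation; the data frame is invented and the vector of special strings contains only "DQ", the one example named above.

```r
# Invented results where one swim was disqualified
results <- data.frame(
  Name   = c("Swimmer A", "Swimmer B"),
  Finals = c("50.95", "DQ"),
  stringsAsFactors = FALSE
)

# Special strings to blank out; only "DQ" is named in the description
special <- c("DQ")

# Replace special strings with NA so the column holds times only
results$Finals[results$Finals %in% special] <- NA
```

With the special strings removed, Finals can be converted to numeric without coercion warnings.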

splits: either TRUE or the default, FALSE. Should swim_parse attempt to include splits?

split_length_omega: either 25 or the default, 50, the length of pool at which splits are recorded. Not all results are internally consistent on this issue: some have races with splits by 50 and other races with splits by 25.
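
To see why the split length matters, the sketch below labels a vector of split times by cumulative distance. The helper label_splits and the "Split_<distance>" naming are hypothetical, chosen for illustration, and are not claimed to match the package's output.

```r
# Hypothetical helper: name each split by the distance at which it was recorded
label_splits <- function(times, split_length = 50) {
  distances <- seq(from = split_length, by = split_length, length.out = length(times))
  names(times) <- paste0("Split_", distances)
  times
}

# Invented split times for a 100 swum in a pool recording splits every 25
times <- c(12.1, 13.4, 13.9, 14.2)

by_25 <- label_splits(times, split_length = 25)  # distances 25, 50, 75, 100
by_50 <- label_splits(times, split_length = 50)  # distances 50, 100, 150, 200
```

Labelling the same four times with a split length of 50 would imply a 200 rather than a 100, which is why the value should match how the results were recorded.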

relay_swimmers_omega: should names of relay swimmers be captured? Default is FALSE.

swim_parse_omega must be run on the output of read_results.