SwimmeR

SwimmeR is intended to assist those working with times from competitive pool swimming races, such as those conducted under NFHS, NCAA, ISL, or FINA rules. For more information, please see vignette("SwimmeR").

Latest Released Version from CRAN

install.packages("SwimmeR")

library(SwimmeR)

Latest Development Version from GitHub

Version 0.14.2

  • new function make_lineup takes two data frames containing athlete/event/time combinations (one for each team) and creates a lineup that maximizes returns for one team
  • swim_parse handles some Hytek psych sheets (single column only)
  • read_results now handles both PDF and HTML results at .aspx addresses
  • swim_parse handles Hytek Top Times reports via toptimes_parse_hytek (still under development)
  • new function place supersedes swim_place and dive_place and handles both swimming and diving
  • major change: the swim_parse output columns Finals_Time and Prelims_Time have been renamed Finals and Prelims

devtools::install_github("gpilgrim2670/SwimmeR", build_vignettes = TRUE)

Usage

SwimmeR has two major uses - importing results and formatting times. It also has functions for course conversions and drawing brackets.

Importing Results

SwimmeR reads swimming results into R and outputs them as tidy data frames. read_results reads in either a PDF or an HTML file (for example, one at a URL), and swim_parse or swim_parse_ISL converts the file that was read into a tidy data frame. swim_parse can now also handle .hy3 files, although .hy3 functionality is still under development and quite buggy. As of version 0.7.0 SwimmeR can also read S.A.M.M.S. style results.

read_results has two arguments: file, the path of the file to read in, and node, the CSS node where the results reside. node is required only for HTML files and defaults to "pre", which has been correct in every instance tested thus far.

swim_parse has seven arguments as of version 0.7.0.

file is the output of read_results and is required.

avoid is a list of strings. Rows in file containing any of those strings will not be included. avoid is optional. Incorrectly specifying it may lead to nonsense rows in the final data frame, but will not cause an error. Nonsense rows can be removed after import.
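The effect of avoid can be pictured as a row filter applied to the raw text. The following is a minimal base R sketch of that idea, not SwimmeR's implementation: drop_avoided is a hypothetical helper, the sample lines are made up for illustration, and literal (non-regex) matching is an assumption.

```r
# Hypothetical helper (not part of SwimmeR): drop every line that
# contains any of the strings in `avoid`, using literal matching.
drop_avoided <- function(lines, avoid) {
  keep <- rep(TRUE, length(lines))
  for (a in avoid) {
    keep <- keep & !grepl(a, lines, fixed = TRUE)
  }
  lines[keep]
}

# Made-up raw lines standing in for the output of read_results
raw_lines <- c(
  "1 Smith, Jane SR Central 57.34",
  "State Record: 55.10",
  "2 Jones, Ann JR North 58.02"
)

kept <- drop_avoided(raw_lines, avoid = c("Record:"))
print(kept)
```

A string in avoid that is too broad would also drop legitimate result rows, which is one reason to check the imported data frame afterwards.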

typo and replacement work together to fix typos. Strings in typo are replaced by the strings in replacement in element index order; that is, the first element of typo is replaced everywhere it appears by the first element of replacement. Uncorrected typos can cause lost data and nonsense rows.
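The index-order pairing described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the code swim_parse runs: replace_typos is a hypothetical helper, the second sample line is invented, and the sketch matches strings literally.

```r
# Hypothetical helper (not part of SwimmeR): replace typo[i] with
# replacement[i] everywhere it appears, for each index i in turn.
replace_typos <- function(lines, typo, replacement) {
  stopifnot(length(typo) == length(replacement))
  for (i in seq_along(typo)) {
    lines <- gsub(typo[i], replacement[i], lines, fixed = TRUE)
  }
  lines
}

# Made-up raw lines; the first contains the typo used in the example call
raw_lines <- c("-1NORTH ROCKL 1:35.97", "2-ALBANY 1:36.10")

fixed_lines <- replace_typos(
  raw_lines,
  typo = c("-1NORTH ROCKL"),
  replacement = c("1-NORTH ROCKL")
)
print(fixed_lines)
```

Because the pairing is positional, typo and replacement need to be the same length and in the same order.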

See ?swim_parse or the package vignette for more information.

The following three arguments are only available in SwimmeR v0.6.0 and higher.

splits and split_length tell swim_parse if and how to import split times. Setting splits = TRUE will import splits as columns. split_length refers to the pool course (length) and defaults to 50. It may also be set to 25 if splits are recorded every 25 rather than every 50. Split reporting within source files is very inconsistent, so while swim_parse will import whatever splits are present, they may require some inspection after import. swim_parse_ISL also has a splits argument that works the same way. See the Splits sections of vignette("SwimmeR") for more information and examples.
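One possible form of that post-import inspection is to check whether the splits for a swim add up to its final time. The sketch below is base R only and makes several assumptions: splits are per-lap (not cumulative), all times are already in seconds, splits_match is a hypothetical helper, and the numbers are invented for illustration.

```r
# Hypothetical check (not part of SwimmeR): do lap splits sum to the
# final time, within a small tolerance for rounding to hundredths?
splits_match <- function(lap_splits, finals, tolerance = 0.005) {
  if (anyNA(lap_splits) || is.na(finals)) {
    return(NA) # cannot decide when a split or the final time is missing
  }
  abs(sum(lap_splits) - finals) < tolerance
}

# Invented 100 yard swim with two 50 yard splits
ok <- splits_match(c(27.50, 29.84), finals = 57.34)
bad <- splits_match(c(27.50, 29.84), finals = 58.34)
missing_split <- splits_match(c(27.50, NA), finals = 57.34)
```

Rows where the check fails or returns NA are the ones worth looking at by hand.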

relay_swimmers tells swim_parse or swim_parse_ISL whether or not to include the names of relay swimmers as additional columns. Set relay_swimmers = TRUE to include them. There is more information available in vignette("SwimmeR").

swim_parse(
    read_results(
      "http://www.nyhsswim.com/Results/Boys/2008/NYS/Single.htm"
    ),
    typo = c("-1NORTH ROCKL"),
    replacement = c("1-NORTH ROCKL"),
    splits = TRUE, # requires version 0.6.0 or greater
    relay_swimmers = TRUE # requires version 0.6.0 or greater
  )

swim_parse_ISL only requires one argument, file, the output of read_results.

swim_parse_ISL(
  file = read_results(
    "https://isl.global/wp-content/uploads/2019/10/isl-indianapols-results-day-2-2.pdf"
  ),
  splits = TRUE, # requires version 0.6.0 or greater
  relay_swimmers = TRUE # requires version 0.6.0 or greater
)

Imported Information

swim_parse will attempt to capture the following information, assuming it is present in the raw results.

Place: Order of finish

Name: An athlete's name. Relays do not have names.

Age: Could be a number of years (25) or a year in school (SR)

Para: An athlete's para-swimming classification (e.g. S10)

Team: The name of a team, for athletes or relays

Prelims_Time: If two times/scores are listed, this is the first one. swim_parse currently can't differentiate between a seed time and a prelims time, so both are reported in this column. Prelim/seed diving scores are also included here even though they're not technically times. Per the version notes above, this column has been renamed Prelims.

Finals_Time: If two times/scores are listed, this is the second one. If only one time/score is listed, this is it. Per the version notes above, this column has been renamed Finals.

DQ: Was an athlete/relay team disqualified (1) or not (0)

Exhibition: Was an athlete/relay team competing as a non-scoring (exhibition) entry (1) or not (0)

Points: Points awarded based on place (not diving score)

Relay_Swimmer_X: Names of athletes in a relay

Split_X: Split corresponding to a given distance X

Usable Formats

SwimmeR can only read files in single column format, not double.

Will work - results in single column

Will also work - results in single column

Will not work - results in multiple columns

Formatting Times

SwimmeR also converts times between the conventional swimming format of minutes:seconds.hundredths (1:35.37) and the computationally useful format of seconds, reported to the 100ths place (e.g. 95.37). This is accomplished with sec_format and mmss_format, which are inverses of one another. Both sec_format and mmss_format work well with tidyverse functions.

times <- c("1:35.97", "57.34", "16:53.19", NA)
times_sec <- sec_format(times)
times_sec
times_mmss <- mmss_format(times_sec)
times_mmss
all.equal(times, times_mmss)
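To make the two formats concrete, here is a base R sketch of the arithmetic behind the conversion. It is not the source of sec_format or mmss_format; to_seconds and to_mmss are hypothetical stand-ins, and they only cover times written as seconds or minutes:seconds.

```r
# Hypothetical stand-in for the idea behind sec_format:
# "1:35.97" -> 1 * 60 + 35.97 = 95.97
to_seconds <- function(x) {
  vapply(x, function(t) {
    if (is.na(t)) return(NA_real_)
    parts <- as.numeric(strsplit(t, ":", fixed = TRUE)[[1]])
    # the rightmost part is seconds, the one before it is minutes
    sum(parts * 60^(rev(seq_along(parts)) - 1))
  }, numeric(1), USE.NAMES = FALSE)
}

# Hypothetical stand-in for the idea behind mmss_format:
# 95.97 -> "1:35.97"; times under a minute keep the plain seconds form
to_mmss <- function(s) {
  s <- round(s, 2) # avoids a seconds field that would round up to 60.00
  ifelse(
    is.na(s),
    NA_character_,
    ifelse(
      s >= 60,
      sprintf("%d:%05.2f", as.integer(s %/% 60), s %% 60),
      sprintf("%.2f", s)
    )
  )
}

times <- c("1:35.97", "57.34", "16:53.19", NA)
secs <- to_seconds(times)
back <- to_mmss(secs)
```

The round trip returns the original strings, which is the inverse relationship described above.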

Regularizing Team Names

Team names are often abbreviated. Rather than requiring every abbreviation to be specified, SwimmeR provides get_mode, which finds the most commonly occurring element, to make the task simpler.

library(dplyr) # provides group_by() and mutate()

name <- c(rep("Lilly King", 5), rep("James Sullivan", 3))
team <- c(rep("IU", 2), "Indiana", "IUWSD", "Indiana University", rep("Monsters University", 2), "MU")
df <- data.frame(name, team, stringsAsFactors = FALSE)
df %>% 
  group_by(name) %>% 
  mutate(Team = get_mode(team))
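For readers who want to see what "taking the mode per athlete" amounts to, the following base R sketch reproduces the idea without dplyr. mode_of is a hypothetical helper rather than get_mode itself; in particular, breaking ties in favor of the value that appears first is an assumption of this sketch.

```r
# Hypothetical helper (not SwimmeR's get_mode): most frequent non-NA value,
# with ties going to the value that appears first.
mode_of <- function(x) {
  x <- x[!is.na(x)]
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

name <- c(rep("Lilly King", 5), rep("James Sullivan", 3))
team <- c(rep("IU", 2), "Indiana", "IUWSD", "Indiana University",
          rep("Monsters University", 2), "MU")

# ave() applies the function within each athlete's rows
team_mode <- ave(team, name, FUN = function(v) rep(mode_of(v), length(v)))

result <- data.frame(name, team, Team = team_mode, stringsAsFactors = FALSE)
print(result)
```

Each athlete ends up with a single team label, here "IU" and "Monsters University".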

Reordering Athlete Names

Athlete names are sometimes formatted as "Firstname Lastname" and sometimes as "Lastname, Firstname". For purposes of plotting and presentation it's often desirable to format all names the same way. The name_reorder function, available in versions >= 0.8.0, will reorder all "Lastname, Firstname" names as "Firstname Lastname".

df <- data.frame(Name = c("King, Lilly", "Lilly King", NA, "Richards Ross, Sanya", "Phelps, Michael F"))
name_reorder(df)

While "Lastname, Firstname" is actually more informative in that it differentiates between last names and first names it's not always possible to convert "Firstname Lastname" to "Lastname, Firstname". Consider an athlete named "Michael Fred Phelps II" - it's not possible to determine programmatically where a comma should go. Is it "II, Michael Fred Phelps"? Or maybe "Fred Phelps II, Michael"? There's no way to tell. On the other hand converting "Phelps II, Michael Fred" to "Michael Fred Phelps II" is straightforward.
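The straightforward direction can be sketched with a single regular expression in base R. This illustrates the reasoning above and is not the implementation of name_reorder; reorder_name is a hypothetical helper that assumes at most one comma per name.

```r
# Hypothetical helper (not SwimmeR's name_reorder): move everything after
# the comma to the front; names without a comma and NA values are untouched.
reorder_name <- function(x) {
  has_comma <- !is.na(x) & grepl(",", x, fixed = TRUE)
  x[has_comma] <- sub("^([^,]+),\\s*(.+)$", "\\2 \\1", trimws(x[has_comma]))
  x
}

athletes <- c("King, Lilly", "Lilly King", NA,
              "Richards Ross, Sanya", "Phelps II, Michael Fred")
reordered <- reorder_name(athletes)
print(reordered)
```

The comma marks where the last name ends, so no guessing is needed. Going the other way would require guessing that boundary.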

Drawing Brackets

Brackets for single elimination tournaments can be produced for any number of teams between 5 and 64. Byes will automatically be included for higher seeds as required.

teams <- c("red", "orange", "yellow", "green", "blue", "indigo", "violet")
round_two <- c("red", "yellow", "blue", "indigo")
round_three <- c("red", "blue")
champion <- "red"
draw_bracket(teams = teams,
            round_two = round_two,
            round_three = round_three,
            champion = champion)
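The number of byes in the seven team example can be worked out by hand. The sketch below assumes the usual single elimination layout, in which the field is padded up to the next power of two; how draw_bracket computes this internally is not stated here, and bracket_size is a hypothetical helper.

```r
# Hypothetical helper: smallest power of two that can hold n_teams
bracket_size <- function(n_teams) {
  size <- 1
  while (size < n_teams) {
    size <- size * 2
  }
  size
}

n_teams <- 7
size <- bracket_size(n_teams) # 8 slots in the first round
byes <- size - n_teams        # 1 bye, which goes to the top seed
```

With one bye and three first round games, four teams reach the second round, which matches the length of round_two in the example.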

Course Conversions

SwimmeR also converts between the various pool sizes used in competitive swimming, namely 50m length (LCM), 25m length (SCM) and 25y length (SCY). This is accomplished with course_convert. The verbose parameter determines what course_convert outputs. Setting verbose = FALSE (the default) returns only the converted time(s), whereas verbose = TRUE returns a data frame that includes the input variables. course_convert will take inputs in either seconds or swimming format.

library(tibble) # provides tibble()

swim <- tibble(time = c("6:17.53", "59.14", "4:14.32", "16:43.19"), course = c("LCM", "LCM", "SCY", "SCM"), course_to = c("SCY", "SCY", "SCM", "LCM"), event = c("400 Free", "100 Fly", "400 IM", "1650 Free"))

course_convert(time = swim$time, course = swim$course, course_to = swim$course_to, event = swim$event)

course_convert(time = swim$time, course = swim$course, course_to = swim$course_to, event = swim$event, verbose = TRUE)

Getting Help

I do a lot of demos on how to use SwimmeR at my blog Swimming + Data Science.

SwimmeR also has a vignette. Call vignette("SwimmeR"). If you install from GitHub, don't forget to set build_vignettes = TRUE.

If you find a bug, please provide a minimal reproducible example on GitHub.

Version: 0.14.2

License: MIT + file LICENSE

Maintainer: Greg Pilgrim

Last Published: March 24th, 2023

Functions in SwimmeR (0.14.2)

  • age_format: Formatting yyy-mm ages as years
  • SwimmeR-defunct: Defunct functions in SwimmeR
  • course_convert_DF: Course converter, returns data frame (defunct)
  • fill_down: Fills NA values with previous non-NA value
  • collect_relay_swimmers_splash: Collects relay swimmers as a data frame within swim_parse_splash
  • collect_relay_swimmers_omega: Collects relay swimmers as a data frame within swim_parse_omega
  • coalesce_many: Combines paired sets of columns following a join operation
  • coalesce_many_helper: Combines paired sets of columns following a join operation
  • course_convert: Swimming Course Converter
  • discard_errors: Discards elements of list that have an error value from purrr::safely
  • course_convert_helper: Swimming Course Converter Helper
  • collect_relay_swimmers_old: Collects relay swimmers as a data frame within swim_parse_old
  • collect_relay_swimmers: Collects relay swimmers as a data frame within swim_parse
  • dive_place: Adds places to diving results
  • draw_bracket: Creates a bracket for tournaments involving 5 to 64 teams, single elimination
  • get_mode: Find the mode (most commonly occurring) element of a list
  • hytek_clean_strings: Cleans input strings
  • generate_row_to_add: Create a one-line data frame containing an entry to be appended to an in-progress data frame of all entries
  • hy3_times: Helper for reading prelims and finals times from Hy-Tek .hy3 files
  • %notin%: "Not in" function
  • fill_left: Shifts non-NA values to left in data frame
  • interleave_results: Helper for reading interleaving prelims and finals results
  • hytek_length_9_sort: Sort data in lists of length 9 within hytek_swim_parse
  • is_link_broken: Determines if a link is valid
  • hytek_length_3_sort: Sort data in lists of length 3 within hytek_swim_parse
  • list_breaker: Breaks out lists of lists by sub-list length
  • hytek_length_3_DQ_sort: Sort data in DQ lists of length 3 within hytek_swim_parse
  • correct_split_distance: Changes lengths associated with splits to new values
  • correct_split_distance_helper: Changes lengths associated with splits to new values
  • lines_sort: Sorts and collects lines by performance and row number
  • reaction_times_parse: Pulls out reaction times from text
  • place: Add places to results
  • hytek_length_4_DQ_sort: Sort data in DQ lists of length 4 within hytek_swim_parse
  • make_lineup: Determine optimal entries against a given opponent lineup
  • list_transform: Transform list of lists into data frame
  • heat_parse_omega: Pulls out heat labels from text
  • hytek_length_4_sort: Sort data in lists of length 4 within hytek_swim_parse
  • mmss_format: Formatting seconds as mm:ss.hh
  • na_pad: Pads shorter lists in a list-of-lists with NAs such that all lists are the same length
  • splash_length_6_sort: Sort data in lists of length 6 within splash_swim_parse
  • read_htm: Read in html files of swimming results
  • splash_length_9_sort: Sort data in lists of length 9 within splash_swim_parse
  • splash_length_8_sort: Sort data in lists of length 8 within splash_swim_parse
  • splits_parse_omega_relays: Collects splits for relays within swim_parse_omega
  • splash_length_7_sort: Sort data in lists of length 7 within splash_swim_parse
  • read_hy3: Read in hy3 files of swimming results
  • splits_parse_splash: Collects splits within swim_parse_splash for Splash results
  • event_parse: Pulls out event labels from text
  • list_to_list_names: Initialize a named list of lists
  • event_parse_ISL: Pulls out event labels from text
  • splash_clean_strings: Cleans input strings
  • splash_collect_splits: Collects Splash format splits
  • fold: Fold a vector onto itself
  • splits_parse_splash_relays: Collects splits for relays within swim_parse_splash
  • splits_reform: Adds together splits and compares to listed finals time to see if they match
  • swim_parse_hytek: Formats Hytek style swimming and diving data read with read_results into a data frame
  • swim_parse_omega: Formats Omega style swimming and diving data read with read_results into a data frame
  • swim_parse_samms: Formats swimming and diving data read with read_results into a data frame
  • splits_rename_omega: Advances split names by one split_length
  • splits_to_cumulative: Converts splits from lap to cumulative format
  • results_score: Scores a swim meet
  • splash_length_11_sort: Sort data in lists of length 11 within splash_swim_parse
  • splash_length_12_sort: Sort data in lists of length 12 within splash_swim_parse
  • replacement_entries: Replaces superseded rows
  • splits_to_cumulative_helper_recalc: Helper function for converting lap splits to cumulative splits
  • swim_parse_splash: Formats Splash style swimming and diving data read with read_results into a data frame
  • swim_place: Add places to swimming results
  • hytek_length_6_sort: Sort data in lists of length 6 within hytek_swim_parse
  • hy3_places: Helper for reading prelims and finals places from Hy-Tek .hy3 files
  • hytek_length_5_sort: Sort data in lists of length 5 within hytek_swim_parse
  • name_reorder: Orders all names as "Firstname Lastname"
  • hy3_parse: Parses Hy-Tek .hy3 files
  • splits_to_lap: Converts splits from cumulative to lap format
  • format_results: Formats data for analysis within swim_parse
  • make_lineup_helper: Determine optimal entries against a given opponent lineup
  • hytek_length_8_sort: Sort data in lists of length 8 within hytek_swim_parse
  • swim_parse_old: Formats swimming and diving data read with read_results into a data frame
  • hytek_length_7_sort: Sort data in lists of length 7 within hytek_swim_parse
  • sec_format: Formatting mm:ss.hh times as seconds
  • tie_rescore: Rescore to account for ties
  • splash_determine_indent_length: Determines indent length for data within swim_parse_splash
  • splits_parse: Collects splits within swim_parse
  • sec_format_helper: Helper function for formatting mm:ss.hh times as seconds, used to enable vectorized operation of sec_format
  • splits_parse_ISL: Collects splits within swim_parse_ISL
  • make_lineup_helper_2: Assign overpowered entries
  • splash_length_10_sort: Sort data in lists of length 10 within splash_swim_parse
  • toptimes_parse_hytek: Formats Hytek style swimming and diving Top Times reports read with read_results into a data frame
  • %>%: Pipe operator
  • read_results_flag: Used to indicate that results have been read in with read_results prior to being parsed by swim_parse
  • read_pdf: Read in pdf files of swimming results
  • splash_length_4_sort: Sort data in lists of length 4 within splash_swim_parse
  • splash_length_5_sort: Sort data in lists of length 5 within splash_swim_parse
  • splits_parse_splash_helper_1: Produces data frames of splits within swim_parse_splash for Splash results
  • splits_to_lap_helper_recalc: Helper function for converting cumulative splits to lap splits
  • splits_parse_splash_helper_2: Produces data frames of splits within swim_parse_splash for Splash results
  • undo_interleave: Undoes interleaving of lists
  • swim_parse_ISL: Formats swimming results from the International Swim League ('ISL') read with read_results into a data frame
  • update_rank_helper: Create a one-line data frame containing an entry to be appended to an in-progress data frame of all entries
  • age_format_helper: Helper function for formatting yyy-mm ages as years, enables vectorization of age_format
  • King200Breast: Results for Lilly King's 200 Breaststrokes
  • Read_Results: Reads swimming and diving results into a list of strings in preparation for parsing with swim_parse
  • add_event_dummy_row: Add dummy entry rows
  • clean_events: Regularizes event names
  • SwimmeR-deprecated: Deprecated functions in SwimmeR
  • add_row_numbers: Add row numbers to raw results
  • Swim_Parse: Formats swimming and diving data read with read_results into a data frame