# Lucy D'Agostino McGowan

#### 10 packages on CRAN

#### 1 package on GitHub

Analyze lines of R code using tidy principles. This allows you to input lines of R code and output a data frame with one row per function called. Additionally, it facilitates code classification via included lexicons.
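A minimal sketch of this workflow. The helper names (`read_rfiles()`, `unnest_calls()`, `get_classifications()`, `tidycode_example()`) are assumed from the package's documentation; verify them before use:

```r
library(tidycode)
library(dplyr)

# Read an R script into a tidy data frame, one row per expression
df <- read_rfiles(tidycode_example("example_plot.R"))

# Expand each expression into one row per function called
calls <- df %>% unnest_calls(expr)

# Classify each function using a bundled lexicon
calls %>%
  inner_join(get_classifications("crowdsource"), by = "func")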

The strength of evidence provided by epidemiological and observational studies is inherently limited by the potential for unmeasured confounding. We focus on three key quantities: the observed bound of the confidence interval closest to the null, a plausible residual effect size for an unmeasured continuous or binary confounder, and a realistic mean difference or prevalence difference for this hypothetical confounder. Building on the methods put forth by Lin, Psaty, & Kronmal (1998) <doi:10.2307/2533848>, we can use these quantities to assess how an unmeasured confounder may tip our result to insignificance, rendering the study inconclusive.
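The Lin, Psaty, & Kronmal (1998) adjustment behind this approach can be sketched directly. The helper below is a hypothetical illustration of the bias formula for a binary unmeasured confounder, not the package's API: `p1` and `p0` are the confounder prevalences in the exposed and unexposed groups, and `gamma` is the confounder-outcome relative risk:

```r
# Hypothetical illustration of the Lin, Psaty & Kronmal (1998) bias factor;
# NOT the package's API.
confounding_rr <- function(p1, p0, gamma) {
  # Relative-risk bias induced by a binary unmeasured confounder
  (p1 * (gamma - 1) + 1) / (p0 * (gamma - 1) + 1)
}

# An observed confidence bound of 1.2 is "tipped" past the null once the
# confounding RR exceeds 1.2:
observed_bound <- 1.2
observed_bound / confounding_rr(p1 = 0.5, p0 = 0.2, gamma = 2)
# 1.2 / 1.25 = 0.96, i.e. the adjusted bound crosses 1
```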

Allows printing of character strings as messages/warnings/etc. with ASCII animals, including cats, cows, frogs, chickens, ghosts, and more.
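For example (assuming a `say()` function with `by` and `type` arguments, as in the package documentation):

```r
library(cowsay)

say("hello world", by = "cat")                   # printed by a cat
say("watch out!", by = "ghost", type = "warning") # emitted as a warning
```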

The Datasaurus Dozen is a set of datasets that share the same summary statistics despite having radically different distributions. The datasets represent a larger and quirkier object lesson than is typically taught via Anscombe's Quartet (available in the 'datasets' package). Anscombe's Quartet contains four very different distributions with the same summary statistics and as such highlights the value of visualisation in understanding data, over and above summary statistics. As well as being an engaging variant on the Quartet, the data is generated in a novel way. The simulated annealing process used to derive datasets from the original Datasaurus is detailed in "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing" <doi:10.1145/3025453.3025912>.
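The matching summary statistics can be checked directly, assuming the bundled `datasaurus_dozen` data frame with `dataset`, `x`, and `y` columns:

```r
library(datasauRus)
library(dplyr)

# Near-identical means, standard deviations, and correlations
# across wildly different shapes
datasaurus_dozen %>%
  group_by(dataset) %>%
  summarise(
    mean_x = mean(x), mean_y = mean(y),
    sd_x   = sd(x),   sd_y   = sd(y),
    corr   = cor(x, y)
  )
```

Plotting `x` against `y` faceted by `dataset` (e.g. with 'ggplot2') makes the contrast vivid.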

Conveniently log everything you type into the R console. Logs are stored as tidy data frames which can then be analyzed using 'tidyverse' style tools.
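A short session sketch, assuming the package's `dance_start()`/`dance_stop()`/`dance_tbl()` interface (verify against the package documentation):

```r
library(matahari)

dance_start()   # begin logging console input
x <- rnorm(10)
mean(x)
dance_stop()    # stop logging

dance_tbl()     # retrieve the log as a tidy data frame, one row per expression
```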

Provides a client for (1) querying the DHS API for survey indicators and metadata (<https://api.dhsprogram.com/#/index.html>), (2) identifying surveys and datasets for analysis, (3) downloading survey datasets from the DHS website, (4) loading datasets and associated metadata into R, and (5) extracting variables and combining datasets for pooled analysis.
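A sketch of step (1), assuming query helpers named `dhs_surveys()` and `dhs_indicators()` with API-style filter arguments (check the package reference for exact names and parameters):

```r
library(rdhs)

# Query the DHS API for surveys in Senegal from 2010 onward
surveys <- dhs_surveys(countryIds = "SN", surveyYearStart = 2010)

# Query the full list of available indicators
indicators <- dhs_indicators()
```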

Tools for quantitative assessment of medical imaging systems, radiologists or computer aided detection ('CAD') algorithms. Implements methods described in the book: 'Chakraborty' (2017) <ISBN:978-1482214840>. Data collection paradigms include receiver operating characteristic ('ROC') and a location specific extension, namely free-response 'ROC' ('FROC'). 'ROC' data consists of a single rating per image, where the rating is the perceived confidence level that the image is of a diseased patient. 'FROC' data consists of a variable number (including zero) of mark-rating pairs per image, where a mark is the location of a clinically relevant suspicious region and the rating is the corresponding confidence level that it is a true lesion. The name 'RJafroc' is derived from it being an enhanced R version of the original Windows 'JAFROC' <http://www.devchakraborty.com>. Implemented are a number of figures of merit quantifying performance, functions for visualizing operating characteristics and three ROC ratings data curve-fitting algorithms: the 'binormal' model ('BM'), the contaminated 'binormal' model ('CBM') and the 'radiological' search model ('RSM') 'Chakraborty' (2006) <doi:10.1088/0031-9155/51/14/012>. Also implemented is maximum likelihood fitting of paired ROC data, utilizing the correlated 'CBM' ('CORCBM') model. Unlike the 'BM', which predicts 'improper' ROC curves, 'CBM', 'CORCBM' and the 'RSM' predict proper ROC curves that do not cross the chance diagonal. 'RSM' fitting yields measures of search and lesion-classification performances, in addition to the usual case-classification performance measured by the area under the 'ROC' curve. Search performance is the ability to find lesions while avoiding finding non-lesions. Lesion-classification performance is the ability to discriminate between found lesions and non-lesions. A number of significance testing algorithms are implemented.
For fully-crossed factorial study designs, termed multiple-reader multiple-case, significance testing of reader-averaged figure-of-merit differences between 'modalities' is implemented using either 'pseudovalue'-based or figure of merit-based methods. Single treatment analysis allows comparison of performance of a group of radiologists to a specified value, or comparison of 'CAD' performance to a group of radiologists interpreting the same cases. Sample size estimation tools are provided for 'ROC' and 'FROC' studies that allow estimation of relevant variances from a pilot study, in order to predict required numbers of readers and cases in a pivotal study. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files.
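A minimal multiple-reader multiple-case analysis might look like the following; the dataset and function names (`dataset02`, `UtilFigureOfMerit()`, `StSignificanceTesting()`) and their arguments are assumed from the package documentation and should be verified:

```r
library(RJafroc)

# Reader-by-modality figures of merit for a bundled MRMC ROC dataset
fom <- UtilFigureOfMerit(dataset02, FOM = "Wilcoxon")

# Significance testing of reader-averaged FOM differences between modalities
st <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "DBM")
```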

Create HTML5 slides with R Markdown and the JavaScript library 'remark.js' (<https://remarkjs.com>).
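A minimal R Markdown YAML header for this output format (the `nature` options shown are an assumption to verify against the package documentation):

```yaml
---
title: "My Slides"
output:
  xaringan::moon_reader:
    nature:
      highlightLines: true
---
```

Rendering the document (e.g. the Knit button in RStudio, or `rmarkdown::render("slides.Rmd")`) produces an HTML5 slide deck driven by 'remark.js'.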