This function prepares case count data for use with EpiEstim by performing a series of validation and cleaning steps, described below.
clean_sample_data(data, start_date)

Returns a cleaned data frame filtered from start_date, starting at the
first date with non-zero confirmed cases, and containing at least 14 days
of data.
data: A data frame containing at least the columns "date" and
"confirm". The "date" column should be of class Date, and
"confirm" should be numeric.
start_date: A Date (or date-convertible string) indicating the
starting date for analysis. Must exist within the "date" column.
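
As an illustration of the expected inputs, the following sketch builds a small data frame that satisfies the requirements above and checks each one by hand. The counts are made up for the example, and the object names `cases` and `checks` are hypothetical; only the column names "date" and "confirm" come from this documentation.

```r
# Made-up daily counts, for illustration only.
cases <- data.frame(
  date = seq(as.Date("2020-03-01"), by = "day", length.out = 21),
  confirm = c(0, 0, 1, 3, 2, 5, 8, 13, 11, 17, 20,
              24, 22, 30, 35, 33, 41, 48, 52, 60, 58)
)

# A date-convertible string is turned into a Date before comparison.
start_date <- as.Date("2020-03-01")

# Manual versions of the argument requirements.
checks <- c(
  has_columns   = all(c("date", "confirm") %in% names(cases)),
  date_is_Date  = inherits(cases$date, "Date"),
  confirm_is_numeric = is.numeric(cases$confirm),
  start_in_data = start_date %in% cases$date
)
print(checks)
```

If any element of `checks` is FALSE, the input would presumably be rejected by the validation steps.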
Ensures that the input data frame has the required columns:
"date" and "confirm".
Confirms that the specified start_date exists in the data and filters
the data to include only records on or after that date.
Removes leading days before the first non-zero confirmed case.
Verifies that the resulting dataset contains at least 14 valid days (as required for estimation).
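
The four steps above can be sketched in base R as follows. This is a minimal illustration written from the description, not the package's actual source: the name `clean_sample_data_sketch`, the `min_days` parameter, the error messages, the sort by date, and the choice to count rows as "valid days" are all assumptions.

```r
# Hypothetical re-implementation for illustration; not the package's source.
clean_sample_data_sketch <- function(data, start_date, min_days = 14) {
  # Step 1: the required columns must be present.
  missing_cols <- setdiff(c("date", "confirm"), names(data))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }

  # Step 2: start_date must occur in the data; keep rows on or after it.
  start_date <- as.Date(start_date)
  if (!start_date %in% data$date) {
    stop("start_date not found in the 'date' column: ", format(start_date))
  }
  data <- data[data$date >= start_date, , drop = FALSE]
  # Assumption: rows are put in chronological order before trimming.
  data <- data[order(data$date), , drop = FALSE]

  # Step 3: drop leading rows before the first non-zero confirmed count.
  nonzero <- which(data$confirm > 0)
  if (length(nonzero) == 0) {
    stop("No non-zero confirmed cases on or after start_date.")
  }
  data <- data[seq(nonzero[1], nrow(data)), , drop = FALSE]

  # Step 4: require a minimum number of remaining days (assumed: one row per day).
  if (nrow(data) < min_days) {
    stop("Only ", nrow(data), " day(s) of data remain; at least ",
         min_days, " are required.")
  }
  data
}

# Usage with made-up counts: three leading zero days are trimmed.
dates <- seq(as.Date("2020-03-01"), by = "day", length.out = 20)
cases <- data.frame(date = dates, confirm = c(0, 0, 0, 1:17))
cleaned <- clean_sample_data_sketch(cases, "2020-03-01")
```

Failing early with `stop()` at each step keeps the error close to its cause, so a caller sees which requirement was not met rather than a later failure inside the estimation code.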
This function is primarily intended as a preprocessing step for EpiEstim modeling. It combines validation checks for input structure and time coverage with minimal data cleaning logic to ensure robust downstream estimation.