The model structure is as follows:
- model S:
chl ~ s(cont_year, k = large)
The cont_year vector is a continuous numeric variable representing the annual effect (e.g., January 1st, 2000 is 2000.0, July 1st, 2000 is 2000.5, etc.), and doy is the day of year as a numeric value from 1 to 366. The function s models cont_year as a smoothed, non-linear term. The optimal amount of smoothing on cont_year is determined by cross-validation as implemented in the mgcv package, and the theoretical upper limit on the number of knots, k, should be large enough to allow sufficient flexibility in the smoothing term. The upper limit of k was chosen as 12 times the number of years in the input data. If insufficient data are available to fit a model with the specified k, the number of knots is decreased until the data can be modelled, e.g., 11 times the number of years, 10 times the number of years, etc.
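The fallback on k described above can be sketched in R as follows. This is a minimal illustration under stated assumptions, not the implementation used for the analysis: the helper names to_cont_year and fit_model_s are hypothetical, the number of years is assumed to be the count of distinct calendar years in cont_year, and the search is assumed to stop at a multiplier of 1. Only gam and s come from mgcv.

```r
library(mgcv)

# Hypothetical helper: convert Date values to decimal years
# (e.g., 2000-01-01 -> 2000.0), accounting for leap years.
to_cont_year <- function(dates) {
  yr <- as.integer(format(dates, "%Y"))
  doy <- as.integer(format(dates, "%j"))
  leap <- (yr %% 4 == 0 & yr %% 100 != 0) | yr %% 400 == 0
  yr + (doy - 1) / ifelse(leap, 366, 365)
}

# Hypothetical helper: fit model S starting at k = 12 * number of years,
# lowering the multiplier by one each time the fit fails.
fit_model_s <- function(dat, max_mult = 12) {
  # Assumption: "number of years" = distinct calendar years in the data.
  n_years <- length(unique(floor(dat$cont_year)))
  for (mult in seq(max_mult, 1)) {
    k <- mult * n_years
    fit <- tryCatch(
      gam(chl ~ s(cont_year, k = k), data = dat),
      error = function(e) NULL  # e.g., too few unique values for this k
    )
    if (!is.null(fit)) {
      return(list(fit = fit, k = k))
    }
  }
  stop("model S could not be fitted for any candidate k")
}
```

Here dat is assumed to be a data frame with columns chl and cont_year. Calling fit_model_s(dat) would return the fitted model together with the value of k that was actually used, which makes it easy to report when the upper limit had to be reduced.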