This function uses the original data to estimate predicted values and compares them with the observations. Where the differences are large, the original values are removed.
qcPrec(prec, sts, inidate, enddate, parallel = TRUE, ncpu = 2,
printmeta = TRUE, thres = NA)
A new file called cleaned.RData will be created in the working directory. Loading this file (load('cleaned.RData')) adds a matrix with the original data filtered by the quality control. If printmeta = TRUE, a new meta directory will be created in the working path with one file per day. Each file contains a data.frame with as many rows as data flagged on that day. The columns show the identifier (ID) of each station, the date, the code of the criterion under which the datum was flagged, and the removed datum. There are five codes, one per criterion: 1 = Suspect data; 2 = Suspect zero; 3 = Suspect outlier; 4 = Suspect wet; 5 = Suspect dry.
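As an illustration of how the criterion codes can be read back, the sketch below maps codes to their labels. The labels come from the list above; the data.frame and its column names (ID, date, code, data) are hypothetical stand-ins for one day's metadata, not the exact layout written by the function.

```r
# Labels for codes 1 to 5, in the order given in the description above.
qc_code_labels <- c("Suspect data", "Suspect zero", "Suspect outlier",
                    "Suspect wet", "Suspect dry")

# Hypothetical flagged records; column names are assumptions for illustration.
flagged <- data.frame(
  ID   = c("st01", "st02", "st03"),
  date = as.Date(c("2001-01-01", "2001-01-01", "2001-01-02")),
  code = c(2L, 5L, 3L),
  data = c(0, 73, 950),
  stringsAsFactors = FALSE
)

# Index the label vector by the integer code to decode each row.
flagged$label <- qc_code_labels[flagged$code]
print(flagged)
```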
prec: Object of class matrix containing the original precipitation data. Each column represents one station. The column names must be the names of the stations.
sts: Object of class matrix containing the station information. It must have at least four fields: ID: station identifier; ALT: altitude; X: longitude in UTM projection (metres); and Y: latitude in UTM projection (metres). Tab-separated.
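The input requirements can be checked before calling the function. The helper below is a minimal sketch and is not part of the package: check_qc_inputs is a hypothetical name, and matching the column names of prec against the ID field of sts is an assumption about what "names of the stations" means. A data.frame is used for sts here only so that the numeric fields stay numeric.

```r
# Hypothetical pre-flight check; not part of the package.
check_qc_inputs <- function(prec, sts) {
  required <- c("ID", "ALT", "X", "Y")
  missing_fields <- setdiff(required, colnames(sts))
  if (length(missing_fields) > 0) {
    stop("sts is missing fields: ", paste(missing_fields, collapse = ", "))
  }
  if (is.null(colnames(prec))) {
    stop("prec has no column names")
  }
  # Assumption: every prec column must correspond to a station ID in sts.
  unknown <- setdiff(colnames(prec), as.character(sts[, "ID"]))
  if (length(unknown) > 0) {
    stop("prec columns not found in sts: ", paste(unknown, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy inputs: 2 days x 3 stations, values in tenths of a millimetre.
prec <- matrix(c(0, 12, 30, 0, 5, 8), nrow = 2,
               dimnames = list(NULL, c("st01", "st02", "st03")))
sts <- data.frame(ID  = c("st01", "st02", "st03"),
                  ALT = c(200, 450, 900),
                  X   = c(650000, 655000, 661000),
                  Y   = c(4600000, 4605000, 4598000),
                  stringsAsFactors = FALSE)

ok <- check_qc_inputs(prec, sts)
```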
inidate: Object of class Date in format 'YYYY-mm-dd' defining the first day of the quality control process.
enddate: Object of class Date in format 'YYYY-mm-dd' defining the last day of the quality control process.
parallel: Logical. When TRUE, parallel computing is activated and the processes are distributed among ncpu processor cores.
ncpu: Only used when parallel = TRUE. Sets the number of processor cores used for parallel computing.
printmeta: When TRUE, one file per day will be written in the subdirectory ./meta.
thres: Threshold applied when searching for the nearest stations. If thres = NA, the function searches for the 10 nearest observations without a distance limit. A positive number sets the threshold in kilometres.
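The effect of a distance threshold on the neighbour search can be sketched as follows. This is an illustration only, not the package's internal code: nearest_stations is a hypothetical helper, plain Euclidean distance on the UTM coordinates is assumed, and treating the threshold as inclusive is also an assumption.

```r
# Hypothetical helper: IDs of the k nearest stations to `target`,
# optionally limited to those within `thres` kilometres.
nearest_stations <- function(sts, target, k = 10, thres = NA) {
  i <- match(target, sts$ID)
  if (is.na(i)) stop("unknown station: ", target)
  # UTM coordinates are in metres, so divide by 1000 to get kilometres.
  d_km <- sqrt((sts$X - sts$X[i])^2 + (sts$Y - sts$Y[i])^2) / 1000
  d_km[i] <- NA                       # exclude the target itself
  candidates <- which(!is.na(d_km))
  if (!is.na(thres)) {
    candidates <- candidates[d_km[candidates] <= thres]
  }
  candidates <- candidates[order(d_km[candidates])]
  head(sts$ID[candidates], k)
}

# Toy network: 12 stations on a line, 1 km apart.
sts_line <- data.frame(ID = sprintf("st%02d", 1:12),
                       X  = 650000 + (0:11) * 1000,
                       Y  = rep(4600000, 12),
                       stringsAsFactors = FALSE)

no_limit <- nearest_stations(sts_line, "st01")            # thres = NA
within_3 <- nearest_stations(sts_line, "st01", thres = 3) # 3 km limit
```

With a threshold, fewer than 10 neighbours may be returned; how the real function handles that case is not stated in this page.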
Roberto Serrano-Notivoli
The quality control process uses five criteria to flag suspect data. All of them are based on reference values (RV) calculated from the 10 nearest observations (NNS) on that day. For this reason, at least 11 available observations per day are required. The five criteria are: 1) Suspect data: observed > 0 and all 10 NNS == 0; 2) Suspect zero: observed == 0 and all 10 NNS > 0; 3) Suspect outlier: observed is 10 times higher or lower than the RV; 4) Suspect wet: observed == 0, wet probability is over 99%, and predicted magnitude is over 5 litres; and 5) Suspect dry: observed > 5 litres, dry probability is over 99%, and predicted magnitude is under 0.1 litres.
All of these criteria are designed to work with precipitation in tenths of a millimetre (millimetres*10).
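The five criteria can be written as simple logical tests. The sketch below is a reading of the description above, not the package's implementation. The function name and arguments are hypothetical; litres are taken to mean litres per square metre, so 5 litres = 50 and 0.1 litres = 1 in tenths of a millimetre; dry probability is assumed to be 1 minus wet probability; and the outlier test is assumed to apply only to non-zero observations with a non-zero RV.

```r
# Hypothetical sketch: returns the codes (1 to 5) of every criterion met.
# obs   observed value (tenths of mm)
# nns   values of the 10 nearest observations (tenths of mm)
# rv    reference value (tenths of mm)
# p_wet wet probability, between 0 and 1
# pred  predicted magnitude (tenths of mm)
flag_observation <- function(obs, nns, rv, p_wet, pred) {
  hits <- c(
    obs > 0 && all(nns == 0),                               # 1 Suspect data
    obs == 0 && all(nns > 0),                               # 2 Suspect zero
    obs > 0 && rv > 0 && (obs > 10 * rv || obs < rv / 10),  # 3 Suspect outlier
    obs == 0 && p_wet > 0.99 && pred > 50,                  # 4 Suspect wet
    obs > 50 && (1 - p_wet) > 0.99 && pred < 1              # 5 Suspect dry
  )
  which(hits)
}

# A wet observation surrounded by dry neighbours.
flag_observation(obs = 30, nns = rep(0, 10), rv = 0, p_wet = 0, pred = 0)
# A value far above the reference value.
flag_observation(obs = 500, nns = c(rep(20, 9), 0), rv = 20, p_wet = 0.5, pred = 20)
```

Returning every matching code, rather than only the first, avoids assuming an order of precedence between criteria, which this page does not specify.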
# Load the example data
data(precipDataset)
# Run the quality control for two days
qcPrec(prec = preci, sts = sts, inidate = as.Date('2001-01-01'),
       enddate = as.Date('2001-01-02'), parallel = TRUE, ncpu = 2,
       printmeta = TRUE, thres = NA)