This function visualizes the relationship between read length and read
quality. The user can choose to plot either the
mean quality score per read or the expected error (EE) rate.
fastq_input can be either a file path to a FASTQ file or a FASTQ
object. FASTQ objects are tibbles that contain the columns Header,
Sequence, and Quality; see readFastq.
The EE rate is calculated as the mean of the per-base error probabilities in
a read, where the error probability of a base with Phred score Q is
\(10^{(-Q/10)}\). A lower EE rate indicates higher sequence quality, while a
higher EE rate suggests lower confidence in the read.
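The EE rate calculation can be sketched in base R as follows. This is a minimal illustration, not the package's own code: the helper names phred_scores and ee_rate are hypothetical, and the quality strings are assumed to use the common Phred+33 ASCII encoding, which the text above does not specify.

```r
# Hypothetical helper: decode a quality string into integer Phred scores.
# Assumes Phred+33 encoding (ASCII code minus 33).
phred_scores <- function(quality) {
  utf8ToInt(quality) - 33L
}

# Hypothetical helper: EE rate of one read, i.e. the mean of the
# per-base error probabilities 10^(-Q/10).
ee_rate <- function(quality) {
  q <- phred_scores(quality)
  mean(10^(-q / 10))
}

ee_rate("IIII")  # every base has Q = 40, so the EE rate is about 1e-04
ee_rate("5555")  # every base has Q = 20, so the EE rate is about 0.01
```

Because the error probability falls tenfold for every 10 Phred units, a few low-quality bases can dominate the EE rate of an otherwise good read.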
Marginal histograms are added to display the distribution of read lengths
(top) and quality scores or EE rates (right).
If fastq_input contains more than 10 000 reads, the function randomly
selects 10 000 rows for downstream calculations. This subsampling reduces
computation time on large datasets.
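The subsampling step might look like the sketch below. The helper subsample_reads and its max_reads argument are hypothetical, and a plain data.frame stands in for the FASTQ tibble so that the example needs no extra packages. Whether the actual function sets a random seed is not stated here, so the seed in the example is only there to make the illustration reproducible.

```r
# Hypothetical helper: return the input unchanged when it has at most
# max_reads rows, otherwise a random sample of max_reads rows.
subsample_reads <- function(fastq, max_reads = 10000L) {
  if (nrow(fastq) <= max_reads) {
    return(fastq)
  }
  keep <- sample.int(nrow(fastq), size = max_reads)  # sampled without replacement
  fastq[keep, , drop = FALSE]
}

# Toy input with the three FASTQ columns (a data.frame used as a stand-in).
set.seed(1)
reads <- data.frame(
  Header   = paste0("read", 1:25000),
  Sequence = "ACGT",
  Quality  = "IIII"
)

nrow(subsample_reads(reads))          # 10000
nrow(subsample_reads(reads[1:50, ]))  # 50, small inputs pass through unchanged
```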