This function performs Z-score normalization on high-throughput screening data using negative control samples as reference. The Z-score transformation standardizes the data by centering and scaling each column (readout) based on the mean and standard deviation of negative control samples.
Zscore(countMat, negGene)
A Z-score-normalized matrix with the same dimensions as the input countMat (excluding the Type column added during processing). Each value is the number of standard deviations by which that gene/readout combination lies from the negative-control mean of its readout.
countMat: A matrix of raw count data where rows represent genes/siRNAs and columns represent readouts/conditions. The matrix should have row names corresponding to gene/siRNA identifiers.
negGene: A data frame or matrix containing negative control gene/siRNA identifiers. The first column should contain the gene/siRNA names that match the row names in countMat.
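As a rough illustration of the input shapes described above, the following sketch builds a toy count matrix and a matching negative control table. All identifiers and values (`countMat_toy`, `negGene_toy`, `NC_1`, `siGENE_A`, and so on) are made up for this example and are not part of the package.

```r
# Toy inputs; identifiers and values are invented for illustration only.
countMat_toy <- matrix(
  c(100, 110,  90, 250,   # readout_1 (matrix() fills column by column)
     40,  38,  42,  12),  # readout_2
  nrow = 4,
  dimnames = list(
    c("NC_1", "NC_2", "NC_3", "siGENE_A"),  # row names = gene/siRNA identifiers
    c("readout_1", "readout_2")             # columns = readouts/conditions
  )
)

# Negative control identifiers go in the first column
negGene_toy <- data.frame(Gene = c("NC_1", "NC_2", "NC_3"),
                          stringsAsFactors = FALSE)

# Sanity check before normalizing: every negative control should be a row name
controls_found <- all(negGene_toy[, 1] %in% rownames(countMat_toy))
```

A check like `controls_found` can catch identifier mismatches (for example, differing case or whitespace) before any statistics are computed.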
Yajing Hao, Shuyang Zhang, Junhui Li, Guofeng Zhao, Xiang-Dong Fu
The function performs Z-score normalization as follows:
1. Extracts negative control samples from the input matrix using the identifiers provided in negGene.
2. For each column (readout), calculates the mean and standard deviation using only the negative control samples.
3. Applies Z-score transformation: \(Z_{ij} = (X_{ij} - \mu_{j}) / \sigma_{j}\) where \(X_{ij}\) is the raw value for gene \(i\) in readout \(j\), \(\mu_{j}\) is the mean of negative controls in readout \(j\), and \(\sigma_{j}\) is the standard deviation of negative controls in readout \(j\).
Because every readout is expressed in units of its own negative-control standard deviation, values can be compared across readouts, and genes/siRNAs whose Z-scores lie far from zero are those that deviate strongly from the negative control distribution.
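The steps described above can be sketched in base R as follows. This is a minimal re-implementation for illustration, not the package's source code: the function name `zscore_sketch` and the toy data are hypothetical, and the handling of the Type column mentioned in the return value is omitted.

```r
# Hypothetical sketch of the described procedure; not the package's own code.
zscore_sketch <- function(countMat, negGene) {
  # Negative control identifiers are read from the first column of negGene
  neg_ids <- as.character(negGene[, 1])
  neg_rows <- countMat[rownames(countMat) %in% neg_ids, , drop = FALSE]

  # Per-readout mean and standard deviation, negative controls only
  mu <- colMeans(neg_rows)
  sigma <- apply(neg_rows, 2, sd)

  # Z_ij = (X_ij - mu_j) / sigma_j, applied column by column
  sweep(sweep(countMat, 2, mu, "-"), 2, sigma, "/")
}

# Toy data: three negative controls and one test gene, two readouts
toy <- matrix(c(10, 12, 14, 20,
                 5,  7,  9,  1),
              nrow = 4,
              dimnames = list(c("neg1", "neg2", "neg3", "geneA"),
                              c("r1", "r2")))
toy_neg <- data.frame(gene = c("neg1", "neg2", "neg3"))

z <- zscore_sketch(toy, toy_neg)
z["geneA", ]  # r1 = 4, r2 = -3
```

In the toy data the negative controls have mean 12 and standard deviation 2 in `r1`, and mean 7 and standard deviation 2 in `r2`, so `geneA` sits 4 standard deviations above the controls in one readout and 3 below in the other.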
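# Load the example datasets
data(countMat)
data(negGene)
# Normalize against the negative controls and inspect the top-left corner
ZscoreVal <- Zscore(countMat, negGene)
ZscoreVal[1:5, 1:5]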
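One possible follow-up is to list the gene/readout combinations whose Z-scores lie far from zero. The sketch below uses a made-up matrix `z_toy` in place of the real output, and the cutoff of 2 is an assumption chosen only for illustration; the package does not prescribe a threshold here.

```r
# Toy Z-score matrix standing in for the output of Zscore(); values are invented.
z_toy <- matrix(c( 0.3, -2.8,  1.1, 3.4,
                  -0.5,  0.9, -2.1, 0.2),
                nrow = 4,
                dimnames = list(c("g1", "g2", "g3", "g4"), c("r1", "r2")))

cutoff <- 2  # assumed threshold, for illustration only

# Row/column positions of every entry with |Z| above the cutoff
hits <- which(abs(z_toy) > cutoff, arr.ind = TRUE)

# Tabulate the flagged gene/readout pairs with their Z-scores
hit_table <- data.frame(
  gene    = rownames(z_toy)[hits[, "row"]],
  readout = colnames(z_toy)[hits[, "col"]],
  z       = z_toy[hits]
)
hit_table
```

With real data, `z_toy` would be replaced by the matrix returned from `Zscore()`, and the cutoff would be chosen to suit the screen.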