ScottKnottESD
The Scott-Knott Effect Size Difference (ESD) test is an enhancement of the Scott-Knott test (which clusters distributions into statistically distinct ranks) that also takes effect size into consideration [Tantithamthavorn et al. (2017), http://dx.doi.org/10.1109/TSE.2016.2584050].
Example usage scenarios in the software engineering domain:
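To make "takes effect size into consideration" concrete, the sketch below computes a standardized effect size between two distributions in base R and checks whether it is negligible. It assumes Cohen's d with the conventional 0.2 threshold as the effect-size measure. The helper names `cohens_d` and `is_negligible` are hypothetical and are not part of the package API, and the package itself may use a different measure or threshold.

```r
# Cohen's d with a pooled standard deviation (base R only).
# Assumption: this is an illustrative effect-size measure, not the package's code.
cohens_d <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Hypothetical helper: TRUE when the absolute effect size is below the threshold.
# The 0.2 default is the conventional cut-off for a "negligible" Cohen's d.
is_negligible <- function(x, y, threshold = 0.2) {
  abs(cohens_d(x, y)) < threshold
}
```

Under this reading, two groups that a plain Scott-Knott split would separate could be treated as one rank when `is_negligible()` returns `TRUE` for them.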
(1) Ranking and identifying the most influential variables that are produced by random forests models or regression models.
Kabinna et al. "Examining the stability of logging statements." Proceedings of the International Conference on Software Analysis, Evolution, and Reengineering (SANER), 2016.
Li et al. "Towards just-in-time suggestions for log changes." Empirical Software Engineering (2016): 1-35.
Tian et al. "What are the characteristics of high-rated apps? A case study on free Android applications." Proceedings of the International Conference on Software Maintenance and Evolution (ICSME), 2015.
Tantithamthavorn et al. "The impact of mislabelling on the performance and interpretation of defect prediction models." Proceedings of the International Conference on Software Engineering (ICSE), 2015.
(2) Ranking and identifying the top-performing feature selection, classification, and model validation techniques for defect prediction models.
Rajbahadur et al. "The Impact Of Using Regression Models to Build Defect Classifiers." Proceedings of the International Conference on Mining Software Repositories (MSR), 2017.
Ghotra et al. "A Large-Scale Study of the Impact of Feature Selection Techniques on Defect Classification Models" Proceedings of the International Conference on Mining Software Repositories (MSR), 2017.
Tantithamthavorn et al. "An Empirical Comparison of Model Validation Techniques for Defect Prediction Models." IEEE Transactions on Software Engineering (TSE), 2017.
Tantithamthavorn et al. "Automated parameter optimization of classification techniques for defect prediction models." Proceedings of the 38th International Conference on Software Engineering (ICSE), 2016.
Ghotra et al. "Revisiting the impact of classification techniques on the performance of defect prediction models." Proceedings of the International Conference on Software Engineering (ICSE), 2015.
(3) Ranking and identifying the most frequent developer search tasks.
Xia et al. "What do developers search for on the web?" Empirical Software Engineering (2017): 1-37.
Installation
Install the current release from CRAN:
install.packages("ScottKnottESD")
Install the development version from GitHub:
install.packages("devtools")
devtools::install_github("klainfo/ScottKnottESD")
Example Usage
library(ScottKnottESD)
# An example dataset: 1,000 variable importance scores for each of 9 software metrics.
# The scores are generated by the Random Forests technique using 1,000 repetitions of the out-of-sample bootstrap.
example
sk <- sk_esd(example)
sk$original # Original Groups
sk$groups # Groups corrected for effect size
sk$reverse # Reversed Groups
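The same call can be applied to your own data. The sketch below assumes that `sk_esd()` accepts a wide-format data frame with one column per distribution to be ranked, mirroring the shape of the bundled `example` dataset. The `perf` data frame and its column names are hypothetical, and the values are randomly generated for illustration only.

```r
# Hypothetical data: performance scores of three techniques over 100 runs each.
set.seed(42)
perf <- data.frame(
  techA = rnorm(100, mean = 0.80, sd = 0.02),
  techB = rnorm(100, mean = 0.80, sd = 0.02),
  techC = rnorm(100, mean = 0.60, sd = 0.02)
)

# Run the test only if the package is installed.
if (requireNamespace("ScottKnottESD", quietly = TRUE)) {
  sk <- ScottKnottESD::sk_esd(perf)
  print(sk$groups) # rank assigned to each column
}
```

Columns whose distributions differ only negligibly would be expected to share a rank, while a clearly different column receives its own.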
Referencing ScottKnottESD
ScottKnottESD can be referenced as:
@article{tantithamthavorn2017tse,
Author={Tantithamthavorn, Chakkrit and McIntosh, Shane and Hassan, Ahmed E. and Matsumoto, Kenichi},
Title = {An Empirical Comparison of Model Validation Techniques for Defect Prediction Models},
Journal = {IEEE Transactions on Software Engineering (TSE)},
Volume = {43},
Number = {1},
Pages = {1--18},
Year = {2017}
}
@misc{ScottKnottESD,
title = {{ScottKnottESD: The Scott-Knott Effect Size Difference (ESD) Test}},
author = {Tantithamthavorn, Chakkrit},
year = {2017},
howpublished = {\url{https://cran.r-project.org/package=ScottKnottESD}}
}