NeEDS4BigData (version 1.0.1)

New Experimental Design Based Subsampling Methods for Big Data

Description

Subsampling methods for big data under different models and assumptions, starting with linear regression and extending to Generalised Linear Models, softmax regression, and quantile regression. In particular, the package covers the model-robust subsampling method proposed in Mahendran, A., Thompson, H., and McGree, J. M. (2023), where multiple models can describe the big data, and the subsampling framework for potentially misspecified Generalised Linear Models in Mahendran, A., Thompson, H., and McGree, J. M. (2025).
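To make the idea of subsampling concrete, here is a minimal base R sketch of the simplest baseline: uniform random subsampling for linear regression on simulated data. It does not call any NeEDS4BigData function. The data set size `N`, subsample size `r`, and coefficients `beta` are assumptions chosen only for illustration.

```r
# Illustrative only: base R, no NeEDS4BigData functions are called here.
set.seed(1)

N    <- 100000          # size of the simulated "big" data set (assumed)
r    <- 1000            # subsample size (hypothetical choice)
beta <- c(1, 2, -1)     # true coefficients used to simulate the response

X <- cbind(1, matrix(rnorm(N * 2), ncol = 2))   # intercept plus two covariates
y <- as.vector(X %*% beta + rnorm(N))           # Gaussian linear model

# Uniform random subsampling: every row has the same selection probability.
idx <- sample.int(N, size = r, replace = FALSE)

fit_full <- lm.fit(X, y)                 # full-data least squares, for reference
fit_sub  <- lm.fit(X[idx, ], y[idx])     # same model fitted on the subsample only

# Compare the two sets of estimates side by side.
round(rbind(full = fit_full$coefficients,
            subsample = fit_sub$coefficients), 3)
```

Optimal and model-robust subsampling methods replace the uniform selection probabilities with probabilities chosen to reduce the variance (or mean squared error) of the subsample estimator.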

Install

install.packages('NeEDS4BigData')

Monthly Downloads

177

Version

1.0.1

License

MIT + file LICENSE

Maintainer

Amalan Mahendran

Last Published

October 22nd, 2025

Functions in NeEDS4BigData (1.0.1)

Electric_consumption

Electric consumption data

modelMissPoiSub

Subsampling under Poisson regression for a potentially misspecified model

modelRobustLinSub

Model robust optimal subsampling for A- and L- optimality criteria under linear regression

modelRobustPoiSub

Model robust optimal subsampling for A- and L- optimality criteria under Poisson regression

modelRobustLogSub

Model robust optimal subsampling for A- and L- optimality criteria under logistic regression

modelMissLogSub

Subsampling under logistic regression for a potentially misspecified model

GenModelMissGLMdata

Generate data for Generalised Linear Models under model misspecification scenario

ALoptimalGLMSub

A- and L-optimality criteria based subsampling under Generalised Linear Models

Skin_segmentation

Skin segmentation data

GenGLMdata

Generate data for Generalised Linear Models

AoptimalGauLMSub

A-optimality criteria based subsampling under Gaussian Linear Models

AoptimalMCGLMSub

A-optimality criteria based subsampling under measurement constraints for Generalised Linear Models

LCCsampling

Local case control sampling for logistic regression

LeverageSampling

Basic and shrinkage leverage sampling for Generalised Linear Models

One_million_songs

One million songs data

plot_AMSE

Plotting AMSE outputs for the samples under model misspecification

plot_Beta

Plotting model parameter outputs after subsampling

modelMissLinSub

Subsampling under linear regression for a potentially misspecified model
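As a rough illustration of one technique named in the list above, the following base R sketch shows the textbook form of basic and shrinkage leverage sampling for a linear model. It is not the package's LeverageSampling implementation and does not use its arguments. The helper `leverage_fit`, the shrinkage weight `alpha`, and the simulated data are all hypothetical.

```r
# Illustrative only: a textbook sketch in base R, not the package's own code.
set.seed(2)

N     <- 50000          # size of the simulated data set (assumed)
r     <- 500            # subsample size (hypothetical choice)
alpha <- 0.9            # shrinkage weight between leverage and uniform (assumed)
beta  <- c(1, 2, -1)    # true coefficients used to simulate the response

X <- cbind(1, matrix(rt(N * 2, df = 5), ncol = 2))  # heavy-tailed covariates
y <- as.vector(X %*% beta + rnorm(N))

# Leverage scores are the diagonal of the hat matrix; with a thin QR
# decomposition X = QR they equal the squared row norms of Q.
Q <- qr.Q(qr(X))
h <- rowSums(Q^2)

p_basic  <- h / sum(h)                          # basic leverage probabilities
p_shrink <- alpha * p_basic + (1 - alpha) / N   # shrunk towards uniform

# Hypothetical helper: draw r rows with the given probabilities and fit
# least squares weighted by the inverse selection probabilities.
leverage_fit <- function(prob) {
  idx <- sample.int(N, size = r, replace = TRUE, prob = prob)
  lm.wfit(X[idx, , drop = FALSE], y[idx], w = 1 / prob[idx])$coefficients
}

est <- rbind(basic = leverage_fit(p_basic),
             shrinkage = leverage_fit(p_shrink))
round(est, 3)
```

Mixing the leverage probabilities with the uniform distribution keeps every selection probability away from zero, which bounds the inverse-probability weights and tends to stabilise the weighted estimator.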