⚠️ There's a newer version (3.1.2) of this package.

SparkR (version 2.4.6)

R Front End for 'Apache Spark'

Description

Provides an R front end for 'Apache Spark'.

Install

install.packages('SparkR')

Monthly Downloads

104

Version

2.4.6

License

Apache License (== 2.0)

Maintainer

Shivaram Venkataraman

Last Published

June 6th, 2020

Functions in SparkR (2.4.6)

ALSModel-class

S4 class that represents an ALSModel

DecisionTreeClassificationModel-class

S4 class that represents a DecisionTreeClassificationModel

AFTSurvivalRegressionModel-class

S4 class that represents an AFTSurvivalRegressionModel

GaussianMixtureModel-class

S4 class that represents a GaussianMixtureModel

BisectingKMeansModel-class

S4 class that represents a BisectingKMeansModel

GBTClassificationModel-class

S4 class that represents a GBTClassificationModel

DecisionTreeRegressionModel-class

S4 class that represents a DecisionTreeRegressionModel

alias

IsotonicRegressionModel-class

S4 class that represents an IsotonicRegressionModel

GroupedData-class

S4 class that represents a GroupedData

LogisticRegressionModel-class

S4 class that represents a LogisticRegressionModel

NaiveBayesModel-class

S4 class that represents a NaiveBayesModel

LinearSVCModel-class

S4 class that represents a LinearSVCModel

broadcast

GeneralizedLinearRegressionModel-class

S4 class that represents a generalized linear model

as.data.frame

Download data from a SparkDataFrame into an R data.frame

LDAModel-class

S4 class that represents an LDAModel

arrange

Arrange Rows by Variables

MultilayerPerceptronClassificationModel-class

S4 class that represents a MultilayerPerceptronClassificationModel

KMeansModel-class

S4 class that represents a KMeansModel

cache

Cache

KSTest-class

S4 class that represents a KSTest

cast

Casts the column to a different data type.

checkpoint

approxQuantile

Calculates the approximate quantiles of numerical columns of a SparkDataFrame

avg

awaitTermination

attach,SparkDataFrame-method

Attach SparkDataFrame to R search path

RandomForestRegressionModel-class

S4 class that represents a RandomForestRegressionModel

RandomForestClassificationModel-class

S4 class that represents a RandomForestClassificationModel

clearCache

Clear Cache

StreamingQuery-class

S4 class that represents a StreamingQuery

SparkDataFrame-class

S4 class that represents a SparkDataFrame

column_datetime_diff_functions

Date time arithmetic functions for Column operations

cancelJobGroup

Cancel active jobs for the specified group

collect

Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.

Clear current job group ID and its description

column_aggregate_functions

Aggregate functions for Column operations

between

coltypes

WindowSpec-class

S4 class that represents a WindowSpec

crosstab

Computes a pair-wise frequency table of the given columns

createDataFrame

Create a SparkDataFrame

column_datetime_functions

Date time functions for Column operations

createExternalTable

(Deprecated) Create an external table

column_collection_functions

Collection functions for Column operations

crossJoin

CrossJoin

column_math_functions

Math functions for Column operations

column

S4 class that represents a SparkDataFrame column

colnames

Column Names of SparkDataFrame

column_nonaggregate_functions

Non-aggregate functions for Column operations

column_string_functions

String functions for Column operations

corr

exceptAll

freqItems

Finding frequent items for columns, possibly with false positives

gapply

except

column_window_functions

Window functions for Column operations

describe

dim

Returns the dimensions of SparkDataFrame

asc

A set of operations working with SparkDataFrame columns

Drops the temporary view with the given view name in the catalog.

dtypes

DataTypes

cube

createOrReplaceTempView

Creates a temporary view using the given name.

createTable

Creates a table based on the dataset in a data source

currentDatabase

Returns the current default database

column_misc_functions

Miscellaneous functions for Column operations

hashCode

Compute the hashCode of an object

Returns the number of rows in a SparkDataFrame

(Deprecated) Drop Temporary Table

dropDuplicates

getLocalProperty

Get a local property set in this thread, or NULL if it is missing. See setLocalProperty.

getNumPartitions

glm,formula,ANY,SparkDataFrame-method

Generalized Linear Models (R-compliant)

insertInto

histogram

Compute histogram statistics for given column

Returns a list of columns for the given table/view in the specified database

fitted

Get fitted result from a k-means model

first

Return the first row of a SparkDataFrame

merge

Merges two data frames

mutate

Mutate

read.orc

Create a SparkDataFrame from an ORC file.

pivot

Pivot a column of the GroupedData and perform the specified aggregation.

read.ml

Load a fitted MLlib model from the input path.

join

Join

listDatabases

Returns a list of databases available

listFunctions

Returns a list of functions registered in the specified database

install.spark

Download and Install Apache Spark to a Local Directory

intersect

Intersect

listTables

Returns a list of tables or views in the specified database

Create a SparkDataFrame from a text file.

recoverPartitions

Recovers all the partitions in the directory of a table and updates the catalog

Set checkpoint directory

rbind

Union two or more SparkDataFrames

over

read.df

Load a SparkDataFrame

partitionBy

refreshByPath

Invalidates and refreshes all the cached data and metadata for any SparkDataFrame containing the given path

setCurrentDatabase

Sets the current default database

localCheckpoint

predict

Makes predictions from a MLlib model

print.jobj

Print a JVM object reference.

%in%

Match a column with given values.

read.jdbc

Create a SparkDataFrame representing the database table accessible via JDBC URL

read.json

Create a SparkDataFrame from a JSON file.

spark.als

Alternating Least Squares (ALS) for Collaborative Filtering

spark.addFile

Add a file or directory to be downloaded with this Spark job on every node.

dropna

A set of SparkDataFrame functions working with NA values

rollup

ncol

Returns the number of columns in a SparkDataFrame

rowsBetween

spark.bisectingKmeans

Bisecting K-Means Clustering Model

refreshTable

Invalidates and refreshes all the cached data and metadata of the given table

print.structType

Print a Spark StructType.

randomSplit

print.structField

Print a Spark StructField.

orderBy

Ordering Columns in a WindowSpec

(One-Sample) Kolmogorov-Smirnov Test

setJobGroup

Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.

setJobDescription

Set a human readable description of the current job.

spark.lapply

Run a function over a list of elements, distributing the computations with Spark

show

sampleBy

Returns a stratified sample without replacement

otherwise

saveAsTable

Save the contents of the SparkDataFrame to a data source as a table

spark.gbt

Gradient Boosted Tree Model for Regression and Classification

sparkR.init

(Deprecated) Initialize a new Spark Context

spark.getSparkFiles

Get the absolute path of a file added through spark.addFile.

spark.decisionTree

Decision Tree Model for Regression and Classification

Print Schema of a SparkDataFrame

Latent Dirichlet Allocation

spark.logit

Logistic Regression Model

sparkRHive.init

(Deprecated) Initialize a new HiveContext

sparkRSQL.init

(Deprecated) Initialize a new SQLContext

setLocalProperty

Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.

queryName

read.parquet

Create a SparkDataFrame from a Parquet file.

read.stream

Load a streaming SparkDataFrame

(Deprecated) Register Temporary Table

spark.naiveBayes

Naive Bayes Models

spark.mlp

Multilayer Perceptron Classification Model

Get version of Spark on which this application is running

spark.kmeans

K-Means Clustering Model

sparkR.uiWebUrl

Get the URL of the SparkUI instance for the current active SparkSession

spark.isoreg

Isotonic Regression Model

spark.randomForest

Random Forest Model for Regression and Classification

write.stream

Write the streaming SparkDataFrame to a data source.

write.text

Save the content of SparkDataFrame in a text file at the specified path.

Accelerated Failure Time (AFT) Survival Regression Model

union

Return a new SparkDataFrame containing the union of rows

storageLevel

StorageLevel

unionByName

Return a new SparkDataFrame containing the union of rows, matched by column names

str

Compactly display the structure of a dataset

take

Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame

write.orc

Save the contents of SparkDataFrame as an ORC file, preserving the schema.

Save the contents of SparkDataFrame as a Parquet file, preserving the schema.

Evaluate an R expression in an environment constructed from a SparkDataFrame

spark.gaussianMixture

Multivariate Gaussian Mixture Model (GMM)

write.df

Save the contents of SparkDataFrame to a data source.

write.jdbc

Save the content of SparkDataFrame to an external database table via JDBC.

spark.getSparkFilesRootDirectory

Get the root directory that contains files added through spark.addFile.

spark.glm

Generalized Linear Models

Call Static Java Methods

tableNames

Table Names

sparkR.conf

Get Runtime Config from the current active SparkSession

Get the existing SparkSession or initialize a new SparkSession.

uncacheTable

Uncache Table

sparkR.session.stop

Stop the Spark Session and Spark Context

tableToDF

Create a SparkDataFrame from a SparkSQL table or view

Save the contents of SparkDataFrame as a JSON file

write.ml

Saves the MLlib model to the input path

GBTRegressionModel-class

S4 class that represents a GBTRegressionModel

FPGrowthModel-class

S4 class that represents an FPGrowthModel