
SparkR (version 2.4.5)

R Front End for 'Apache Spark'

Description

Provides an R front end for 'Apache Spark'.

Install

install.packages('SparkR')
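Once installed, a session can be started and stopped with the functions listed below; a minimal sketch (the local master URL and app name are illustrative, not required):

```r
# Load SparkR and start (or reuse) a SparkSession on the local machine.
library(SparkR)
sparkR.session(master = "local[*]", appName = "SparkRExample")

# Convert a local R data.frame into a distributed SparkDataFrame and back.
df <- createDataFrame(faithful)
head(collect(df))

# Shut down the session when finished.
sparkR.session.stop()
```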

Monthly Downloads

104

Version

2.4.5

License

Apache License (== 2.0)

Maintainer

Shivaram Venkataraman

Last Published

February 7th, 2020

Functions in SparkR (2.4.5)

DecisionTreeClassificationModel-class

S4 class that represents a DecisionTreeClassificationModel
GBTRegressionModel-class

S4 class that represents a GBTRegressionModel
FPGrowthModel-class

S4 class that represents an FPGrowthModel
BisectingKMeansModel-class

S4 class that represents a BisectingKMeansModel
ALSModel-class

S4 class that represents an ALSModel
GBTClassificationModel-class

S4 class that represents a GBTClassificationModel
GaussianMixtureModel-class

S4 class that represents a GaussianMixtureModel
AFTSurvivalRegressionModel-class

S4 class that represents an AFTSurvivalRegressionModel
DecisionTreeRegressionModel-class

S4 class that represents a DecisionTreeRegressionModel
KMeansModel-class

S4 class that represents a KMeansModel
KSTest-class

S4 class that represents a KSTest
RandomForestRegressionModel-class

S4 class that represents a RandomForestRegressionModel
RandomForestClassificationModel-class

S4 class that represents a RandomForestClassificationModel
NaiveBayesModel-class

S4 class that represents a NaiveBayesModel
clearCache

Clear Cache
clearJobGroup

Clear current job group ID and its description
WindowSpec-class

S4 class that represents a WindowSpec
StreamingQuery-class

S4 class that represents a StreamingQuery
LDAModel-class

S4 class that represents an LDAModel
GeneralizedLinearRegressionModel-class

S4 class that represents a generalized linear model
LinearSVCModel-class

S4 class that represents a LinearSVCModel
LogisticRegressionModel-class

S4 class that represents a LogisticRegressionModel
arrange

Arrange Rows by Variables
as.data.frame

Download data from a SparkDataFrame into an R data.frame
MultilayerPerceptronClassificationModel-class

S4 class that represents a MultilayerPerceptronClassificationModel
GroupedData-class

S4 class that represents a GroupedData
awaitTermination

awaitTermination
between

between
coltypes

coltypes
collect

Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
column_nonaggregate_functions

Non-aggregate functions for Column operations
coalesce

Coalesce
column_datetime_diff_functions

Date time arithmetic functions for Column operations
cast

Casts the column to a different data type.
SparkDataFrame-class

S4 class that represents a SparkDataFrame
column_datetime_functions

Date time functions for Column operations
checkpoint

checkpoint
column

S4 class that represents a SparkDataFrame column
column_string_functions

String functions for Column operations
colnames

Column Names of SparkDataFrame
column_window_functions

Window functions for Column operations
corr

corr
asc

A set of operations working with SparkDataFrame columns
glm,formula,ANY,SparkDataFrame-method

Generalized Linear Models (R-compliant)
crosstab

Computes a pair-wise frequency table of the given columns
drop

drop
crossJoin

CrossJoin
distinct

Distinct
getNumPartitions

getNumPartitions
avg

avg
IsotonicRegressionModel-class

S4 class that represents an IsotonicRegressionModel
attach,SparkDataFrame-method

Attach SparkDataFrame to R search path
dapply

dapply
intersectAll

intersectAll
isActive

isActive
mutate

Mutate
join

Join
print.structType

Print a Spark StructType.
merge

Merges two data frames
last

last
print.structField

Print a Spark StructField.
cacheTable

Cache Table
currentDatabase

Returns the current default database
cube

cube
randomSplit

randomSplit
alias

alias
explain

Explain
dropDuplicates

dropDuplicates
install.spark

Download and Install Apache Spark to a Local Directory
dropTempTable

(Deprecated) Drop Temporary Table
dapplyCollect

dapplyCollect
intersect

Intersect
filter

Filter
cancelJobGroup

Cancel active jobs for the specified group
lastProgress

lastProgress
column_aggregate_functions

Aggregate functions for Column operations
rangeBetween

rangeBetween
broadcast

broadcast
approxQuantile

Calculates the approximate quantiles of numerical columns of a SparkDataFrame
dtypes

DataTypes
dropTempView

Drops the temporary view with the given view name in the catalog.
gapplyCollect

gapplyCollect
group_by

GroupBy
getLocalProperty

Get a local property set in this thread, or NULL if it is missing. See setLocalProperty.
cache

Cache
limit

Limit
not

!
nrow

Returns the number of rows in a SparkDataFrame
repartition

Repartition
column_math_functions

Math functions for Column operations
count

Count
column_misc_functions

Miscellaneous functions for Column operations
createOrReplaceTempView

Creates a temporary view using the given name.
cov

cov
column_collection_functions

Collection functions for Column operations
partitionBy

partitionBy
over

over
createDataFrame

Create a SparkDataFrame
setCheckpointDir

Set checkpoint directory
repartitionByRange

Repartition by range
createExternalTable

(Deprecated) Create an external table
dim

Returns the dimensions of a SparkDataFrame
describe

describe
endsWith

endsWith
createTable

Creates a table based on the dataset in a data source
first

Return the first row of a SparkDataFrame
%<=>%

%<=>%
rollup

rollup
read.orc

Create a SparkDataFrame from an ORC file.
read.ml

Load a fitted MLlib model from the input path.
except

except
fitted

Get fitted result from a k-means model
saveAsTable

Save the contents of the SparkDataFrame to a data source as a table
schema

Get schema object
rowsBetween

rowsBetween
histogram

Compute histogram statistics for given column
insertInto

insertInto
exceptAll

exceptAll
freqItems

Finding frequent items for columns, possibly with false positives
gapply

gapply
setLocalProperty

Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setCurrentDatabase

Sets the current default database
head

Head
listColumns

Returns a list of columns for the given table/view in the specified database
hint

hint
listDatabases

Returns a list of databases available
hashCode

Compute the hashCode of an object
setLogLevel

Set new log level
spark.kstest

(One-Sample) Kolmogorov-Smirnov Test
listFunctions

Returns a list of functions registered in the specified database
read.jdbc

Create a SparkDataFrame representing the database table accessible via JDBC URL
%in%

Match a column with given values.
refreshByPath

Invalidates and refreshes all the cached data and metadata for any SparkDataFrame that contains the given path
localCheckpoint

localCheckpoint
print.jobj

Print a JVM object reference.
read.json

Create a SparkDataFrame from a JSON file.
predict

Makes predictions from an MLlib model
listTables

Returns a list of tables or views in the specified database
dropna

A set of SparkDataFrame functions working with NA values
refreshTable

Invalidates and refreshes all the cached data and metadata of the given table
setJobGroup

Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
showDF

showDF
spark.isoreg

Isotonic Regression Model
setJobDescription

Set a human readable description of the current job.
show

show
isLocal

isLocal
ncol

Returns the number of columns in a SparkDataFrame
spark.kmeans

K-Means Clustering Model
persist

Persist
orderBy

Ordering Columns in a WindowSpec
isStreaming

isStreaming
rbind

Union two or more SparkDataFrames
pivot

Pivot a column of the GroupedData and perform the specified aggregation.
read.df

Load a SparkDataFrame
otherwise

otherwise
spark.lapply

Run a function over a list of elements, distributing the computations with Spark
spark.svmLinear

Linear SVM Model
queryName

queryName
printSchema

Print Schema of a SparkDataFrame
registerTempTable

(Deprecated) Register Temporary Table
read.parquet

Create a SparkDataFrame from a Parquet file.
read.stream

Load a streaming SparkDataFrame
sparkR.init

(Deprecated) Initialize a new Spark Context
sparkR.callJMethod

Call Java Methods
sparkR.uiWebUrl

Get the URL of the SparkUI instance for the current active SparkSession
select

Select
rename

rename
read.text

Create a SparkDataFrame from a text file.
selectExpr

SelectExpr
sampleBy

Returns a stratified sample without replacement
spark.bisectingKmeans

Bisecting K-Means Clustering Model
recoverPartitions

Recovers all the partitions in the directory of a table and updates the catalog
sample

Sample
sparkR.version

Get version of Spark on which this application is running
spark.decisionTree

Decision Tree Model for Regression and Classification
spark.addFile

Add a file or directory to be downloaded with this Spark job on every node.
spark.lda

Latent Dirichlet Allocation
spark.als

Alternating Least Squares (ALS) for Collaborative Filtering
spark.logit

Logistic Regression Model
substr

substr
with

Evaluate an R expression in an environment constructed from a SparkDataFrame
subset

Subset
windowPartitionBy

windowPartitionBy
sparkR.newJObject

Create Java Objects
withColumn

WithColumn
sparkRHive.init

(Deprecated) Initialize a new HiveContext
spark.glm

Generalized Linear Models
spark.getSparkFilesRootDirectory

Get the root directory that contains files added through spark.addFile.
spark.mlp

Multilayer Perceptron Classification Model
spark.naiveBayes

Naive Bayes Models
spark.randomForest

Random Forest Model for Regression and Classification
spark.survreg

Accelerated Failure Time (AFT) Survival Regression Model
stopQuery

stopQuery
status

status
sparkR.session

Get the existing SparkSession or initialize a new SparkSession.
sparkRSQL.init

(Deprecated) Initialize a new SQLContext
spark.fpGrowth

FP-growth
sparkR.session.stop

Stop the Spark Session and Spark Context
tableNames

Table Names
storageLevel

StorageLevel
str

Compactly display the structure of a dataset
unionByName

Return a new SparkDataFrame containing the union of rows, matched by column names
write.ml

Saves the MLlib model to the input path
write.json

Save the contents of SparkDataFrame as a JSON file
union

Return a new SparkDataFrame containing the union of rows
withWatermark

withWatermark
structField

structField
write.orc

Save the contents of SparkDataFrame as an ORC file, preserving the schema.
structType

structType
take

Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame
tables

Tables
write.parquet

Save the contents of SparkDataFrame as a Parquet file, preserving the schema.
tableToDF

Create a SparkDataFrame from a SparkSQL table or view
spark.gaussianMixture

Multivariate Gaussian Mixture Model (GMM)
windowOrderBy

windowOrderBy
unpersist

Unpersist
spark.gbt

Gradient Boosted Tree Model for Regression and Classification
write.jdbc

Save the content of SparkDataFrame to an external database table via JDBC.
write.df

Save the contents of SparkDataFrame to a data source.
sparkR.callJStatic

Call Static Java Methods
spark.getSparkFiles

Get the absolute path of a file added through spark.addFile.
sql

SQL Query
sparkR.conf

Get Runtime Config from the current active SparkSession
startsWith

startsWith
agg

summarize
summary

summary
toJSON

toJSON
uncacheTable

Uncache Table
write.text

Save the content of SparkDataFrame in a text file at the specified path.
write.stream

Write the streaming SparkDataFrame to a data source.
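As a quick orientation to how the pieces in this index fit together, a hedged end-to-end sketch combining session management, DataFrame creation, and grouped aggregation (the use of the built-in mtcars dataset is illustrative):

```r
library(SparkR)
sparkR.session()

# Build a SparkDataFrame from a local R data.frame.
df <- createDataFrame(mtcars)

# Filter rows, group by a column, aggregate, and collect the
# result back into a local R data.frame.
result <- collect(agg(group_by(filter(df, df$mpg > 20), "cyl"),
                      avg(df$hp)))
print(result)

sparkR.session.stop()
```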