⚠️ There's a newer version (3.1.2) of this package.

SparkR (version 2.4.6)

R Front End for 'Apache Spark'

Description

Provides an R front end for 'Apache Spark'.

Install

install.packages('SparkR')

Monthly Downloads

74

Version

2.4.6

License

Apache License (== 2.0)

Maintainer

Shivaram Venkataraman

Last Published

June 6th, 2020
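
Getting Started

Once installed, a typical session starts Spark, moves a local data.frame into the cluster, and stops the session when done. The following is a minimal sketch using the built-in faithful dataset (the appName is arbitrary):

library(SparkR)
sparkR.session(appName = "quickstart")  # start (or reuse) a Spark session
df <- createDataFrame(faithful)         # distribute a local data.frame
head(filter(df, df$waiting < 50))       # transformations run lazily on Spark executors
sparkR.session.stop()                   # shut the session down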

Functions in SparkR (2.4.6)

ALSModel-class

S4 class that represents an ALSModel
DecisionTreeClassificationModel-class

S4 class that represents a DecisionTreeClassificationModel
AFTSurvivalRegressionModel-class

S4 class that represents an AFTSurvivalRegressionModel
GaussianMixtureModel-class

S4 class that represents a GaussianMixtureModel
BisectingKMeansModel-class

S4 class that represents a BisectingKMeansModel
GBTClassificationModel-class

S4 class that represents a GBTClassificationModel
DecisionTreeRegressionModel-class

S4 class that represents a DecisionTreeRegressionModel
alias

Set an alias on a SparkDataFrame or a Column
IsotonicRegressionModel-class

S4 class that represents an IsotonicRegressionModel
GroupedData-class

S4 class that represents a GroupedData
LogisticRegressionModel-class

S4 class that represents a LogisticRegressionModel
NaiveBayesModel-class

S4 class that represents a NaiveBayesModel
LinearSVCModel-class

S4 class that represents a LinearSVCModel
broadcast

Mark a SparkDataFrame as small enough for use in broadcast joins
GeneralizedLinearRegressionModel-class

S4 class that represents a generalized linear model
as.data.frame

Download data from a SparkDataFrame into an R data.frame
LDAModel-class

S4 class that represents an LDAModel
arrange

Arrange Rows by Variables
MultilayerPerceptronClassificationModel-class

S4 class that represents a MultilayerPerceptronClassificationModel
KMeansModel-class

S4 class that represents a KMeansModel
cache

Cache a SparkDataFrame with the default storage level
KSTest-class

S4 class that represents a KSTest
cast

Casts the column to a different data type.
checkpoint

Returns a checkpointed version of the SparkDataFrame
approxQuantile

Calculates the approximate quantiles of numerical columns of a SparkDataFrame
avg

Aggregate function: returns the average of the values in a group
awaitTermination

Waits for the termination of a streaming query
attach,SparkDataFrame-method

Attach SparkDataFrame to R search path
RandomForestRegressionModel-class

S4 class that represents a RandomForestRegressionModel
RandomForestClassificationModel-class

S4 class that represents a RandomForestClassificationModel
clearCache

Removes all cached tables from the in-memory cache
StreamingQuery-class

S4 class that represents a StreamingQuery
SparkDataFrame-class

S4 class that represents a SparkDataFrame
column_datetime_diff_functions

Date time arithmetic functions for Column operations
cancelJobGroup

Cancel active jobs for the specified group
collect

Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
cacheTable

Caches the specified table in memory
coalesce

Return a new SparkDataFrame with exactly the given number of partitions, without shuffling
clearJobGroup

Clear current job group ID and its description
column_aggregate_functions

Aggregate functions for Column operations
between

Test whether a Column's values are between a lower and an upper bound (inclusive)
coltypes

Get or set the column types of a SparkDataFrame
WindowSpec-class

S4 class that represents a WindowSpec
crosstab

Computes a pair-wise frequency table of the given columns
createDataFrame

Create a SparkDataFrame
column_datetime_functions

Date time functions for Column operations
createExternalTable

(Deprecated) Create an external table
column_collection_functions

Collection functions for Column operations
crossJoin

Cartesian (cross) join of two SparkDataFrames
column_math_functions

Math functions for Column operations
column

S4 class that represents a SparkDataFrame column
colnames

Column Names of SparkDataFrame
column_nonaggregate_functions

Non-aggregate functions for Column operations
column_string_functions

String functions for Column operations
corr

Calculate the correlation of two columns of a SparkDataFrame
exceptAll

Return rows in this SparkDataFrame but not in another, keeping duplicates (SQL EXCEPT ALL)
freqItems

Finding frequent items for columns, possibly with false positives
gapply

Apply a function to each group of a SparkDataFrame
except

Return rows in this SparkDataFrame but not in another SparkDataFrame (SQL EXCEPT DISTINCT)
column_window_functions

Window functions for Column operations
describe

Compute summary statistics (count, mean, stddev, min, max) for numeric and string columns
dim

Returns the dimensions of a SparkDataFrame
asc

A set of operations working with SparkDataFrame columns
group_by

Group a SparkDataFrame by the specified columns, for use with aggregation
dapply

Apply a function to each partition of a SparkDataFrame
dapplyCollect

Apply a function to each partition of a SparkDataFrame and collect the result back as an R data.frame
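
As an illustration of the dapply family, a minimal sketch of dapplyCollect (assuming an active session; mtcars is a built-in R dataset and the kpl column is invented for the example):

df <- createDataFrame(mtcars)
# the function runs once per partition; unlike dapply, no output schema
# is required because the result is collected straight back to R
local <- dapplyCollect(df, function(part) {
  part$kpl <- part$mpg * 0.425  # hypothetical km-per-litre column
  part
})
head(local)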
dropTempView

Drops the temporary view with the given view name in the catalog.
dtypes

Return all column names and their data types as a list
cube

Create a multi-dimensional cube of a SparkDataFrame for aggregation on the given columns
createOrReplaceTempView

Creates a temporary view using the given name.
createTable

Creates a table based on the dataset in a data source
currentDatabase

Returns the current default database
column_misc_functions

Miscellaneous functions for Column operations
hashCode

Compute the hashCode of an object
isLocal

Returns TRUE if the collect and take methods can be run locally, without any Spark executors
isStreaming

Returns TRUE if this SparkDataFrame contains streaming data
cov

Calculate the sample covariance of two columns of a SparkDataFrame
count

Count the number of rows in a SparkDataFrame, or items in each group
nrow

Returns the number of rows in a SparkDataFrame
not

Logical negation of a Column or boolean expression (!)
%<=>%

Null-safe equality test for Columns
endsWith

Determines if a string Column ends with the given suffix
gapplyCollect

Apply a function to each group of a SparkDataFrame and collect the result back as an R data.frame
dropTempTable

(Deprecated) Drop Temporary Table
dropDuplicates

Return a SparkDataFrame with duplicate rows removed, optionally considering only a subset of columns
getLocalProperty

Get a local property set in this thread, or NULL if it is missing. See setLocalProperty.
getNumPartitions

Return the number of partitions of a SparkDataFrame
glm,formula,ANY,SparkDataFrame-method

Generalized Linear Models (R-compliant)
insertInto

Insert the contents of a SparkDataFrame into a table registered in the current SparkSession
histogram

Compute histogram statistics for given column
distinct

Return a new SparkDataFrame containing only the distinct rows
drop

Drop the specified column(s) from a SparkDataFrame
head

Return the first rows of a SparkDataFrame
hint

Specify a query hint on a SparkDataFrame
explain

Print the logical and physical query plans of a SparkDataFrame
persist

Persist a SparkDataFrame with the specified storage level
filter

Filter the rows of a SparkDataFrame according to a given condition
listColumns

Returns a list of columns for the given table/view in the specified database
fitted

Get fitted result from a k-means model
first

Return the first row of a SparkDataFrame
merge

Merges two data frames
mutate

Return a new SparkDataFrame with the specified columns added or replaced
read.orc

Create a SparkDataFrame from an ORC file.
pivot

Pivot a column of the GroupedData and perform the specified aggregation.
read.ml

Load a fitted MLlib model from the input path.
join

Join two SparkDataFrames based on the given join expression
listDatabases

Returns a list of databases available
listFunctions

Returns a list of functions registered in the specified database
install.spark

Download and Install Apache Spark to a Local Directory
intersect

Return rows present in both SparkDataFrames (SQL INTERSECT)
listTables

Returns a list of tables or views in the specified database
repartition

Return a new SparkDataFrame repartitioned by the given columns or into the given number of partitions
repartitionByRange

Repartition by range
isActive

Returns TRUE if the streaming query is actively running
intersectAll

Return rows present in both SparkDataFrames, keeping duplicates (SQL INTERSECT ALL)
read.text

Create a SparkDataFrame from a text file.
recoverPartitions

Recovers all the partitions in the directory of a table and updates the catalog
lastProgress

Prints the most recent progress update of a streaming query
last

Aggregate function: returns the last value in a group
limit

Limit the resulting SparkDataFrame to the specified number of rows
setCheckpointDir

Set checkpoint directory
rbind

Union two or more SparkDataFrames
over

Define a windowing Column over a WindowSpec
read.df

Load a SparkDataFrame
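
A minimal sketch of read.df (the file paths and CSV options here are purely illustrative):

# JSON source: one record per line
people <- read.df("examples/people.json", source = "json")
# CSV source, passing reader options through as strings
sales <- read.df("examples/sales.csv", source = "csv",
                 header = "true", inferSchema = "true")
printSchema(sales)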
partitionBy

Define the partitioning columns of a WindowSpec
refreshByPath

Invalidates and refreshes all the cached data and metadata for any SparkDataFrame that contains the given path
setCurrentDatabase

Sets the current default database
localCheckpoint

Returns a locally checkpointed version of a SparkDataFrame
predict

Makes predictions from an MLlib model
print.jobj

Print a JVM object reference.
%in%

Match a column with given values.
read.jdbc

Create a SparkDataFrame representing the database table accessible via JDBC URL
read.json

Create a SparkDataFrame from a JSON file.
spark.als

Alternating Least Squares (ALS) for Collaborative Filtering
spark.addFile

Add a file or directory to be downloaded with this Spark job on every node.
dropna

A set of SparkDataFrame functions working with NA values
rollup

Create a multi-dimensional rollup of a SparkDataFrame for aggregation on the given columns
ncol

Returns the number of columns in a SparkDataFrame
rowsBetween

Define the frame boundaries of a WindowSpec by row positions
spark.bisectingKmeans

Bisecting K-Means Clustering Model
refreshTable

Invalidates and refreshes all the cached data and metadata of the given table
print.structType

Print a Spark StructType.
randomSplit

Randomly splits a SparkDataFrame into multiple SparkDataFrames with the provided weights
print.structField

Print a Spark StructField.
orderBy

Ordering Columns in a WindowSpec
sample

Return a sampled subset of a SparkDataFrame
rangeBetween

Define the frame boundaries of a WindowSpec by a range of values
spark.kstest

(One-Sample) Kolmogorov-Smirnov Test
setJobGroup

Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobDescription

Set a human readable description of the current job.
spark.lapply

Run a function over a list of elements, distributing the computations with Spark
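
A minimal sketch of spark.lapply (the squaring function is purely illustrative):

# each element is processed in its own Spark task; the results
# come back as an ordinary local R list
squares <- spark.lapply(1:5, function(x) x^2)
unlist(squares)  # 1 4 9 16 25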
show

Print class and basic information of a Spark object
sampleBy

Returns a stratified sample without replacement
otherwise

Specify a default value for a when/otherwise conditional Column expression
saveAsTable

Save the contents of the SparkDataFrame to a data source as a table
spark.gbt

Gradient Boosted Tree Model for Regression and Classification
sparkR.init

(Deprecated) Initialize a new Spark Context
spark.getSparkFiles

Get the absolute path of a file added through spark.addFile.
spark.decisionTree

Decision Tree Model for Regression and Classification
spark.svmLinear

Linear SVM Model
sparkR.callJMethod

Call Java Methods
sparkR.newJObject

Create Java Objects
printSchema

Print Schema of a SparkDataFrame
schema

Get schema object
showDF

Print the first rows of a SparkDataFrame in tabular form
spark.lda

Latent Dirichlet Allocation
spark.logit

Logistic Regression Model
sparkRHive.init

(Deprecated) Initialize a new HiveContext
sparkRSQL.init

(Deprecated) Initialize a new SQLContext
setLocalProperty

Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
queryName

Returns the user-specified name of a streaming query
read.parquet

Create a SparkDataFrame from a Parquet file.
read.stream

Load a streaming SparkDataFrame
setLogLevel

Set a new log level ("ALL", "DEBUG", "ERROR", "FATAL", "INFO", "OFF", "TRACE", "WARN")
status

Prints the current status of a streaming query
registerTempTable

(Deprecated) Register Temporary Table
spark.naiveBayes

Naive Bayes Models
spark.mlp

Multilayer Perceptron Classification Model
stopQuery

Stops the execution of a streaming query
substr

Returns a substring of a character Column
subset

Return subsets of a SparkDataFrame according to given conditions
sql

Execute a SQL query and return the result as a SparkDataFrame
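
A minimal sketch combining createOrReplaceTempView and sql (the view name is illustrative; faithful is a built-in dataset):

df <- createDataFrame(faithful)
createOrReplaceTempView(df, "faithful_tbl")
long_eruptions <- sql("SELECT * FROM faithful_tbl WHERE eruptions > 4")
head(long_eruptions)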
structField

Create a structField object describing a single field of a schema
startsWith

Determines if a string Column starts with the given prefix
structType

Create a structType object describing the schema of a SparkDataFrame
sparkR.version

Get version of Spark on which this application is running
spark.kmeans

K-Means Clustering Model
sparkR.uiWebUrl

Get the URL of the SparkUI instance for the current active SparkSession
spark.isoreg

Isotonic Regression Model
spark.randomForest

Random Forest Model for Regression and Classification
write.stream

Write the streaming SparkDataFrame to a data source.
write.text

Save the content of SparkDataFrame in a text file at the specified path.
tables

Returns a SparkDataFrame listing the tables in the given database
rename

Rename existing columns of a SparkDataFrame
spark.survreg

Accelerated Failure Time (AFT) Survival Regression Model
union

Return a new SparkDataFrame containing the union of rows
storageLevel

Get the storage level of a SparkDataFrame
unionByName

Return a new SparkDataFrame containing the union of rows, matched by column names
str

Compactly display the structure of a dataset
take

Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame
write.orc

Save the contents of SparkDataFrame as an ORC file, preserving the schema.
select

Select columns or expressions from a SparkDataFrame
selectExpr

Select from a SparkDataFrame using a set of SQL expressions
write.parquet

Save the contents of SparkDataFrame as a Parquet file, preserving the schema.
windowPartitionBy

Create a WindowSpec with the given partitioning
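
A minimal sketch of a window specification, combining windowPartitionBy, orderBy, and over (df, dept, and salary are hypothetical):

ws <- orderBy(windowPartitionBy("dept"), "salary")
# rank each row by salary within its department
df$salary_rank <- over(rank(), ws)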
spark.fpGrowth

FP-growth
with

Evaluate an R expression in an environment constructed from a SparkDataFrame
spark.gaussianMixture

Multivariate Gaussian Mixture Model (GMM)
write.df

Save the contents of SparkDataFrame to a data source.
write.jdbc

Save the content of SparkDataFrame to an external database table via JDBC.
spark.getSparkFilesRootDirectory

Get the root directory that contains files added through spark.addFile.
spark.glm

Generalized Linear Models
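
A minimal sketch of spark.glm (iris is a built-in dataset; note that SparkR replaces the dots in its column names with underscores, and the output path is illustrative):

training <- createDataFrame(iris)
model <- spark.glm(training, Sepal_Width ~ Sepal_Length + Species,
                   family = "gaussian")
summary(model)                     # R-style coefficient table
write.ml(model, "/tmp/glm_model")  # persist the fitted model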
withWatermark

Define an event-time watermark for a streaming SparkDataFrame
withColumn

Return a new SparkDataFrame by adding a column or replacing an existing column of the same name
sparkR.callJStatic

Call Static Java Methods
tableNames

Returns the names of tables in the given database as a character vector
sparkR.conf

Get Runtime Config from the current active SparkSession
summary

Computes summary statistics of a SparkDataFrame or a fitted MLlib model
agg

Compute aggregates on a SparkDataFrame or GroupedData (summarize)
toJSON

Convert a SparkDataFrame into a SparkDataFrame of JSON strings
sparkR.session

Get the existing SparkSession or initialize a new SparkSession.
uncacheTable

Removes the specified table from the in-memory cache
sparkR.session.stop

Stop the Spark Session and Spark Context
tableToDF

Create a SparkDataFrame from a SparkSQL table or view
unpersist

Mark a SparkDataFrame as non-persistent, removing its blocks from memory and disk
windowOrderBy

Create a WindowSpec with the given ordering
write.json

Save the contents of SparkDataFrame as a JSON file
write.ml

Saves the MLlib model to the input path
GBTRegressionModel-class

S4 class that represents a GBTRegressionModel
FPGrowthModel-class

S4 class that represents an FPGrowthModel