SparkR (version 2.4.5)
R Front End for 'Apache Spark'
Description
Provides an R front end for 'Apache Spark'.
Install
install.packages('SparkR')
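A minimal quickstart sketch using functions from the index below (sparkR.session, createDataFrame, select, filter, count, head, sparkR.session.stop). This assumes a local Java installation; install.spark() can download Spark itself if no existing installation is found.

```r
library(SparkR)

install.spark()                      # one-time: download Apache Spark to a local directory
sparkR.session(master = "local[*]")  # get the existing SparkSession or initialize a new one

df <- createDataFrame(faithful)      # convert an R data.frame to a SparkDataFrame
head(select(df, "eruptions"))        # select a column; fetch the first rows as an R data.frame
count(filter(df, df$waiting > 70))   # filter rows on a column predicate and count them

sparkR.session.stop()                # stop the Spark session and Spark context
```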
License
Apache License (== 2.0)
Maintainer
Shivaram Venkataraman
Last Published
February 7th, 2020
Functions in SparkR (2.4.5)
DecisionTreeClassificationModel-class
S4 class that represents a DecisionTreeClassificationModel
GBTRegressionModel-class
S4 class that represents a GBTRegressionModel
FPGrowthModel-class
S4 class that represents an FPGrowthModel
BisectingKMeansModel-class
S4 class that represents a BisectingKMeansModel
ALSModel-class
S4 class that represents an ALSModel
GBTClassificationModel-class
S4 class that represents a GBTClassificationModel
GaussianMixtureModel-class
S4 class that represents a GaussianMixtureModel
AFTSurvivalRegressionModel-class
S4 class that represents an AFTSurvivalRegressionModel
DecisionTreeRegressionModel-class
S4 class that represents a DecisionTreeRegressionModel
KMeansModel-class
S4 class that represents a KMeansModel
KSTest-class
S4 class that represents a KSTest
RandomForestRegressionModel-class
S4 class that represents a RandomForestRegressionModel
RandomForestClassificationModel-class
S4 class that represents a RandomForestClassificationModel
NaiveBayesModel-class
S4 class that represents a NaiveBayesModel
clearCache
Clear Cache
clearJobGroup
Clear current job group ID and its description
WindowSpec-class
S4 class that represents a WindowSpec
StreamingQuery-class
S4 class that represents a StreamingQuery
LDAModel-class
S4 class that represents an LDAModel
GeneralizedLinearRegressionModel-class
S4 class that represents a generalized linear model
LinearSVCModel-class
S4 class that represents a LinearSVCModel
LogisticRegressionModel-class
S4 class that represents a LogisticRegressionModel
arrange
Arrange Rows by Variables
as.data.frame
Download data from a SparkDataFrame into an R data.frame
MultilayerPerceptronClassificationModel-class
S4 class that represents a MultilayerPerceptronClassificationModel
GroupedData-class
S4 class that represents a GroupedData
awaitTermination
awaitTermination
between
between
coltypes
coltypes
collect
Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
column_nonaggregate_functions
Non-aggregate functions for Column operations
coalesce
Coalesce
column_datetime_diff_functions
Date time arithmetic functions for Column operations
cast
Casts the column to a different data type.
SparkDataFrame-class
S4 class that represents a SparkDataFrame
column_datetime_functions
Date time functions for Column operations
checkpoint
checkpoint
column
S4 class that represents a SparkDataFrame column
column_string_functions
String functions for Column operations
colnames
Column Names of SparkDataFrame
column_window_functions
Window functions for Column operations
corr
corr
asc
A set of operations working with SparkDataFrame columns
glm,formula,ANY,SparkDataFrame-method
Generalized Linear Models (R-compliant)
crosstab
Computes a pair-wise frequency table of the given columns
drop
drop
crossJoin
CrossJoin
distinct
Distinct
getNumPartitions
getNumPartitions
avg
avg
IsotonicRegressionModel-class
S4 class that represents an IsotonicRegressionModel
attach,SparkDataFrame-method
Attach SparkDataFrame to R search path
dapply
dapply
intersectAll
intersectAll
isActive
isActive
mutate
Mutate
join
Join
print.structType
Print a Spark StructType.
merge
Merges two data frames
last
last
print.structField
Print a Spark StructField.
cacheTable
Cache Table
currentDatabase
Returns the current default database
cube
cube
randomSplit
randomSplit
alias
alias
explain
Explain
dropDuplicates
dropDuplicates
install.spark
Download and Install Apache Spark to a Local Directory
dropTempTable
(Deprecated) Drop Temporary Table
dapplyCollect
dapplyCollect
intersect
Intersect
filter
Filter
cancelJobGroup
Cancel active jobs for the specified group
lastProgress
lastProgress
column_aggregate_functions
Aggregate functions for Column operations
rangeBetween
rangeBetween
broadcast
broadcast
approxQuantile
Calculates the approximate quantiles of numerical columns of a SparkDataFrame
dtypes
DataTypes
dropTempView
Drops the temporary view with the given view name in the catalog.
gapplyCollect
gapplyCollect
group_by
GroupBy
getLocalProperty
Get a local property set in this thread, or
NULL
if it is missing. See
setLocalProperty
.
cache
Cache
limit
Limit
not
!
nrow
Returns the number of rows in a SparkDataFrame
repartition
Repartition
column_math_functions
Math functions for Column operations
count
Count
column_misc_functions
Miscellaneous functions for Column operations
createOrReplaceTempView
Creates a temporary view using the given name.
cov
cov
column_collection_functions
Collection functions for Column operations
partitionBy
partitionBy
over
over
createDataFrame
Create a SparkDataFrame
setCheckpointDir
Set checkpoint directory
repartitionByRange
Repartition by range
createExternalTable
(Deprecated) Create an external table
dim
Returns the dimensions of SparkDataFrame
describe
describe
endsWith
endsWith
createTable
Creates a table based on the dataset in a data source
first
Return the first row of a SparkDataFrame
%<=>%
%<=>%
rollup
rollup
read.orc
Create a SparkDataFrame from an ORC file.
read.ml
Load a fitted MLlib model from the input path.
except
except
fitted
Get fitted result from a k-means model
saveAsTable
Save the contents of the SparkDataFrame to a data source as a table
schema
Get schema object
rowsBetween
rowsBetween
histogram
Compute histogram statistics for given column
insertInto
insertInto
exceptAll
exceptAll
freqItems
Finding frequent items for columns, possibly with false positives
gapply
gapply
setLocalProperty
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
setCurrentDatabase
Sets the current default database
head
Head
listColumns
Returns a list of columns for the given table/view in the specified database
hint
hint
listDatabases
Returns a list of databases available
hashCode
Compute the hashCode of an object
setLogLevel
Set new log level
spark.kstest
(One-Sample) Kolmogorov-Smirnov Test
listFunctions
Returns a list of functions registered in the specified database
read.jdbc
Create a SparkDataFrame representing the database table accessible via JDBC URL
%in%
Match a column with given values.
refreshByPath
Invalidates and refreshes all the cached data and metadata for the SparkDataFrame that contains the given path
localCheckpoint
localCheckpoint
print.jobj
Print a JVM object reference.
read.json
Create a SparkDataFrame from a JSON file.
predict
Makes predictions from an MLlib model
listTables
Returns a list of tables or views in the specified database
dropna
A set of SparkDataFrame functions working with NA values
refreshTable
Invalidates and refreshes all the cached data and metadata of the given table
setJobGroup
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
showDF
showDF
spark.isoreg
Isotonic Regression Model
setJobDescription
Set a human readable description of the current job.
show
show
isLocal
isLocal
ncol
Returns the number of columns in a SparkDataFrame
spark.kmeans
K-Means Clustering Model
persist
Persist
orderBy
Ordering Columns in a WindowSpec
isStreaming
isStreaming
rbind
Union two or more SparkDataFrames
pivot
Pivot a column of the GroupedData and perform the specified aggregation.
read.df
Load a SparkDataFrame
otherwise
otherwise
spark.lapply
Run a function over a list of elements, distributing the computations with Spark
spark.svmLinear
Linear SVM Model
queryName
queryName
printSchema
Print Schema of a SparkDataFrame
registerTempTable
(Deprecated) Register Temporary Table
read.parquet
Create a SparkDataFrame from a Parquet file.
read.stream
Load a streaming SparkDataFrame
sparkR.init
(Deprecated) Initialize a new Spark Context
sparkR.callJMethod
Call Java Methods
sparkR.uiWebUrl
Get the URL of the SparkUI instance for the current active SparkSession
select
Select
rename
rename
read.text
Create a SparkDataFrame from a text file.
selectExpr
SelectExpr
sampleBy
Returns a stratified sample without replacement
spark.bisectingKmeans
Bisecting K-Means Clustering Model
recoverPartitions
Recovers all the partitions in the directory of a table and updates the catalog
sample
Sample
sparkR.version
Get version of Spark on which this application is running
spark.decisionTree
Decision Tree Model for Regression and Classification
spark.addFile
Add a file or directory to be downloaded with this Spark job on every node.
spark.lda
Latent Dirichlet Allocation
spark.als
Alternating Least Squares (ALS) for Collaborative Filtering
spark.logit
Logistic Regression Model
substr
substr
with
Evaluate an R expression in an environment constructed from a SparkDataFrame
subset
Subset
windowPartitionBy
windowPartitionBy
sparkR.newJObject
Create Java Objects
withColumn
WithColumn
sparkRHive.init
(Deprecated) Initialize a new HiveContext
spark.glm
Generalized Linear Models
spark.getSparkFilesRootDirectory
Get the root directory that contains files added through spark.addFile.
spark.mlp
Multilayer Perceptron Classification Model
spark.naiveBayes
Naive Bayes Models
spark.randomForest
Random Forest Model for Regression and Classification
spark.survreg
Accelerated Failure Time (AFT) Survival Regression Model
stopQuery
stopQuery
status
status
sparkR.session
Get the existing SparkSession or initialize a new SparkSession.
sparkRSQL.init
(Deprecated) Initialize a new SQLContext
spark.fpGrowth
FP-growth
sparkR.session.stop
Stop the Spark Session and Spark Context
tableNames
Table Names
storageLevel
StorageLevel
str
Compactly display the structure of a dataset
unionByName
Return a new SparkDataFrame containing the union of rows, matched by column names
write.ml
Saves the MLlib model to the input path
write.json
Save the contents of SparkDataFrame as a JSON file
union
Return a new SparkDataFrame containing the union of rows
withWatermark
withWatermark
structField
structField
write.orc
Save the contents of SparkDataFrame as an ORC file, preserving the schema.
structType
structType
take
Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame
tables
Tables
write.parquet
Save the contents of SparkDataFrame as a Parquet file, preserving the schema.
tableToDF
Create a SparkDataFrame from a SparkSQL table or view
spark.gaussianMixture
Multivariate Gaussian Mixture Model (GMM)
windowOrderBy
windowOrderBy
unpersist
Unpersist
spark.gbt
Gradient Boosted Tree Model for Regression and Classification
write.jdbc
Save the content of SparkDataFrame to an external database table via JDBC.
write.df
Save the contents of SparkDataFrame to a data source.
sparkR.callJStatic
Call Static Java Methods
spark.getSparkFiles
Get the absolute path of a file added through spark.addFile.
sql
SQL Query
sparkR.conf
Get Runtime Config from the current active SparkSession
startsWith
startsWith
agg
summarize
summary
summary
toJSON
toJSON
uncacheTable
Uncache Table
write.text
Save the content of SparkDataFrame in a text file at the specified path.
write.stream
Write the streaming SparkDataFrame to a data source.