SparkR (version 2.4.6)
R Front End for 'Apache Spark'
Description
Provides an R Front end for 'Apache Spark'.
Versions
3.1.2, 2.4.6, 2.4.5, 2.4.4, 2.4.3, 2.4.2, 2.4.1, 2.3.0, 2.1.2
Install
install.packages('SparkR')
Monthly Downloads
74
Version
2.4.6
License
Apache License (== 2.0)
Maintainer
Shivaram Venkataraman
Last Published
June 6th, 2020
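For orientation, a minimal quick-start sketch (the master/appName values and the built-in faithful dataset are illustrative assumptions, not part of this index):

  # Install SparkR from CRAN, then start (or reuse) a local SparkSession
  install.packages("SparkR")
  library(SparkR)
  sparkR.session(master = "local[*]", appName = "SparkR-quickstart")

  # Turn a local R data.frame into a distributed SparkDataFrame and peek at it
  df <- createDataFrame(faithful)
  head(df)

  # Shut down the session when done
  sparkR.session.stop()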
Functions in SparkR (2.4.6)
ALSModel-class
S4 class that represents an ALSModel
DecisionTreeClassificationModel-class
S4 class that represents a DecisionTreeClassificationModel
AFTSurvivalRegressionModel-class
S4 class that represents an AFTSurvivalRegressionModel
GaussianMixtureModel-class
S4 class that represents a GaussianMixtureModel
BisectingKMeansModel-class
S4 class that represents a BisectingKMeansModel
GBTClassificationModel-class
S4 class that represents a GBTClassificationModel
DecisionTreeRegressionModel-class
S4 class that represents a DecisionTreeRegressionModel
alias
alias
IsotonicRegressionModel-class
S4 class that represents an IsotonicRegressionModel
GroupedData-class
S4 class that represents a GroupedData
LogisticRegressionModel-class
S4 class that represents a LogisticRegressionModel
NaiveBayesModel-class
S4 class that represents a NaiveBayesModel
LinearSVCModel-class
S4 class that represents a LinearSVCModel
broadcast
broadcast
GeneralizedLinearRegressionModel-class
S4 class that represents a generalized linear model
as.data.frame
Download data from a SparkDataFrame into an R data.frame
LDAModel-class
S4 class that represents an LDAModel
arrange
Arrange Rows by Variables
MultilayerPerceptronClassificationModel-class
S4 class that represents a MultilayerPerceptronClassificationModel
KMeansModel-class
S4 class that represents a KMeansModel
cache
Cache
KSTest-class
S4 class that represents a KSTest
cast
Casts the column to a different data type.
checkpoint
checkpoint
approxQuantile
Calculates the approximate quantiles of numerical columns of a SparkDataFrame
avg
avg
awaitTermination
awaitTermination
attach,SparkDataFrame-method
Attach SparkDataFrame to R search path
RandomForestRegressionModel-class
S4 class that represents a RandomForestRegressionModel
RandomForestClassificationModel-class
S4 class that represents a RandomForestClassificationModel
clearCache
Clear Cache
StreamingQuery-class
S4 class that represents a StreamingQuery
SparkDataFrame-class
S4 class that represents a SparkDataFrame
column_datetime_diff_functions
Date time arithmetic functions for Column operations
cancelJobGroup
Cancel active jobs for the specified group
collect
Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
cacheTable
Cache Table
coalesce
Coalesce
clearJobGroup
Clear current job group ID and its description
column_aggregate_functions
Aggregate functions for Column operations
between
between
coltypes
coltypes
WindowSpec-class
S4 class that represents a WindowSpec
crosstab
Computes a pair-wise frequency table of the given columns
createDataFrame
Create a SparkDataFrame
column_datetime_functions
Date time functions for Column operations
createExternalTable
(Deprecated) Create an external table
column_collection_functions
Collection functions for Column operations
crossJoin
CrossJoin
column_math_functions
Math functions for Column operations
column
S4 class that represents a SparkDataFrame column
colnames
Column Names of SparkDataFrame
column_nonaggregate_functions
Non-aggregate functions for Column operations
column_string_functions
String functions for Column operations
corr
corr
exceptAll
exceptAll
freqItems
Finding frequent items for columns, possibly with false positives
gapply
gapply
except
except
column_window_functions
Window functions for Column operations
describe
describe
dim
Returns the dimensions of a SparkDataFrame
asc
A set of operations working with SparkDataFrame columns
group_by
GroupBy
dapply
dapply
dapplyCollect
dapplyCollect
dropTempView
Drops the temporary view with the given view name in the catalog.
dtypes
DataTypes
cube
cube
createOrReplaceTempView
Creates a temporary view using the given name.
createTable
Creates a table based on the dataset in a data source
currentDatabase
Returns the current default database
column_misc_functions
Miscellaneous functions for Column operations
hashCode
Compute the hashCode of an object
isLocal
isLocal
isStreaming
isStreaming
cov
cov
count
Count
nrow
Returns the number of rows in a SparkDataFrame
not
!
%<=>%
%<=>%
endsWith
endsWith
gapplyCollect
gapplyCollect
dropTempTable
(Deprecated) Drop Temporary Table
dropDuplicates
dropDuplicates
getLocalProperty
Get a local property set in this thread, or NULL if it is missing. See setLocalProperty.
getNumPartitions
getNumPartitions
glm,formula,ANY,SparkDataFrame-method
Generalized Linear Models (R-compliant)
insertInto
insertInto
histogram
Compute histogram statistics for given column
distinct
Distinct
drop
drop
head
Head
hint
hint
explain
Explain
persist
Persist
filter
Filter
listColumns
Returns a list of columns for the given table/view in the specified database
fitted
Get fitted result from a k-means model
first
Return the first row of a SparkDataFrame
merge
Merges two data frames
mutate
Mutate
read.orc
Create a SparkDataFrame from an ORC file.
pivot
Pivot a column of the GroupedData and perform the specified aggregation.
read.ml
Load a fitted MLlib model from the input path.
join
Join
listDatabases
Returns a list of databases available
listFunctions
Returns a list of functions registered in the specified database
install.spark
Download and Install Apache Spark to a Local Directory
intersect
Intersect
listTables
Returns a list of tables or views in the specified database
repartition
Repartition
repartitionByRange
Repartition by range
isActive
isActive
intersectAll
intersectAll
read.text
Create a SparkDataFrame from a text file.
recoverPartitions
Recovers all the partitions in the directory of a table and updates the catalog
lastProgress
lastProgress
last
last
limit
Limit
setCheckpointDir
Set checkpoint directory
rbind
Union two or more SparkDataFrames
over
over
read.df
Load a SparkDataFrame
partitionBy
partitionBy
refreshByPath
Invalidates and refreshes all the cached data and metadata for any SparkDataFrame containing the given path
setCurrentDatabase
Sets the current default database
localCheckpoint
localCheckpoint
predict
Makes predictions from an MLlib model
print.jobj
Print a JVM object reference.
%in%
Match a column with given values.
read.jdbc
Create a SparkDataFrame representing the database table accessible via JDBC URL
read.json
Create a SparkDataFrame from a JSON file.
spark.als
Alternating Least Squares (ALS) for Collaborative Filtering
spark.addFile
Add a file or directory to be downloaded with this Spark job on every node.
dropna
A set of SparkDataFrame functions working with NA values
rollup
rollup
ncol
Returns the number of columns in a SparkDataFrame
rowsBetween
rowsBetween
spark.bisectingKmeans
Bisecting K-Means Clustering Model
refreshTable
Invalidates and refreshes all the cached data and metadata of the given table
print.structType
Print a Spark StructType.
randomSplit
randomSplit
print.structField
Print a Spark StructField.
orderBy
Ordering Columns in a WindowSpec
sample
Sample
rangeBetween
rangeBetween
spark.kstest
(One-Sample) Kolmogorov-Smirnov Test
setJobGroup
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
setJobDescription
Set a human readable description of the current job.
spark.lapply
Run a function over a list of elements, distributing the computations with Spark
show
show
sampleBy
Returns a stratified sample without replacement
otherwise
otherwise
saveAsTable
Save the contents of the SparkDataFrame to a data source as a table
spark.gbt
Gradient Boosted Tree Model for Regression and Classification
sparkR.init
(Deprecated) Initialize a new Spark Context
spark.getSparkFiles
Get the absolute path of a file added through spark.addFile.
spark.decisionTree
Decision Tree Model for Regression and Classification
spark.svmLinear
Linear SVM Model
sparkR.callJMethod
Call Java Methods
sparkR.newJObject
Create Java Objects
printSchema
Print Schema of a SparkDataFrame
schema
Get schema object
showDF
showDF
spark.lda
Latent Dirichlet Allocation
spark.logit
Logistic Regression Model
sparkRHive.init
(Deprecated) Initialize a new HiveContext
sparkRSQL.init
(Deprecated) Initialize a new SQLContext
setLocalProperty
Set a local property that affects jobs submitted from this thread, such as the Spark fair scheduler pool.
queryName
queryName
read.parquet
Create a SparkDataFrame from a Parquet file.
read.stream
Load a streaming SparkDataFrame
setLogLevel
Set new log level
status
status
registerTempTable
(Deprecated) Register Temporary Table
spark.naiveBayes
Naive Bayes Models
spark.mlp
Multilayer Perceptron Classification Model
stopQuery
stopQuery
substr
substr
subset
Subset
sql
SQL Query
structField
structField
startsWith
startsWith
structType
structType
sparkR.version
Get version of Spark on which this application is running
spark.kmeans
K-Means Clustering Model
sparkR.uiWebUrl
Get the URL of the SparkUI instance for the current active SparkSession
spark.isoreg
Isotonic Regression Model
spark.randomForest
Random Forest Model for Regression and Classification
write.stream
Write the streaming SparkDataFrame to a data source.
write.text
Save the content of SparkDataFrame in a text file at the specified path.
tables
Tables
rename
rename
spark.survreg
Accelerated Failure Time (AFT) Survival Regression Model
union
Return a new SparkDataFrame containing the union of rows
storageLevel
StorageLevel
unionByName
Return a new SparkDataFrame containing the union of rows, matched by column names
str
Compactly display the structure of a dataset
take
Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame
write.orc
Save the contents of SparkDataFrame as an ORC file, preserving the schema.
select
Select
selectExpr
SelectExpr
write.parquet
Save the contents of SparkDataFrame as a Parquet file, preserving the schema.
windowPartitionBy
windowPartitionBy
spark.fpGrowth
FP-growth
with
Evaluate an R expression in an environment constructed from a SparkDataFrame
spark.gaussianMixture
Multivariate Gaussian Mixture Model (GMM)
write.df
Save the contents of SparkDataFrame to a data source.
write.jdbc
Save the content of SparkDataFrame to an external database table via JDBC.
spark.getSparkFilesRootDirectory
Get the root directory that contains files added through spark.addFile.
spark.glm
Generalized Linear Models
withWatermark
withWatermark
withColumn
WithColumn
sparkR.callJStatic
Call Static Java Methods
tableNames
Table Names
sparkR.conf
Get Runtime Config from the current active SparkSession
summary
summary
agg
summarize
toJSON
toJSON
sparkR.session
Get the existing SparkSession or initialize a new SparkSession.
uncacheTable
Uncache Table
sparkR.session.stop
Stop the Spark Session and Spark Context
tableToDF
Create a SparkDataFrame from a SparkSQL table or view
unpersist
Unpersist
windowOrderBy
windowOrderBy
write.json
Save the contents of SparkDataFrame as a JSON file
write.ml
Saves the MLlib model to the input path
GBTRegressionModel-class
S4 class that represents a GBTRegressionModel
FPGrowthModel-class
S4 class that represents an FPGrowthModel
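To illustrate how the functions indexed above fit together, a hedged end-to-end sketch (the file path, view name, and the iris dataset are illustrative assumptions, not part of this index):

  library(SparkR)
  sparkR.session()

  # Read a JSON file into a SparkDataFrame (path is hypothetical)
  people <- read.json("examples/src/main/resources/people.json")
  printSchema(people)

  # Column operations: filter rows, then aggregate with group_by/summarize
  adults <- filter(people, people$age > 21)
  counts <- summarize(group_by(people, people$age), total = n(people$age))

  # Register a temporary view and query it with SQL
  createOrReplaceTempView(people, "people")
  teens <- sql("SELECT name FROM people WHERE age BETWEEN 13 AND 19")

  # Bring a small result set back as a local R data.frame
  head(collect(teens))

  # Fit a generalized linear model with MLlib and persist it
  # (createDataFrame replaces '.' in iris column names with '_')
  model <- spark.glm(createDataFrame(iris), Sepal_Length ~ Sepal_Width, family = "gaussian")
  write.ml(model, "/tmp/sparkr-glm-model")

  sparkR.session.stop()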