SparkR (version 2.1.2)
R Frontend for Apache Spark
Description
Provides an R Frontend for Apache Spark.
Install: install.packages('SparkR')
Monthly Downloads: 155
Version: 2.1.2
License: Apache License (== 2.0)
Maintainer: Shivaram Venkataraman
Last Published: October 12th, 2017
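As a quick-start sketch of the package (hedged: this assumes Java and a local Spark distribution are available; the app name is illustrative):

```r
# Minimal SparkR session sketch -- requires Java and a Spark distribution.
library(SparkR)

# install.spark() can download Spark to a local directory if none is installed.

# Start (or reuse) a SparkSession.
sparkR.session(appName = "quickstart")

# Create a SparkDataFrame from a built-in R data.frame and inspect it.
df <- createDataFrame(faithful)
head(df)
printSchema(df)

# Stop the session when done.
sparkR.session.stop()
```

All functions used here (sparkR.session, createDataFrame, head, printSchema, sparkR.session.stop, install.spark) appear in the index below.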
Functions in SparkR (2.1.2)
AFTSurvivalRegressionModel-class
S4 class that represents an AFTSurvivalRegressionModel
ALSModel-class
S4 class that represents an ALSModel
GBTClassificationModel-class
S4 class that represents a GBTClassificationModel
GBTRegressionModel-class
S4 class that represents a GBTRegressionModel
KMeansModel-class
S4 class that represents a KMeansModel
KSTest-class
S4 class that represents a KSTest
GroupedData-class
S4 class that represents a GroupedData
IsotonicRegressionModel-class
S4 class that represents an IsotonicRegressionModel
GaussianMixtureModel-class
S4 class that represents a GaussianMixtureModel
GeneralizedLinearRegressionModel-class
S4 class that represents a generalized linear model
MultilayerPerceptronClassificationModel-class
S4 class that represents a MultilayerPerceptronClassificationModel
NaiveBayesModel-class
S4 class that represents a NaiveBayesModel
LDAModel-class
S4 class that represents an LDAModel
LogisticRegressionModel-class
S4 class that represents a LogisticRegressionModel
asin
asin
abs
abs
acos
acos
as.data.frame
Download data from a SparkDataFrame into an R data.frame
atan
atan
between
between
bin
bin
cancelJobGroup
Cancel active jobs for the specified group
arrange
Arrange Rows by Variables
array_contains
array_contains
avg
avg
ascii
ascii
bitwiseNOT
bitwiseNOT
bround
bround
coltypes
coltypes
RandomForestClassificationModel-class
S4 class that represents a RandomForestClassificationModel
RandomForestRegressionModel-class
S4 class that represents a RandomForestRegressionModel
add_months
add_months
alias
alias
base64
base64
coalesce
Coalesce
collect
Collects all the elements of a SparkDataFrame and coerces them into an R data.frame.
cast
Casts the column to a different data type.
conv
conv
corr
corr
crossJoin
CrossJoin
column
S4 class that represents a SparkDataFrame column
asc
A set of operations working with SparkDataFrame columns
colnames
Column Names of SparkDataFrame
cume_dist
cume_dist
crosstab
Computes a pair-wise frequency table of the given columns
dayofyear
dayofyear
decode
decode
dropDuplicates
dropDuplicates
dropTempTable
(Deprecated) Drop Temporary Table
factorial
factorial
filter
Filter
generateAliasesForIntersectedCols
Creates a list of columns by replacing the intersected ones with aliases
getNumPartitions
getNumPartitions
dapply
dapply
dapplyCollect
dapplyCollect
date_add
date_add
distinct
Distinct
drop
drop
cache
Cache
cacheTable
Cache Table
clearCache
Clear Cache
clearJobGroup
Clear current job group ID and its description
SparkDataFrame-class
S4 class that represents a SparkDataFrame
WindowSpec-class
S4 class that represents a WindowSpec
approxCountDistinct
Returns the approximate number of distinct items in a group
approxQuantile
Calculates the approximate quantiles of a numerical column of a SparkDataFrame
atan2
atan2
attach
Attach SparkDataFrame to R search path
cbrt
cbrt
ceil
Computes the ceiling of the given value
count
Count
hour
hour
hypot
hypot
insertInto
insertInto
install.spark
Download and Install Apache Spark to a Local Directory
floor
floor
format_number
format_number
glm
Generalized Linear Models (R-compliant)
greatest
greatest
group_by
GroupBy
cos
cos
cosh
cosh
createExternalTable
Create an external table
createOrReplaceTempView
Creates a temporary view using the given name.
countDistinct
Count Distinct Values
cov
cov
covar_pop
covar_pop
dense_rank
dense_rank
hash
hash
instr
instr
least
least
length
length
mean
mean
merge
Merges two data frames
dropna
A set of SparkDataFrame functions working with NA values
nanvl
nanvl
concat
concat
concat_ws
concat_ws
crc32
crc32
intersect
Intersect
otherwise
otherwise
over
over
print.structField
Print a Spark StructField.
print.structType
Print a Spark StructType.
randomSplit
randomSplit
datediff
datediff
dayofmonth
dayofmonth
except
except
exp
exp
expm1
expm1
createDataFrame
Create a SparkDataFrame
date_format
date_format
date_sub
date_sub
dropTempView
Drops the temporary view with the given view name in the catalog.
dtypes
DataTypes
rangeBetween
rangeBetween
rename
rename
repartition
Repartition
second
second
select
Select
selectExpr
SelectExpr
setJobGroup
Assigns a group ID to all the jobs started by this thread until the group ID is set to a different value or cleared.
dim
Returns the dimensions of SparkDataFrame
encode
encode
endsWith
endsWith
first
Return the first row of a SparkDataFrame
fitted
Get fitted result from a k-means model
soundex
soundex
spark.addFile
Add a file or directory to be downloaded with this Spark job on every node.
spark.gbt
Gradient Boosted Tree Model for Regression and Classification
spark.getSparkFiles
Get the absolute path of a file added through spark.addFile.
spark.mlp
Multilayer Perceptron Classification Model
expr
expr
gapply
gapply
gapplyCollect
gapplyCollect
hashCode
Compute the hashCode of an object
head
Head
last_day
last_day
lead
lead
ltrim
ltrim
%in%
Match a column with given values.
format_string
format_string
freqItems
Finding frequent items for columns, possibly with false positives
ifelse
ifelse
initcap
initcap
isnan
is.nan
isLocal
isLocal
log
log
spark.naiveBayes
Naive Bayes Models
sparkRSQL.init
(Deprecated) Initialize a new SQLContext
spark_partition_id
Return the partition ID as a column
stddev_samp
stddev_samp
storageLevel
StorageLevel
log10
log10
log1p
log1p
log2
log2
months_between
months_between
min
min
minute
minute
persist
Persist
explain
Explain
explode
explode
from_unixtime
from_unixtime
from_utc_timestamp
from_utc_timestamp
hex
hex
histogram
Compute histogram statistics for given column
join
Join
kurtosis
kurtosis
levenshtein
levenshtein
mutate
Mutate
next_day
next_day
nrow
Returns the number of rows in a SparkDataFrame
limit
Limit
lower
lower
lpad
lpad
monotonically_increasing_id
monotonically_increasing_id
pivot
Pivot a column of the GroupedData and perform the specified aggregation.
rand
rand
randn
randn
printSchema
Print Schema of a SparkDataFrame
quarter
quarter
rank
rank
rbind
Union two or more SparkDataFrames
reverse
reverse
read.orc
Create a SparkDataFrame from an ORC file.
read.parquet
Create a SparkDataFrame from a Parquet file.
round
round
row_number
row_number
rowsBetween
rowsBetween
rint
rint
sample
Sample
show
show
showDF
showDF
skewness
skewness
sort_array
sort_array
spark.isoreg
Isotonic Regression Model
spark.kmeans
K-Means Clustering Model
spark.randomForest
Random Forest Model for Regression and Classification
rtrim
rtrim
rpad
rpad
sha2
sha2
shiftLeft
shiftLeft
sinh
sinh
lag
lag
last
last
lit
lit
locate
locate
spark.survreg
Accelerated Failure Time (AFT) Survival Regression Model
sparkR.session.stop
Stop the Spark Session and Spark Context
sparkR.uiWebUrl
Get the URL of the SparkUI instance for the current active SparkSession
str
Compactly display the structure of a dataset
struct
struct
size
size
spark.lda
Latent Dirichlet Allocation
spark.logit
Logistic Regression Model
sparkR.callJMethod
Call Java Methods
sparkR.callJStatic
Call Static Java Methods
startsWith
startsWith
stddev_pop
stddev_pop
substring_index
substring_index
sum
sum
tableToDF
Create a SparkDataFrame from a SparkSQL Table
tables
Tables
toRadians
toRadians
describe
summary
tableNames
Table Names
to_utc_timestamp
to_utc_timestamp
translate
translate
month
month
ntile
ntile
orderBy
Ordering Columns in a WindowSpec
pmod
pmod
posexplode
posexplode
read.df
Load a SparkDataFrame
take
Take the first NUM rows of a SparkDataFrame and return the results as an R data.frame
tan
tan
trim
trim
unbase64
unbase64
var
var
read.jdbc
Create a SparkDataFrame representing the database table accessible via JDBC URL
regexp_replace
regexp_replace
registerTempTable
(Deprecated) Register Temporary Table
sampleBy
Returns a stratified sample without replacement
max
max
md5
md5
ncol
Returns the number of columns in a SparkDataFrame
negate
negate
partitionBy
partitionBy
saveAsTable
Save the contents of the SparkDataFrame to a data source as a table
shiftRight
shiftRight
shiftRightUnsigned
shiftRightUnsigned
var_pop
var_pop
write.orc
Save the contents of SparkDataFrame as an ORC file, preserving the schema.
write.parquet
Save the contents of SparkDataFrame as a Parquet file, preserving the schema.
percent_rank
percent_rank
predict
Makes predictions from an MLlib model
print.jobj
Print a JVM object reference.
spark.als
Alternating Least Squares (ALS) for Collaborative Filtering
spark.gaussianMixture
Multivariate Gaussian Mixture Model (GMM)
spark.getSparkFilesRootDirectory
Get the root directory that contains files added through spark.addFile.
spark.glm
Generalized Linear Models
sparkR.conf
Get Runtime Config from the current active SparkSession
sparkR.init
(Deprecated) Initialize a new Spark Context
sparkR.version
Get version of Spark on which this application is running
sparkRHive.init
(Deprecated) Initialize a new HiveContext
subset
Subset
substr
substr
sumDistinct
sumDistinct
var_samp
var_samp
weekofyear
weekofyear
write.df
Save the contents of SparkDataFrame to a data source.
write.jdbc
Save the content of SparkDataFrame to an external database table via JDBC.
agg
summarize
uncacheTable
Uncache Table
unhex
unhex
windowOrderBy
windowOrderBy
read.json
Create a SparkDataFrame from a JSON file.
read.ml
Load a fitted MLlib model from the input path.
read.text
Create a SparkDataFrame from a text file.
regexp_extract
regexp_extract
schema
Get schema object
windowPartitionBy
windowPartitionBy
write.text
Save the content of SparkDataFrame in a text file at the specified path.
year
year
sd
sd
setLogLevel
Set new log level
sha1
sha1
signum
signum
spark.kstest
(One-Sample) Kolmogorov-Smirnov Test
spark.lapply
Run a function over a list of elements, distributing the computations with Spark
sparkR.newJObject
Create Java Objects
sparkR.session
Get the existing SparkSession or initialize a new SparkSession.
sql
SQL Query
sin
sin
sqrt
sqrt
structField
structField
structType
structType
tanh
tanh
toDegrees
toDegrees
to_date
to_date
when
when
window
window
with
Evaluate an R expression in an environment constructed from a SparkDataFrame
withColumn
WithColumn
union
Return a new SparkDataFrame containing the union of rows
unix_timestamp
unix_timestamp
unpersist
Unpersist
upper
upper
write.json
Save the contents of SparkDataFrame as a JSON file
write.ml
Saves the MLlib model to the input path
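To illustrate how several of the entries above fit together, here is a hedged sketch of a model-fitting workflow (a running SparkSession is assumed, and the save path is hypothetical):

```r
library(SparkR)
sparkR.session()

# Fit a Gaussian GLM on a SparkDataFrame (see spark.glm).
df <- createDataFrame(mtcars)
model <- spark.glm(df, mpg ~ wt + hp, family = "gaussian")

# Score the training data (see predict).
preds <- predict(model, df)
head(select(preds, "mpg", "prediction"))

# Persist and reload the fitted model (see write.ml / read.ml).
# "/tmp/mpg-glm" is a hypothetical path.
write.ml(model, "/tmp/mpg-glm")
reloaded <- read.ml("/tmp/mpg-glm")

sparkR.session.stop()
```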