UpSetR (version 1.0.2)

upset: UpSetR Plot

Description

Visualization of set intersections using the novel UpSet matrix design.

Usage

upset(data, nsets = 5, nintersects = 40, sets = NULL,
  matrix.color = "gray23", main.bar.color = "gray23",
  sets.bar.color = "gray23", point.size = 4, line.size = 1,
  name.size = 10, mb.ratio = c(0.7, 0.3), expression = NULL,
  att.pos = NULL, att.color = main.bar.color, order.by = c("freq",
  "degree"), decreasing = c(T, F), show.numbers = "yes",
  number.angles = 0, group.by = "degree", cutoff = NULL, queries = NULL,
  query.legend = "none", shade.color = "gray88", shade.alpha = 0.25,
  empty.intersections = NULL, color.pal = 1, boxplot.summary = NULL,
  attribute.plots = NULL)

Arguments

data
Data set
nsets
Number of sets to look at
nintersects
Number of intersections to plot
sets
Specific sets to look at (Include as combinations. Ex: c("Name1", "Name2"))
matrix.color
Color of the intersection points
main.bar.color
Color of the main bar plot
sets.bar.color
Color of set size bar plot
point.size
Size of points in matrix plot
line.size
Width of lines in matrix plot
name.size
Size of set names in matrix plot
mb.ratio
Ratio between matrix plot and main bar plot (keep in terms of hundredths)
expression
Expression to subset attributes of intersection or element query data. Enter as string (Ex: "ColName > 3")
att.pos
Position of attribute plot. If NULL or "bottom", the plot will be below the UpSet plot. If "top", it will be above the UpSet plot.
att.color
Color of attribute histogram bins or scatterplot points for unqueried data represented by main bars. Default set to color of main bars.
order.by
How the intersections in the matrix should be ordered. Options include frequency (entered as "freq"), degree (entered as "degree"), or both in any order.
decreasing
How the variables in order.by should be ordered. "freq" is decreasing (greatest to least) and "degree" is increasing (least to greatest)
show.numbers
Show numbers of intersection sizes above bars
number.angles
The angle of the numbers atop the intersection size bars
group.by
How the data should be grouped ("degree" or "sets")
cutoff
The number of intersections from each set (to cut off at) when aggregating by sets
queries
Unified query of intersections, elements, and custom row functions. Entered as a list that contains a list of queries. query is the type of query being conducted. params are the parameters of the query (if any). color is the color of the points on the pl
query.legend
Position query legend on top or bottom of UpSet plot
shade.color
Color of row shading in matrix
shade.alpha
Transparency of shading in matrix
empty.intersections
Additionally display empty sets up to nintersects
color.pal
Color palette for attribute plots
boxplot.summary
Boxplots representing the distribution of a selected attribute for each intersection. Select attributes by entering a character vector of attribute names (e.g. c("Name1", "Name2")). The maximum number of attributes that can be entered is 2.
attribute.plots
Create custom ggplot using intersection data represented in the main bar plot. Prior to adding custom plots, the UpSet plot is set up in a 100 by 100 grid. The attribute.plots parameter takes a list that contains the number of rows that should be allocate
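
As a rough sketch of how a few of the arguments above combine, the call below sets mb.ratio, order.by, decreasing, and boxplot.summary together. The values are illustrative assumptions, not recommendations. The variable names ratio and box_attrs are hypothetical. The default mb.ratio of c(0.7, 0.3) sums to 1, and this sketch keeps that property. The movies data and its AvgRating and ReleaseDate columns are the ones used in the Examples section.

```r
# Illustrative values only (assumptions, not recommendations).
ratio     <- c(0.6, 0.4)                    # mb.ratio, kept in hundredths
box_attrs <- c("AvgRating", "ReleaseDate")  # boxplot.summary accepts at most 2 attributes

# Only draw the plot when UpSetR is installed.
if (requireNamespace("UpSetR", quietly = TRUE)) {
  movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
                     header = TRUE, sep = ";")
  UpSetR::upset(movies, nsets = 5, nintersects = 20,
                mb.ratio = ratio,
                order.by = c("freq", "degree"), decreasing = c(TRUE, FALSE),
                boxplot.summary = box_attrs)
}
```

Keeping the two mb.ratio entries as complementary fractions makes it easy to see how much vertical space each panel receives.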

Details

Visualization of set data in the layout described by Lex and Gehlenborg in http://www.nature.com/nmeth/journal/v11/n8/abs/nmeth.3033.html. UpSet also allows for visualization of queries on intersections and elements, along with custom queries implemented using Hadley Wickham's apply function. To further analyze the data contained in the intersections, the user may select additional attribute plots to be displayed alongside the UpSet plot. The user can also pass their own plots into the function to further analyze data belonging to queries of interest. Most aspects of the UpSet plot are customizable, allowing the user to select the plot that best suits their style. Depending on how the features are selected, UpSet can display between 25 and 65 sets and between 40 and 100 intersections.
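
To make the quantities in the plot concrete, the base R sketch below tabulates them by hand for a toy data set. It assumes the usual UpSetR input layout of one row per element and one 0/1 column per set, which this page does not spell out. The data frame toy and all values in it are hypothetical. It is meant only to illustrate the idea of set sizes and exclusive intersection sizes, not to reproduce how upset computes them internally.

```r
# Toy membership table (hypothetical data): one row per element,
# one 0/1 column per set.
toy <- data.frame(
  A = c(1, 1, 0, 1, 0, 1),
  B = c(1, 0, 1, 1, 0, 1),
  C = c(0, 0, 1, 1, 0, 0)
)

# Set sizes: the quantity shown in the set size bar plot.
set_sizes <- colSums(toy)

# Drop elements that belong to no set, then label each remaining row
# with the exact combination of sets it belongs to.
toy_in <- toy[rowSums(toy) > 0, ]
membership <- apply(toy_in, 1, function(row) {
  paste(names(toy_in)[row == 1], collapse = "&")
})

# Exclusive intersection sizes, largest first: the quantity shown in
# the main bar plot when ordering by frequency.
sizes <- sort(table(membership), decreasing = TRUE)

# Degree of each intersection: the number of sets that take part in it.
degree <- lengths(strsplit(names(sizes), "&", fixed = TRUE))

print(set_sizes)
print(sizes)
```

Each element is counted in exactly one intersection here, so the intersection sizes add up to the number of elements that belong to at least one set.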

References

Lex et al. (2014). UpSet: Visualization of Intersecting Sets. IEEE Transactions on Visualization and Computer Graphics (Proceedings of InfoVis 2014), vol. 20, pp. 1983-1992. http://people.seas.harvard.edu/~alex/papers/2014_infovis_upset.pdf

Lex and Gehlenborg (2014). Points of view: Sets and intersections. Nature Methods 11, 779. http://www.nature.com/nmeth/journal/v11/n8/abs/nmeth.3033.html

See Also

Original UpSet Website: http://vcg.github.io/upset/about/

UpSetR github for additional examples: http://github.com/hms-dbmi/UpSetR

Examples

movies <- read.csv( system.file("extdata", "movies.csv", package = "UpSetR"), header=TRUE, sep=";" )

# Packages used by the custom plots and queries below.
library(ggplot2); library(plyr); library(gridExtra); library(grid)

# Custom row query: TRUE when the row's ReleaseDate lies strictly between min and max.
between <- function(row, min, max){
  (row["ReleaseDate"] < max) & (row["ReleaseDate"] > min)
}

# Custom attribute plot: histogram of column x, filled using the "color" column.
plot1 <- function(mydata, x){
  (ggplot(mydata, aes_string(x = x, fill = "color"))
    + geom_histogram() + scale_fill_identity()
    + theme(plot.margin = unit(c(0,0,0,0), "cm")))
}

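# Custom attribute plot: scatterplot of y against x, colored using the "color" column.
# alpha is set on geom_point(), since ggplot() itself does not apply it to the points.
plot2 <- function(mydata, x, y){
  (ggplot(data = mydata, aes_string(x = x, y = y, colour = "color"))
    + geom_point(alpha = 0.5) + scale_color_identity()
    + theme_bw() + theme(plot.margin = unit(c(0,0,0,0), "cm")))
}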

attributeplots <- list(gridrows = 55,
                  plots = list(list(plot = plot1, x= "ReleaseDate",  queries = FALSE),
                         list(plot = plot1, x= "ReleaseDate", queries = TRUE),
                         list(plot = plot2, x = "ReleaseDate", y = "AvgRating", queries = FALSE),
                         list(plot = plot2, x = "ReleaseDate", y = "AvgRating", queries = TRUE)),
                   ncols = 3)

upset(movies, nsets = 7, nintersects = 30, mb.ratio = c(0.5, 0.5),
      order.by = c("freq", "degree"), decreasing = c(TRUE,FALSE))

upset(movies, sets = c("Drama", "Comedy", "Action", "Thriller", "Western", "Documentary"),
      queries = list(list(query = intersects, params = list("Drama", "Action")),
                list(query = between, params = list(1970, 1980), color = "red", active = TRUE)))

upset(movies, attribute.plots = attributeplots,
     queries = list(list(query = between, params = list(1920, 1940)),
                    list(query = intersects, params = list("Drama"), color= "red"),
                    list(query = elements, params = list("ReleaseDate", 1990, 1991, 1992))),
      main.bar.color = "yellow")
