topolow (version 1.0.0)

generate_complex_data: Generate Complex High-Dimensional Data for Testing

Description

Generates synthetic high-dimensional data with clusters and trends for testing dimensionality reduction methods. Creates data with specified properties:

  • Multiple clusters along a trend line

  • Variable density regions

  • Controllable noise levels

  • Optional visualization

The function generates cluster centers along a trend line, adds points around those centers with specified spread, and incorporates random noise to create high and low density areas. The data is useful for testing dimensionality reduction and visualization methods.
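The steps described above can be illustrated with a minimal base R sketch. This is not the topolow implementation: the function name sketch_complex_data, the noise_fraction parameter, the diagonal trend line, the use of cluster_spread as a standard deviation, and the uniform background noise are all assumptions made for illustration.

```r
# Hypothetical sketch of the described procedure, NOT the topolow source code.
sketch_complex_data <- function(n_points = 500, n_dim = 10,
                                n_clusters = 4, cluster_spread = 1,
                                noise_fraction = 0.2) {
  # Cluster centers spaced evenly along a straight trend line (the diagonal):
  # center i has the same coordinate trend[i] in every dimension.
  trend <- seq(0, 10, length.out = n_clusters)
  centers <- matrix(rep(trend, times = n_dim),
                    nrow = n_clusters, ncol = n_dim)

  # Split the points into dense cluster points and sparse background noise.
  n_noise <- floor(n_points * noise_fraction)
  n_core <- n_points - n_noise

  # Assign each core point to a cluster and scatter it around that center.
  # Assumption: cluster_spread is used as a standard deviation here.
  membership <- sample(seq_len(n_clusters), n_core, replace = TRUE)
  core <- centers[membership, , drop = FALSE] +
    matrix(rnorm(n_core * n_dim, sd = cluster_spread),
           nrow = n_core, ncol = n_dim)

  # Low-density background: uniform points around the extent of the trend line.
  noise <- matrix(runif(n_noise * n_dim, min = -2, max = 12),
                  nrow = n_noise, ncol = n_dim)

  # Combine and name the columns Dim1, Dim2, ..., as the real function does.
  out <- as.data.frame(rbind(core, noise))
  names(out) <- paste0("Dim", seq_len(n_dim))
  out
}

set.seed(42)
sketch <- sketch_complex_data(n_points = 200, n_dim = 5, n_clusters = 3)
dim(sketch)
```

Mixing tightly scattered cluster points with a uniform background is one simple way to obtain regions of differing density. The package may construct its trend and noise differently.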

Usage

generate_complex_data(
  n_points = 500,
  n_dim = 10,
  n_clusters = 4,
  cluster_spread = 1,
  fig_name = NA
)

Value

A data.frame with n_points rows and n_dim columns. Column names are "Dim1" through "DimN" where N is n_dim.
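The documented return shape can be checked with a small helper such as the one sketched below. The helper name has_documented_shape is hypothetical, and the snippet uses a hand-built stand-in data frame rather than real output, so it runs without topolow installed.

```r
# Hypothetical helper: checks a data frame against the documented return shape.
has_documented_shape <- function(df, n_points, n_dim) {
  is.data.frame(df) &&
    nrow(df) == n_points &&
    ncol(df) == n_dim &&
    identical(names(df), paste0("Dim", seq_len(n_dim)))
}

# Stand-in data frame with the documented layout (not output from topolow).
stand_in <- as.data.frame(matrix(0, nrow = 3, ncol = 2))
names(stand_in) <- c("Dim1", "Dim2")

ok <- has_documented_shape(stand_in, n_points = 3, n_dim = 2)   # matches
bad <- has_documented_shape(stand_in, n_points = 3, n_dim = 4)  # wrong n_dim
```

In practice the same check would be applied to the result of generate_complex_data() with the n_points and n_dim values that were passed in.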

Arguments

n_points

Integer number of points to generate

n_dim

Integer number of dimensions

n_clusters

Integer number of clusters

cluster_spread

Numeric value controlling the spread (variance) of points around each cluster center

fig_name

Optional character path at which to save a visualization. Defaults to NA.

Examples

# Load the package that provides generate_complex_data()
library(topolow)

# Generate basic dataset
data <- generate_complex_data(n_points = 500, n_dim = 10,
                              n_clusters = 4, cluster_spread = 1)

# The function returns a data frame, which can be inspected
head(data)
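Because the data is intended for testing dimensionality reduction methods, one quick sanity check is a principal component analysis with prcomp() from base R's stats package. The sketch below uses a random stand-in data frame with the documented column naming so that it runs without topolow. The stand_in object is an assumption for illustration, and the same calls would apply to the data object from the example.

```r
# Stand-in for the output of generate_complex_data(): same column naming,
# random values, used here only so the snippet runs without topolow.
set.seed(1)
stand_in <- as.data.frame(matrix(rnorm(50 * 10), nrow = 50, ncol = 10))
names(stand_in) <- paste0("Dim", 1:10)

# Principal component analysis from base R's stats package.
pca <- prcomp(stand_in, center = TRUE, scale. = FALSE)

# Keep the first two principal components as a 2-D embedding.
embedding <- pca$x[, 1:2]
dim(embedding)
```

With real generated data, plotting the two columns of embedding against each other should show the clusters arranged along the trend if the reduction preserves that structure.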
