SITS - Satellite Image Time Series Analysis

The SITS package is a set of tools for working with satellite image time series. It includes data retrieval from a WTSS (web time series service), different visualisation methods for image time series, smoothing methods for noisy time series, and different clustering methods, including dendrograms and SOM. It matches noiseless patterns with noisy time series using the TWDTW method for shape recognition and provides machine learning methods for time series classification, including SVM, LDA, QDA, GLM, Lasso, Random Forests and Deep Learning.

Overview

The sits package is a set of tools for working with satellite image time series that:

  • Supports data retrieval from web time series services, raster files and other data sets.
  • Provides different visualisation methods for image time series.
  • Includes smoothing methods for noisy time series.
  • Enables different clustering methods, including dendrograms and SOM (Kohonen maps).
  • Matches noiseless patterns with noisy time series using the TWDTW method for shape recognition.
  • Provides machine learning methods for time series classification, including SVM, LDA, QDA, GLM, Lasso, Random Forests and Deep Learning.

sits is a convenient front-end to many analysis packages that are useful for satellite image time series analysis. The package organizes time series data using tibbles, which makes it simple for users to manage time series data sets.

Interface to CRAN data analysis packages

sits is a convenient front-end to many packages that are useful for satellite image time series analysis. Given data sets organized as time series tibbles, the package provides easy access to many packages for time series analysis:

  • keras for Deep Learning classification.
  • e1071 for SVM models.
  • MASS for LDA and QDA models.
  • nnet for multinomial log-linear models.
  • glmnet for generalized linear models.
  • ranger for random forest methods.
  • dtwclust for time series clustering.
  • kohonen for clustering based on SOM.
  • dtwSat for access to the time-weighted dynamic time warping algorithm.
  • signal for filtering.

sits also relies extensively on the tidyverse, data.table, raster and sf packages. The tight integration achieved by the package makes it easier for users to perform different types of analysis on satellite image time series.

Installation

Please install the SITS package from GitHub, making sure you have the latest version of the other packages it requires:

devtools::install_github("e-sensing/sits")

Basic data structure

After loading the library, users can print a sits tibble to see how the package organizes the data.

samples_mt_9classes[1:3,]
#> # A tibble: 3 x 7
#>   longitude latitude start_date end_date   label   coverage time_series   
#>       <dbl>    <dbl> <date>     <date>     <chr>   <chr>    <list>        
#> 1     -55.2   -10.8  2013-09-14 2014-08-29 Pasture MOD13Q1  <tibble [23 ×…
#> 2     -57.8    -9.76 2006-09-14 2007-08-29 Pasture MOD13Q1  <tibble [23 ×…
#> 3     -51.9   -13.4  2014-09-14 2015-08-29 Pasture MOD13Q1  <tibble [23 ×…

The sits tibble contains data and metadata. The first six columns contain the metadata: spatial and temporal location, label assigned to the sample, and coverage from where the data has been extracted. The spatial location is given in longitude and latitude coordinates for the "WGS84" ellipsoid. For example, the first sample has been labelled "Pasture", at location (-55.1852, -10.8387), and is considered valid for the period (2013-09-14, 2014-08-29).
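To make the nesting concrete, here is a minimal base R sketch of the same layout: one row per sample, six metadata columns, and a list column holding one table of band values per sample. It uses a plain data frame rather than a tibble so that it runs without extra packages. The band values, the 16-day spacing and the `Index` column name are illustrative assumptions, not data taken from the package.

```r
# Hypothetical time series for two samples; the values are made up.
make_series <- function(start, n_obs = 23) {
  data.frame(
    Index = seq(as.Date(start), by = "16 days", length.out = n_obs),
    ndvi  = round(0.5 + 0.3 * sin(seq(0, 2 * pi, length.out = n_obs)), 3)
  )
}

# Metadata columns, mirroring the printed sits tibble.
samples <- data.frame(
  longitude  = c(-55.2, -57.8),
  latitude   = c(-10.8, -9.76),
  start_date = as.Date(c("2013-09-14", "2006-09-14")),
  end_date   = as.Date(c("2014-08-29", "2007-08-29")),
  label      = c("Pasture", "Pasture"),
  coverage   = c("MOD13Q1", "MOD13Q1"),
  stringsAsFactors = FALSE
)

# The seventh column is a list: one nested table per row.
samples$time_series <- list(make_series("2013-09-14"),
                            make_series("2006-09-14"))

# Each nested table is reached with [[ ]] on the list column.
first_series <- samples$time_series[[1]]
print(head(first_series, 3))
```

Keeping the series inside a list column means that filtering rows by label or date also filters the attached time series, which is what makes the dplyr-style calls in the next example work.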

The sits_plot() function displays the time series. Given a small number of samples, sits_plot() tries to group as many spatial locations together as possible. In the following example, the first 15 samples of the "Cerrado" class all refer to the same spatial location in consecutive time periods. For this reason, these samples are plotted together.

# select the "ndvi" band
samples_ndvi.tb <- sits_select_bands(samples_mt_9classes, ndvi)
# select only the samples with the cerrado label
samples_cerrado.tb <- dplyr::filter(samples_ndvi.tb, 
                  label == "Cerrado")
# plot the first 15 samples (different dates for the same points)
sits_plot(samples_cerrado.tb[1:15,])

For a large number of samples, where the number of individual plots would be substantial, the default visualisation combines all samples together in a single temporal interval. This plot is useful to show the spread of values for the time series of each band. The strong red line in the plot shows the median of the values, and the two orange lines are the first and third quartiles.

# plot all cerrado samples together (shows the distribution)
sits_plot(samples_cerrado.tb)
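The summary lines in such a plot can be reproduced by hand. The sketch below is not the code behind sits_plot(); it only shows, on synthetic values, how a median and two quartile curves are obtained by summarising all samples at each time step. The sample count, curve shape and noise level are assumptions chosen for illustration.

```r
set.seed(42)
n_samples <- 50
n_steps   <- 23

# Synthetic band values: one row per sample, one column per time step.
base_curve <- 0.5 + 0.2 * sin(seq(0, 2 * pi, length.out = n_steps))
values <- matrix(rep(base_curve, each = n_samples), nrow = n_samples) +
  matrix(rnorm(n_samples * n_steps, sd = 0.05), nrow = n_samples)

# Summarise each time step (column) across all samples.
summary_by_step <- t(apply(values, 2, quantile, probs = c(0.25, 0.5, 0.75)))
colnames(summary_by_step) <- c("q1", "median", "q3")

print(round(head(summary_by_step, 3), 3))
```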

Importing Data into sits

sits allows different methods of data input, including: (a) obtain data from a time series web service such as INPE's WTSS (Web Time Series Service) or EMBRAPA's SATVEG; (b) read data stored in a time series in the ZOO format [@Zeileis2005]; (c) read a time series from a TIFF RasterBrick. More services will be added in future releases.

Clustering

Clustering is a way to improve training data to use in machine learning classification models. In this regard, cluster analysis can assist the identification of structural patterns and anomalous samples. sits provides support for the agglomerative hierarchical clustering (AHC) using the DTW (dynamic time warping) distance measure.

# take a set of patterns for 2 classes
# create a dendrogram object with default clustering parameters
dendro <- sits_dendrogram(cerrado_2classes)
# plot the resulting dendrogram
sits_plot_dendrogram(cerrado_2classes, dendro)
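For readers who want to see what agglomerative clustering with a DTW distance involves, the following base R sketch computes a textbook DTW distance between every pair of series and passes the result to hclust(). It is a simplified illustration, not the implementation used by sits_dendrogram(). The synthetic series, the absolute-difference local cost and the "average" linkage are all assumptions.

```r
# Classic dynamic-programming DTW distance between two numeric vectors.
dtw_distance <- function(x, y) {
  n <- length(x)
  m <- length(y)
  cost <- matrix(Inf, n + 1, m + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      d <- abs(x[i] - y[j])
      cost[i + 1, j + 1] <- d + min(cost[i, j + 1],  # step in x only
                                    cost[i + 1, j],  # step in y only
                                    cost[i, j])      # step in both
    }
  }
  cost[n + 1, m + 1]
}

# Six synthetic series: three with a strong seasonal cycle, three with a weak one.
time_axis <- seq(0, 2 * pi, length.out = 23)
shifts <- c(0, 0.2, 0.4)
series <- c(lapply(shifts, function(s) sin(time_axis + s)),
            lapply(shifts, function(s) 0.2 * sin(time_axis + s)))
names(series) <- c("a1", "a2", "a3", "b1", "b2", "b3")

# Pairwise DTW distance matrix.
n <- length(series)
d <- matrix(0, n, n, dimnames = list(names(series), names(series)))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    d[i, j] <- d[j, i] <- dtw_distance(series[[i]], series[[j]])
  }
}

# Agglomerative clustering on the DTW distances, then cut into two groups.
tree   <- hclust(as.dist(d), method = "average")
groups <- cutree(tree, k = 2)
print(groups)
```

Because DTW allows the series to be stretched in time, the small phase shifts inside each group cost little, while the amplitude difference between the groups cannot be warped away.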

After creating a dendrogram, we provide sits_dendro_bestcut(), a function that computes a validity index and returns the height where the cut of the dendrogram maximizes this index.

# search for the best height to cut the dendrogram
sits_dendro_bestcut(cerrado_2classes, 
                    dendro)
#>        k   height 
#>  6.00000 20.39655

This height optimises the ARI (adjusted Rand index) and generates 6 clusters, which are then created by the function sits_cluster(). We can then see the cluster frequency using sits_cluster_frequency(). In the example, we note that cluster 3, unlike the other clusters, includes a mix of two classes. Users can then remove this cluster with sits_cluster_remove() to reduce the number of mixed-class samples.

# create 6 clusters by cutting the dendrogram at 
# the linkage distance 20.39655
clusters.tb <- 
    sits_cluster(cerrado_2classes, dendro, k = 6)
# show clusters samples frequency
sits_cluster_frequency(clusters.tb)
#>          
#>             1   2   3   4   5   6 Total
#>   Cerrado 203  13  23  80   1  80   400
#>   Pasture   2 176  28   0 140   0   346
#>   Total   205 189  51  80 141  80   746
# clear those samples with a high confusion rate in a cluster 
clean.tb <- sits_cluster_remove(clusters.tb, 
                        min_perc = 0.9)
# show clean clusters samples frequency
sits_cluster_frequency(clean.tb)
#>          
#>             1   2   4   5   6 Total
#>   Cerrado 203  13  80   1  80   377
#>   Pasture   2 176   0 140   0   318
#>   Total   205 189  80 141  80   695
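The first frequency table above contains enough information to recompute two quantities by hand. The sketch below assumes that ARI stands for the adjusted Rand index and computes it with the standard pair-counting formula. It also assumes that min_perc is compared with the share of the majority label in each cluster, which is consistent with cluster 3 being the only one removed, but is not confirmed by the text. The helper name adjusted_rand_index is hypothetical and is not a sits function.

```r
# Contingency table copied from the sits_cluster_frequency() output above
# (rows: labels, columns: cluster ids).
tab <- matrix(
  c(203,  13, 23, 80,   1, 80,
      2, 176, 28,  0, 140,  0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Cerrado", "Pasture"), as.character(1:6))
)

# Adjusted Rand index from a contingency table (pair-counting formula).
adjusted_rand_index <- function(tab) {
  sum_cells <- sum(choose(tab, 2))
  sum_rows  <- sum(choose(rowSums(tab), 2))
  sum_cols  <- sum(choose(colSums(tab), 2))
  expected  <- sum_rows * sum_cols / choose(sum(tab), 2)
  (sum_cells - expected) / ((sum_rows + sum_cols) / 2 - expected)
}

ari <- adjusted_rand_index(tab)

# Share of the majority label in each cluster (assumed meaning of min_perc).
majority_share <- apply(tab, 2, max) / colSums(tab)
mixed_clusters <- names(majority_share)[majority_share < 0.9]

print(round(ari, 3))
print(round(majority_share, 3))
print(mixed_clusters)
```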

Filtering

Satellite image time series are contaminated by atmospheric influence and directional effects. To make the best use of available satellite data archives, methods for satellite image time series analysis need to deal with data sets that are noisy and non-homogeneous. For data filtering, sits supports Savitzky–Golay (sits_sgolay()), Whittaker (sits_whittaker()), envelope (sits_envelope()) and the "cloud filter" (sits_cloud_filter()). As an example, we show how to apply the Whittaker smoother to the data.

# Take the NDVI band of the first sample data set
point.tb <- sits_select_bands(prodes_226_064[1,], ndvi)
# apply Whittaker filter
point_whit.tb <- sits_whittaker(point.tb)
# plot the series
sits_plot(sits_merge(point_whit.tb, 
                     point.tb))
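As background, the Whittaker smoother finds the series z that minimises the sum of squared residuals plus lambda times the sum of squared differences of z, which reduces to solving one linear system. The base R sketch below implements that textbook formulation. It is not the sits_whittaker() source, and the lambda value, the difference order and the synthetic noisy series are assumptions.

```r
# Whittaker smoother: solve (I + lambda * D'D) z = y,
# where D is the d-th order difference matrix.
whittaker_smooth <- function(y, lambda = 1, d = 2) {
  n <- length(y)
  D <- diff(diag(n), differences = d)   # (n - d) x n matrix
  solve(diag(n) + lambda * crossprod(D), y)
}

# Synthetic noisy NDVI-like series.
set.seed(7)
time_axis <- seq(0, 2 * pi, length.out = 23)
noisy     <- 0.5 + 0.3 * sin(time_axis) + rnorm(23, sd = 0.08)
smoothed  <- whittaker_smooth(noisy, lambda = 5)

# Roughness (sum of squared second differences) before and after smoothing.
roughness <- function(v) sum(diff(v, differences = 2)^2)
print(c(raw = roughness(noisy), smoothed = roughness(smoothed)))
```

Larger values of lambda put more weight on smoothness and less on fidelity to the observed values. A series that is already a straight line has zero second differences and passes through unchanged.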

Machine Learning and Deep Learning

sits explores the full depth of satellite image time series data for classification. It treats each time series as a feature vector, formed by the values of all pixel "bands". The idea is to have as many temporal attributes as possible, increasing the dimension of the classification space. In this scenario, statistical learning models are the natural candidates to deal with high-dimensional data: they learn to distinguish all land cover and land use classes from trusted sample exemplars, also known as training data, and then infer the classes of a larger data set.
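A small sketch may help picture the feature vector. Assuming a time series stored as one column per band and one row per date, flattening places every band value at every date side by side, so a series with two bands and four dates becomes eight attributes. The band values and the attribute naming scheme (band name plus time index) are illustrative assumptions.

```r
# Hypothetical time series with two bands and four dates.
ts <- data.frame(
  Index = seq(as.Date("2013-09-14"), by = "16 days", length.out = 4),
  ndvi  = c(0.31, 0.45, 0.62, 0.58),
  evi   = c(0.20, 0.28, 0.41, 0.37)
)

# Flatten: every (band, date) pair becomes one named attribute.
bands <- setdiff(names(ts), "Index")
features <- unlist(lapply(bands, function(b) {
  setNames(ts[[b]], paste0(b, seq_len(nrow(ts))))
}))

print(features)
```

With 23 dates per year and several bands, each sample quickly reaches dozens of attributes or more, which is the high-dimensional setting the paragraph above refers to.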

We support a number of machine learning techniques, including SVM (support vector machines), Random Forests, generalised linear models, gradient boosting machines, and deep learning. We show an example of using the SVM classifier below.

# Build a machine learning model with a set of samples
# for the Mato Grosso region (provided by EMBRAPA) (samples_mt_ndvi)
svm_model <- sits_train(samples_mt_ndvi,
                        ml_method = sits_svm(kernel = "radial", cost = 10))
# classify a point (point_ndvi) with the trained model
class.tb <- sits_classify(point_ndvi, svm_model)
# plot the classification result
sits_plot(class.tb)
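The same train-then-classify cycle can be tried outside sits on synthetic data. The sketch below calls MASS::lda() directly, MASS being one of the packages listed earlier, on flattened time series from two made-up classes. It does not show how sits_train() invokes its back-ends. The class shapes, noise level, split sizes and class names are assumptions.

```r
set.seed(1)
n_steps     <- 23
n_per_class <- 40
time_axis   <- seq(0, 2 * pi, length.out = n_steps)

# Synthetic samples: one row per sample, one column per time step.
make_class <- function(amplitude, n) {
  curve <- 0.5 + amplitude * sin(time_axis)
  matrix(rep(curve, each = n), nrow = n) +
    matrix(rnorm(n * n_steps, sd = 0.05), nrow = n)
}
x <- rbind(make_class(0.3, n_per_class), make_class(0.1, n_per_class))
colnames(x) <- paste0("ndvi", seq_len(n_steps))
y <- factor(rep(c("class_a", "class_b"), each = n_per_class))

# Hold out the last 10 samples of each class for testing.
train_idx <- c(1:30, 41:70)

# Train on the flattened series, then classify the held-out samples.
fit  <- MASS::lda(x[train_idx, ], grouping = y[train_idx])
pred <- predict(fit, x[-train_idx, ])$class

accuracy <- mean(pred == y[-train_idx])
print(table(predicted = pred, reference = y[-train_idx]))
```

The cross-tabulation printed at the end is the kind of confusion matrix that accuracy assessment builds on.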

License

The sits package is licensed under the GPLv3 (http://www.gnu.org/licenses/gpl.html).

Install

install.packages('sits')

Monthly Downloads

526

Version

1.12.0

License

GPL-2 | file LICENSE

Maintainer

Pedro Andrade

Last Published

February 13th, 2025

Functions in sits (1.12.0)

.sits_xy_inside_raster

Tests if an XY position is inside a ST Raster Brick
.print_confusion_matrix

Print the values of a confusion matrix
.sits_satveg_timeline

Retrieve a timeline for the SATVEG service
.sits_apply_ts

Apply a function over a set of time series.
.sits_check_service

Check that the service is valid
.sits_class_info

Define the information required for classifying time series
.sits_twdtw_breaks

Classify a sits tibble using the matches found by the TWDTW methods
cerrado_2classes

Samples of classes Cerrado and Pasture
.sits_check_classify_params

Check classification parameters
.sits_check_results

Check the results of the classification with the input data
.sits_color_name

Brewer color schemes
.sits_create_folds

Create partitions of a data set
.sits_create_coverage

Creates a coverage metadata
.sits_coverage_raster

Create a metadata tibble to store the description of a spatio-temporal raster dataset
.sits_from_twdtw_matches

Transform patterns from TWDTW format to sits format
.sits_from_shp

Obtain timeSeries from WTSS server, based on a SHP file.
.sits_get_satveg_timeline_from_txt

Retrieve a timeline from the SATVEG service based on text expression
.sits_from_raster

Extract a time series from a ST raster data set
.sits_convert_resolution

Convert resolution from projection values to lat/long
.sits_from_wtss

Obtain one timeSeries from WTSS server and load it on a sits tibble
.sits_from_csv

Obtain timeSeries from time series server, based on a CSV file.
.sits_get_account

Get an account to access a time series service
.sits_get_scale_factors

Retrieve the scale factor for a given band for an image product
.sits_coverage_satveg

Provides information about one coverage of the SATVEG time series service
.sits_function_factory

Create a closure for calling functions with and without data
.sits_get_protocol

Retrieve the protocol associated to the time series service
.sits_coverage_wtss

Provides information about one coverage of the WTSS service
.sits_get_resolution

Retrieve the pixel resolution for an image product
.sits_get_missing_values

Retrieve the missing values for a given band for an image product
.sits_get_services

List the time series services available
.sits_get_projection

Retrieve the projection for the product available at service
.sits_get_size

Retrieve the size of the product for a given time series service
.sits_is_valid_start_date

Test if starting date fits with the timeline
.sits_coverage_raster_classified

Create a set of RasterLayer objects to store time series classification results
.sits_get_server

Retrieve the time series server for the product
.sits_estimate_processing_time

Estimate the processing time
.sits_create_raster_coverage

Creates a tibble with information about a set of raster bricks
.sits_predict_interval

Classify one interval of data
.sits_kalman_filter

Compute the Kalman filter
.sits_extract

Extract a subset of the data based on dates
.sits_log_csv

Saves a CSV data set
.sits_plot_classification

Plot classification results
.sits_latlong_to_proj

Coordinate transformation (lat/long to X/Y)
.sits_classify_distances

Classify a distances tibble using machine learning models
.sits_get_bbox

Retrieve the bounding box for the product available at service
.sits_match_indexes

Find indexes in a timeline that match the reference dates
.sits_ggplot_series

Plot one timeSeries using ggplot
.sits_get_bands

Retrieve the bands available for the product in the time series service
.sits_log_error

Logs an error in the log file
.sits_get_timeline

Retrieve the default timeline for a product for a given time series service
.sits_plot_twdtw_matches

Plot matches between a label pattern and a time series using the dtwSat package
.sits_ggplot_together

Plot many timeSeries together using ggplot
.sits_plot_allyears

Plot all time intervals of one time series for the same lat/long together
.sits_classify_multicores

Classify a raster chunk using multicores
.sits_group_by

Group the contents of a sits tibble by different criteria
.sits_raster_blocks

Define a reasonable block size to process a RasterBrick
.sits_proj_to_latlong

Coordinate transformation (X/Y to lat/long)
.sits_ts_from_satveg

Retrieve a time series from the SATVEG service
.sits_get_maximum_values

Retrieve the maximum values for a given band
.sits_get_tcap_wetness

Retrieve the vector of coefficients for the wetness component of the tasseled cap
.sits_estimate_nblocks

Estimate the number of blocks
.sits_from_satveg

Obtain one timeSeries from the EMBRAPA SATVEG server and load it on a sits tibble
.sits_get_minimum_values

Retrieve the minimum values for a given band
.sits_get_time_index

Find the time index of the blocks to be extracted for classification
.sits_plot_patterns

Plot classification patterns
.sits_from_service

Obtain timeSeries from time series service
.sits_timeline

Obtains the timeline for a coverage
.sits_get_tcap_brightness

Retrieve the vector of coefficients for the brightness component of the tasseled cap
.sits_guess_satellite

Try a best guess for the type of sensor/satellite
.sits_labels_list

Sits labels processing function
.sits_log_debug

Logs a debug message in the log file
.sits_labelling_neurons

Labelling neurons using majority voting
.sits_get_tcap_greenness

Retrieve the vector of coefficients for the greenness component of the tasseled cap
.sits_log_data

Saves a data set for future use
.sits_plot_twdtw_alignments

Plot classification alignments using the dtwSat package
.sits_is_valid_end_date

Test if end date fits inside the timeline
.sits_preprocess_data

Preprocess a set of values retrieved from a raster brick
.sits_write_raster_values

Write the values and probs into raster files
sits_cloud_filter

cloud filter
.sits_read_data

Read a block of values retrieved from a set of raster bricks
.sits_tibble_csv

Create an empty tibble to store the results of CSV samples that could not be read
.sits_plot_twdtw_classification

Plot classification results using the dtwSat package
.sits_raster_filename

Define a filename associated to one classified raster layer
.sits_test_tibble

Tests if a sits tibble is valid
point_ndvi

A time series sample for the NDVI band from 2000 to 2016
.sits_to_twdtw

Export data to be used by the dtwSat package
%>%

Pipe
prodes_226_064

Samples of deforestation-related classes for the LANDSAT image WRS 226/064
point_mt_6bands

A time series sample with data from 2000 to 2016
sits_apply

Apply a function over sits bands.
sits_cluster_remove

Remove cluster with mixed classes
sits_distances

Use time series values from a sits tibble as distances for training patterns
sits_get_data

Obtain time series from different sources
sits_cluster

Cuts a cluster tree produced by sits_dendrogram
sits_cluster_validity

Cluster validity indices
sits_get_color

Retrieve the color associated to a class
sits_dendrogram

Compute a dendrogram using hierarchical clustering
.sits_neighbor_neurons

Get the neighbor of neurons
sits_metrics_by_cluster

Metrics by cluster
.sits_normalization_param

Normalize the time series in the given sits_tibble
sits_missing_values

Remove missing values
sits_plot

Plot a set of satellite image time series
sits_classify_raster

Classify a set of spatio-temporal raster bricks using multicore machines
sits_dendro_bestcut

Compute validity indexes for a range of cut heights
sits_match_timeline

Find dates in the input coverage that match those of the patterns
sits_classify

Classify a sits tibble using machine learning models
sits_log

Creates a directory and logfile for logging information and debug
sits_plot_dendrogram

Plot a dendrogram
sits_plot_kohonen

Plot the SOM grid with neurons labeled
sits_deeplearning

Train a sits classification model using keras deep learning
.sits_max_colors

Brewer color schemes
.sits_plot_title

Create a plot title to use with ggplot
sits_plot_cluster_info

Plot information about clusters
.sits_mem_used

Shows the memory used in GB
samples_mt_9classes

Samples of nine classes for the state of Mato Grosso used for classification
.sits_pred_ref

Obtains the predicted value of a reference set
.sits_select_raster_indexes

Provide a list of indexes to extract data from a raster-derived data table for classification
.sits_predict

Predict class based on the trained models
.sits_plot_together

Plot a set of time series for the same spatio-temporal reference
sits_data_to_csv

Export a sits tibble data to the CSV format
.sits_split_data

Split a data.table or a matrix for multicore processing
.sits_normalize_matrix

Normalize the time series values in the case of a matrix
.sits_num_samples

Find number of samples, given a timeline and an interval
samples_mt_ndvi

Samples of nine classes for the state of Mato Grosso for the NDVI band
.sits_scale_matrix_integer

Scale the time series values in the case of a matrix
sits_bands

Informs the names of the bands of a time series
sits_cluster_clean

Cluster cleaner
sits_plot_raster

Plot classified raster images
sits_plot_subgroups

Plot the patterns of subgroups
sits_subgroup

Create new groups from kohonen maps
sits_shp_to_csv

Export a shapefile with points to a CSV file for later processing
sits_dates

Return the dates of a sits tibble
sits_formula_linear

Define a linear formula for classification models
sits_formula_logref

Define a loglinear formula for classification models
sits_rfor

Train a sits classification model using a fast random forest algorithm
sits_sample

Sample a percentage of a time series
sits_tibble

Create a sits tibble to store the time series information
sits_sgolay

Smooth the time series using the Savitzky-Golay filter
.sits_tibble_prediction

Create an empty tibble to store the results of predictions
.sits_sample_distances

Sample a percentage of a time series distance matrix
sits_accuracy_area

Area-weighted post-classification accuracy assessment of classified maps
sits_align

Aligns dates of time series to a reference date
.sits_time_index

Create a list of time indexes from the dates index
.sits_scale_data

Scale the time series values in the case of a matrix
sits_kalman

Kalman filter
sits_keras_diagnostics

Provides access to diagnostic information about a Keras deep learning model
sits_linear_interp

Interpolation function of the time series in a sits tibble
sits_load_keras

Load a Keras model for processing in sits
sits_show_config

Shows the contents of the sits configuration file
sits_break

Breaks a set of time series into equal intervals
.sits_select_indexes

Provide a list of indexes to extract data from a distance table for classification
:=

Set by reference in data.table
sits_bayes_postprocess

Post-process a classified data raster with bayesian filter
sits_twdtw_classify

Find matches between a set of sits patterns and segments of sits tibble using TWDTW
sits_csv_error_file

Loads the CSV error file saved in the log directory
sits_coverage

Provides information about one coverage used to retrieve data
sits_envelope

Envelope filter
sits_to_xlsx

Saves the results of accuracy assessments as Excel files
sits_cluster_frequency

Cluster contingency table
sits_evaluate_samples

Evaluate samples
sits_from_zoo

Import time series in the zoo format to a sits tibble
sits_info_wtss

Provides information about WTSS service
sits_formula_smooth

Define a smoothing formula for classification models
sits_interp

Interpolation function of the time series of a sits_tibble
sits_kfold_validate

Cross-validate temporal patterns
sits_normalize_data

Normalize the time series in the given sits_tibble
sits_svm

Train a sits classification model using a support vector machine
sits_tasseled_cap

Builds tasseled cap bands
sits_patterns

Create temporal patterns using a generalised additive model (gam)
sits_kohonen

Clustering a set of satellite image time series using SOM
sits_metadata_to_csv

Export a sits tibble metadata to the CSV format
sits_prune

Checks that the timelines of all time series in a data set are equal
sits_qda

Train a sits classification model using quadratic discriminant analysis
sits_merge

Merge two satellite image time series
sits_select_bands_

Filter bands on a sits tibble
sits_mlr

Train a sits classification model using multinomial log-linear regression via neural networks
sits_mutate_bands

Add new sits bands.
sits_transmute_bands

Add new sits bands and drops existing.
sits_services

Provides information about the time series services available
sits_conf_matrix

Assessment of the accuracy of classification based on a confusion matrix
sits_values

Return the values of a given sits tibble as a list of matrices according to a specified format.
sits_savi

Builds soil-adjusted vegetation index
sits_show_debug

Prints the debug log
sits_save_keras

Save a Keras model for later processing in sits
sits_config

Reads a configuration file and loads it in the main environment
sits_info_services

Provides information about time series service
sits_lda

Train a sits classification model using linear discriminant analysis
sits_ndvi_arima_filter

NDVI filter with ARIMA model
sits_labels

Returns the information about labels of a sits tibble
sits_get_memory_bloat

Retrieve the estimated value of R memory bloat
sits_show_errors

Prints the error log
sits_whittaker

Smooth the time series using Whittaker smoother
sits_ndwi

Builds normalized difference water index
timeline_2000_2017

The timeline for the sequence of images for MOD13Q1 collection 6
sits_relabel

Relabels a sits tibble
timeline_modis_392

The timeline for the sequence of images for MOD13Q1 collection 5
sits_rename

Names of the bands of a time series
ts_zoo

A time series in the ZOO format
sits_select_bands

Filter bands on a sits tibble
sits_select

General selection criteria for subsetting a sits tibble
sits_train

Train sits classification models
sits_to_zoo

Export data to the zoo format