myTAI

Evolutionary Transcriptomics with R

library(myTAI)
# obtain an example phylo-expression object
data("example_phyex_set")
# plot away!
myTAI::plot_signature(example_phyex_set)  
myTAI::plot_contribution(example_phyex_set)
myTAI::plot_gene_space(example_phyex_set)

Detailed documentation provided here

Package summary

Using myTAI, any existing or newly generated transcriptome dataset can be combined with evolutionary information (find details here) to retrieve novel insights about the evolutionary conservation of the transcriptome at hand.

For the purpose of performing large scale evolutionary transcriptomics studies, the myTAI package implements the quantification, statistical assessment, and analytics functionality to allow researchers to study the evolution of biological processes by determining stages or periods of evolutionary conservation or variability in transcriptome data.

We hope that myTAI will become the community standard tool to perform evolutionary transcriptomics studies and we are happy to add required functionality upon request.

In recent years, a variety of studies have aimed to uncover the molecular basis of morphological innovation and variation from the evolutionary developmental perspective. These studies often rely on transcriptomic data to establish the molecular patterns driving the complex biological processes underlying phenotypic plasticity.

Although transcriptome information is a useful starting point for studying the molecular mechanisms underlying a biological process of interest (molecular phenotype), it rarely captures how these expression patterns emerged in the first place or to what extent they are constrained. It thereby neglects the evolutionary history and developmental constraints of genes contributing to the overall pool of expressed transcripts.

To overcome this limitation, the myTAI package introduces procedures summarized under the term evolutionary transcriptomics to integrate gene age information into classical gene expression analysis. Gene age inference can be performed with various existing software tools, but we recommend using GenEra or orthomap, since they address published shortcomings of gene age inference (see detailed discussion here). In addition, users can easily retrieve precomputed gene age information via our data package phylomapr.

Evolutionary transcriptomics studies can serve as a first approach to screen in silico for the potential existence of evolutionary and developmental constraints within a biological process of interest. This is achieved by quantifying transcriptome conservation patterns and their underlying gene sets in biological processes. The exploratory analysis functions implemented in myTAI provide users with a standardized, automated and statistically sound framework to detect and analyze patterns of evolutionary constraints in any transcriptome dataset of interest.
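One way to picture the statistical assessment is a permutation test: shuffle gene ages across genes, recompute the index profile, and ask how often the shuffled profile varies at least as much as the observed one. The following base R sketch uses simulated data and hypothetical helper names (`age_index()`, `permutation_p()`). It is a simplified stand-in for illustration, not the procedure implemented by the package's `stat_*` functions.

```r
set.seed(1)

# Simulated input (made up): 200 genes x 5 stages, gene ages as ranks 1..6.
n_genes <- 200
expr <- matrix(rexp(n_genes * 5, rate = 0.1), nrow = n_genes)
age  <- sample(1:6, n_genes, replace = TRUE)

# Hypothetical helper: expression-weighted mean gene age per stage.
age_index <- function(expr, age) colSums(expr * age) / colSums(expr)

# Hypothetical helper: empirical p-value, i.e. the share of age permutations
# whose index profile has a variance at least as large as the observed one.
permutation_p <- function(expr, age, n_perm = 500) {
  observed <- var(age_index(expr, age))
  null_var <- replicate(n_perm, var(age_index(expr, sample(age))))
  (sum(null_var >= observed) + 1) / (n_perm + 1)
}

p_value <- permutation_p(expr, age)
```

A small `p_value` would suggest that the profile varies more across stages than expected if gene age and expression were unrelated.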

Today, phenomena such as morphological mutations, diseases or developmental processes are primarily investigated on the molecular level using transcriptomics approaches. Transcriptomes denote the total number of quantifiable transcripts present at a specific stage in a biological process. In disease or developmental (defect) studies, transcriptomes are usually measured over several time points. In treatment studies aiming to quantify differences in the transcriptome due to biotic stimuli, abiotic stimuli, or diseases, treatment / disease transcriptomes are usually compared with non-treatment / non-disease transcriptomes. In either case, comparing changes in transcriptomes over time or between treatments allows us to identify genes and gene regulatory mechanisms that might be involved in governing the biological process under investigation. Although classic transcriptomics studies are based on an established methodology, little is known about the evolution and conservation mechanisms underlying such transcriptomes. Understanding the evolutionary mechanisms that change transcriptomes over time, however, might give us a new perspective on how diseases emerge in the first place or how morphological changes are triggered by changes in developmental transcriptomes.

Evolutionary transcriptomics aims to capture and quantify the evolutionary conservation of genes that contribute to the transcriptome during a specific stage of the biological process of interest. The resulting temporal conservation pattern then enables the detection of stages of development or other biological processes that are evolutionarily conserved (Drost et al., 2018). At the highest level, this quantification is achieved through transcriptome indices (e.g. the Transcriptome Age Index or the Transcriptome Divergence Index), which aim to quantify the average evolutionary age (Barrera-Redondo et al., 2023) or sequence conservation (Drost et al., 2015) of genes that contribute to the transcriptome at a particular stage. In general, evolutionary transcriptomics can be used as a method to quantify the evolutionary conservation of transcriptomes at particular developmental stages and to investigate how transcriptomes underlying biological processes are constrained or channeled due to events in evolutionary history (Dollo's law) (Drost et al., 2017).
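As a concrete illustration of such an index, the Transcriptome Age Index of a stage can be written as the expression-weighted mean of gene ages. The sketch below uses base R only and made-up values; `toy_tai()` and the toy data are hypothetical and not part of the myTAI API, which provides its own `TAI()` and related functions for real analyses. The same weighted mean applies when gene age ranks are swapped for sequence-divergence ranks.

```r
# Toy data (made up): 4 genes (rows) x 3 stages (columns).
expr <- matrix(
  c(10, 20, 10,
     5,  5,  5,
     1,  2,  8,
     4,  3,  7),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("early", "mid", "late"))
)

# Gene age as phylostratum rank: 1 = oldest, larger = younger.
ps <- c(gene1 = 1, gene2 = 1, gene3 = 2, gene4 = 3)

# Hypothetical helper: expression-weighted mean of gene age per stage,
# TAI_s = sum_i(ps_i * e_is) / sum_i(e_is)
toy_tai <- function(expr, ps) {
  colSums(expr * ps) / colSums(expr)
}

tai <- toy_tai(expr, ps)

# Contribution of each phylostratum to the index;
# within each stage the contributions add up to the TAI value.
rel_expr <- sweep(expr, 2, colSums(expr), "/")
contrib  <- rowsum(rel_expr * ps, group = ps)

print(round(tai, 3))
```

In this toy example the "late" stage has the highest value because the younger genes (ranks 2 and 3) carry a larger share of its expression.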

Please note: since myTAI relies on gene age inference, and there has been an extensive debate in recent years about the best approaches to gene age inference, please follow my updated discussion of the gene age inference literature. With GenEra, we addressed all previously raised issues and we encourage users to run GenEra when aiming to infer gene ages for further myTAI analyses.

Installation

Install myTAIv2:

install.packages("myTAI", dependencies = TRUE)

To install the old version of myTAI and access the old vignettes, run:

devtools::install_github("drostlab/myTAI@v1.0")

Citation

Please cite the following paper when using myTAI for your own research. This will allow us to continue working on this software tool and will motivate us to extend its functionality and usability in the coming years. Many thanks in advance!

Drost et al. myTAI: evolutionary transcriptomics with R. Bioinformatics 2018, 34 (9), 1589-1590. doi:10.1093

  • Evolutionary trends in the emergence of skeletal cell types

A Damatac II, S Koska, KK Ullrich, T Domazet-Lošo, A Klimovich, M Kaucká… - Evolution Letters, 2025

  • Phylostratigraphic analysis revealed that ancient ohnologue PtoWRKY53 innovated a vascular transcription regulatory network in Populus

W Huang, M Quan, W Qi, L Xiao, Y Fang, J Zhou… - New Phytologist, 2025

  • Pra-GE-ATLAS: Empowering Pinus radiata stress and breeding research through a multi-omics database

V Roces, MJ Cañal, JL Mateo, L Valledor… - Journal of Integrative Plant Biology, 2025

  • Developmental phylotranscriptomics in grapevine suggests an ancestral role of somatic embryogenesis

S Koska, D Leljak-Levanic, N Malenica, K Bigovic Villi… - Communications Biology, 2025

  • Proteomic analyses reveal the key role of gene co-option in the evolution of the scaly-foot snail scleritome

WC Wong, YH Kwan, X He, C Chen, S Xiang, Y Xiao… - Communications Biology, 2025

  • Genome assembly of Stewartia sinensis reveals origin and evolution of orphan genes in Theaceae

L Cheng, Q Han, Y Hao, Z Qiao, M Li, D Liu… - Communications Biology, 2025

  • A transcriptomic hourglass in brown algae

JS Lotharukpong, M Zheng, R Luthringer, D Liesner, H-G Drost, SM Coelho - Nature, 2024

  • Genome assemblies of 11 bamboo species highlight diversification induced by dynamic subgenome dominance

PF Ma, YL Liu, C Guo, G Jin, ZH Guo, L Mao, YZ Yang… - Nature Genetics, 2024

  • Hemichordate cis-regulatory genomics and the gene expression dynamics of deuterostomes

A Pérez-Posada, CY Lin, TP Fan, CY Lin, YC Chen… - Nature Ecology & Evolution, 2024

  • Comparison between 16S rRNA and shotgun sequencing in colorectal cancer, advanced colorectal lesions, and healthy human gut microbiota

D Bars-Cortina, E Ramon, B Rius-Sansalvador… - BMC genomics, 2024

  • Heat stress reprograms herbivory-induced defense responses in potato plants

J Zhong, J Zhang, Y Zhang, Y Ge, W He, C Liang… - BMC Plant Biology, 2024

  • The transcriptomic signature of adaptations associated with perfume collection in orchid bees

K Darragh, SR Ramírez - Journal of Evolutionary Biology, 2024

  • Proteomic dynamics revealed sex‐biased responses to combined heat‐drought stress in Marchantia

S Guerrero, V Roces, L García‐Campa, L Valledor… - Journal of Integrative Plant Biology, 2024

  • Evolution of gene networks underlying adaptation to drought stress in the wild tomato Solanum chilense

K Wei, S Sharifova, X Zhao, N Sinha, H Nakayama… - Molecular Ecology, 2024

  • Conserved and specific gene expression patterns in the embryonic development of tardigrades

C Li, Z Yang, X Xu, L Meng, S Liu, D Yang - Evolution & Development, 2024

  • The functions and factors governing fungal communities and diversity in agricultural waters: insights into the ecosystem services aquatic mycobiota provide

P Pham, Y Shi, I Khan, M Sumarah, J Renaud… - Frontiers in Microbiology, 2024

  • An evolutionary timeline of the oxytocin signaling pathway

AM Sartorius, J Rokicki, S Birkeland, F Bettella, C Barth… - Communications Biology, 2024

  • The Evolution of Foraging Webs is Associated with Young Genes in Araneoidea Spiders

A Jia, T Yang, W Hu, S Ma, Z Zhang, Y Wang - Available at SSRN 4383994

  • Multiplexed transcriptomic analyses of the plant embryonic hourglass

H Wu, R Zhang, KJ Niklas, MJ Scanlon - BioRxiv, 2024

  • Brachiopod genome unveils the evolution of the BMP–Chordin network in bilaterian body patterning

TD Lewin, K Shimizu, IJY Liao, ME Chen, K Endo… - BioRxiv, 2024

  • The angiosperm seed life cycle follows a developmental reverse hourglass

AA Sami, L Bentsink, MAS Artur - BioRxiv, 2024

  • Evolutionary trends in the emergence of skeletal cell types

A Damatac, S Koska, KK Ullrich, T Domazet-Loso… - BioRxiv, 2024

  • Transcriptome age of individual cell types in Caenorhabditis elegans

F Ma, C Zheng - Proceedings of the National Academy of Sciences, 2023

  • Single-cell atlases of two lophotrochozoan larvae highlight their complex evolutionary histories

L Piovani, DJ Leite, LA Yañez Guerra, F Simpson… - Science Advances, 2023

  • oggmap: a Python package to extract gene ages per orthogroup and link them with single-cell RNA data

KK Ullrich, NE Glytnasi - Bioinformatics, 2023

  • Discovery of putative long non-coding RNAs expressed in the eyes of Astyanax mexicanus (Actinopterygii: Characidae)

I Batista da Silva, D Aciole Barbosa, KF Kavalco… - Scientific Reports, 2023

  • An ancient split of germline and somatic stem cell lineages in Hydra

C Nishimiya-Fujisawa, H Petersen, TC-T Koubková-Yu, C Noda, S Shigenobu, J Bageritz, T Fujisawa, O Simakov, S Kobayashi, TW Holstein - BioRxiv, 2023

  • Oxytocin receptor expression patterns in the human brain across development

J Rokicki, T Kaufmann, A-MG de Lange, D van der Meer, S Bahrami, AM Sartorius, UK Haukvik, NE Steen, E Schwarz, DJ Stein, T Nærland, OA Andreassen, LT Westlye, DS Quintana - Neuropsychopharmacology, 2022

  • The Phylotranscriptomic Hourglass Pattern in Fungi: An Updated Model

Y Xie, HS Kwan, PL Chan, WJ Wu, J Chiou, J Chang - BioRxiv, 2022

  • Embryo-Like Features in Developing Bacillus subtilis Biofilms

M Futo, L Opašić, S Koska, N Čorak, T Široki, V Ravikumar, A Thorsell, M Lenuzzi, D Kifer, M Domazet-Lošo, K Vlahoviček, I Mijakovic, T Domazet-Lošo - Molecular Biology and Evolution, 2021

  • New Genes Interacted With Recent Whole-Genome Duplicates in the Fast Stem Growth of Bamboos

G Jin, P-F Ma, X Wu, L Gu, M Long, C Zhang, DZ Li - Molecular Biology and Evolution, 2021

  • Evolutionary transcriptomics of metazoan biphasic life cycle supports a single intercalation origin of metazoan larvae

J Wang, L Zhang, S Lian, Z Qin, X Zhu, X Dai, Z Huang et al. - Nature Ecology & Evolution, 2020

  • Pervasive convergent evolution and extreme phenotypes define chaperone requirements of protein homeostasis

Y Draceni, S Pechmann - Proceedings of the National Academy of Sciences, 2019

  • Reconstructing the transcriptional ontogeny of maize and sorghum supports an inverse hourglass model of inflorescence development

S Leiboff, S Hake - Current Biology, 2019

  • The Transcriptional Landscape of Polyploid Wheats and their Diploid Ancestors during Embryogenesis and Grain Development

D Xiang, TD Quilichini, Z Liu, P Gao, Y Pan et al. - The Plant Cell, 2019

  • A unicellular relative of animals generates a layer of polarized cells by actomyosin-dependent cellularization

O Dudin, A Ondracka, X Grau-Bové, AAB Haraldsen et al. - eLife, 2019

  • Gene Expression Does Not Support the Developmental Hourglass Model in Three Animals with Spiralian Development

L Wu, KE Ferger, JD Lambert - Molecular Biology and Evolution, 2019

  • Phylostratr: a framework for phylostratigraphy

Z Arendsee, J Li, U Singh, A Seetharam et al. - Bioinformatics, 2019

  • Algorithms for synteny-based phylostratigraphy and gene origin classification

Z Arendsee - 2019

  • Elucidating the endogenous synovial fluid proteome and peptidome of inflammatory arthritis using label-free mass spectrometry

SM Mahendran, EC Keystone, RJ Krawetz et al. - Clinical proteomics, 2019

  • Environmental DNA reveals landscape mosaic of wetland plant communities

ME Shackleton, GN Rees, G Watson et al. - Global Ecology and Conservation, 2019

  • Developmental constraints on genome evolution in four bilaterian model species

J Liu, M Robinson-Rechavi - Genome Biology and Evolution, 2018

  • Mapping selection within Drosophila melanogaster embryo's anatomy

I Salvador-Martínez et al. - Molecular Biology and Evolution, 2017

  • Distribution and diversity of enzymes for polysaccharide degradation in fungi

R Berlemont - Scientific reports, 2017

  • The origins and evolutionary history of human non-coding RNA regulatory networks

M Sherafatian, SJ Mowla - Journal of bioinformatics and computational biology, 2017

  • High expression of new genes in trochophore enlightening the ontogeny and evolution of trochozoans

F Xu, T Domazet-Lošo, D Fan, TL Dunwell, L Li et al. - Scientific reports, 2016

  • Evidence for active maintenance of phylotranscriptomic hourglass patterns in animal and plant embryogenesis

HG Drost, A Gabel, I Grosse, M Quint - Molecular Biology and Evolution, 2015

NEWS

The current status of the package as well as a detailed history of the functionality of each version of myTAI can be found in the NEWS section.

Tutorials

The following tutorials will provide use cases and detailed explanations of how to quantify transcriptome conservation with myTAI and how to interpret the results generated with this software tool.

Main:

Advanced:

Users can also read the tutorials within RStudio:

# load the myTAI package
library(myTAI)

# look for all tutorials (vignettes) available in the myTAI package
# this will open your web browser
browseVignettes("myTAI")

Object classes in myTAI

Workflow to load your own dataset:

bulk
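For bulk data, the starting point is typically a table with one row per gene. The sketch below builds such a table in base R, assuming a layout with the phylostratum in the first column, the gene identifier in the second, and one expression column per stage; all values and column names are made up. The final call to `BulkPhyloExpressionSet_from_df()` is guarded and assumes that the function accepts this data frame as its first argument with default settings, which should be checked against the package documentation.

```r
# Made-up bulk table: gene age rank, gene ID, then one column per stage.
bulk_df <- data.frame(
  Phylostratum = c(1, 1, 2, 3),
  GeneID       = c("geneA", "geneB", "geneC", "geneD"),
  stage1       = c(10, 5, 1, 4),
  stage2       = c(20, 5, 2, 3),
  stage3       = c(10, 5, 8, 7)
)

# Basic sanity checks before handing the table to any downstream tool:
# unique gene IDs and numeric, non-negative expression columns.
stopifnot(
  !anyDuplicated(bulk_df$GeneID),
  all(vapply(bulk_df[, -(1:2)], is.numeric, logical(1))),
  all(bulk_df[, -(1:2)] >= 0)
)

# Assumption: the converter takes the data frame as its first argument.
# The call only runs when myTAI is installed.
if (requireNamespace("myTAI", quietly = TRUE)) {
  phyex_set <- myTAI::BulkPhyloExpressionSet_from_df(bulk_df)
}
```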

bulk with replicates

single cell

From expression matrix

Or directly from a Seurat object:

Discussions and Bug Reports

We would be very happy to learn more about potential improvements to the concepts and functions provided in this package.

Furthermore, if you find bugs or need additional (more flexible) functionality in parts of this package, please let us know:

https://github.com/drostlab/myTAI/issues

Monthly Downloads: 424

Version: 2.3.4

License: GPL-2

Maintainer: Hajk-Georg Drost

Last Published: November 11th, 2025

Functions in myTAI (2.3.4)

  • TXI_std_dev: Standard Deviation for TXI
  • TXI_conf_int: Confidence Intervals for Transcriptomic Index (TXI)
  • TXI: Calculate Transcriptomic Index (TXI)
  • TDI: Calculate Transcriptomic Divergence Index (TDI)
  • TI_map: Transcriptomic Index Name Mapping
  • TAI: Calculate Transcriptomic Age Index (TAI)
  • TEI: Calculate Transcriptomic Evolutionary Index (TEI)
  • TPI: Calculate Transcriptomic Polymorphism Index (TPI)
  • age.apply: Age Category Specific apply Function
  • TestResult: Test Result S7 Class
  • conf_int: Calculate Confidence Intervals for Test Result
  • check_ScPhyloExpressionSet: Check if object is a ScPhyloExpressionSet
  • consensus: Calculate Consensus Gene Set
  • collapse: Collapse PhyloExpressionSet Replicates
  • as_data_frame: Convert BulkPhyloExpressionSet to Data Frame
  • as_BulkPhyloExpressionSet: as_BulkPhyloExpressionSet
  • check_PhyloExpressionSet: Check if object is a PhyloExpressionSet
  • .collapse_replicates: Collapse Expression Data Across Replicates
  • check_BulkPhyloExpressionSet: Check if object is a BulkPhyloExpressionSet
  • diagnose_test_robustness: Diagnose Test Robustness
  • destroy_pattern: Destroy Phylotranscriptomic Pattern Using GATAI
  • .TXI_sc_adaptive: Adaptive TXI calculation for single cell expression
  • convergence_plots: Create Convergence Plots for GATAI Analysis
  • .TXI_sc: Calculate TXI for single cell expression sparse matrix.
  • cpp_txi_sc: Calculate TXI for Single-Cell Expression Data (C++ Implementation)
  • .compute_reduction: Compute Dimensional Reduction
  • distributions: Predefined Distribution Objects
  • downsample: Downsample ScPhyloExpressionSet
  • .fit_normal: Fit Normal Distribution Parameters
  • .memo_generate_conservation_txis: Memoized Null Conservation TXI Generation
  • .TXI: Calculate TXI for Raw Expression Data
  • .fit_gamma: Fit Gamma Distribution Parameters
  • .plot_gene_heatmap_impl: Shared Gene Heatmap Implementation
  • .prepare_sc_plot_data: Prepare Single-Cell Plot Data
  • .pseudobulk_expression: Create Pseudobulk Expression Data
  • .to_std_expr: Standardise Expression Data
  • genes_lowly_expressed: Select Lowly Expressed Genes
  • .pTXI: Calculate pTXI for Raw Expression Data
  • genes_filter_dynamic: Filter Dynamic Expression Genes
  • example_phyex_set: Example phyex set
  • example_phyex_set_old: Example phyex set old
  • downsample_expression: Downsample Expression Matrix by Groups
  • exp_p: Format P-Value for Scientific Notation
  • .get_expression_from_seurat: Get Expression Matrix from Seurat Object
  • ec_score: Early Conservation Score Function
  • .get_p_value: Calculate P-Value from Distribution
  • early_gene: Early Expression Pattern
  • example_phyex_set_sc: Load Example Single-Cell PhyloExpressionSet
  • late_gene: Late Expression Pattern
  • match_map_sc_matrix: Match Expression Matrix with Phylostratum Map
  • match_map: Match Gene Expression Data with Phylostratum Map
  • gatai_animate_destruction: Animate GATAI Destruction Process
  • get_angles: Calculate Gene Expression Angles
  • lc_score: Late Conservation Score Function
  • full_gatai_convergence_plot: Create Full GATAI Convergence Plot
  • goodness_of_fit: Goodness of Fit Test
  • genes_top_expr: Gene Expression Filtering Functions
  • normalise_stage_expression: Normalise Stage Expression Data
  • genes_top_mean: Select Top Mean Expressed Genes
  • new_required_property: Create S7 Required Property
  • geom_mean: Geometric Mean
  • genes_top_variance: Select Top Variable Genes
  • match_map_sc_seurat: Match Single-Cell Expression Data with Phylostratum Map (Seurat)
  • pTXI: Calculate Phylostratum-Specific Transcriptomic Index
  • omit_matrix: Compute TXI Profiles Omitting Each Gene
  • mid_gene: Mid Expression Pattern
  • mod_pi: Modulo Pi Function
  • pair_score: Pairwise Score Function
  • permute_PS: Permute Strata in PhyloExpressionSet
  • new_options_property: Create S7 Options Property
  • plot_distribution_pTAI: Partial TAI Distribution Plotting Functions
  • plot_distribution_strata: Plot Distribution of Genes Across Phylostrata
  • plot_distribution_pTAI_qqplot: QQ plot comparing partial TAI distributions across developmental stages against a reference stage
  • plot_distribution_expression: Comparing expression levels distributions across developmental stages
  • plot_cullen_frey: Plot Cullen-Frey Diagram for Distribution Assessment
  • petal_plot: Create Petal Plot for Gene Removal Analysis
  • plot_gene_profiles: Plot Individual Gene Expression Profiles
  • plot_contribution: Plot Phylostratum Contribution to Transcriptomic Index
  • plot_gene_heatmap: Plot Gene Expression Heatmap
  • plot_gatai_results: Plot Comprehensive GATAI Results
  • plot_signature_transformed: Plot Signature Under Different Transformations
  • plot_sample_space: Plot Sample Space Visualization
  • plot_null_txi_sample: Plot Null TXI Sample Distribution
  • plot_gene_space: Plot Gene Space Using PCA
  • plot_relative_expression_line: Plot Relative Expression Profiles (Line Plot)
  • plot_signature_gene_quantiles: Plot Signature Across Gene Expression Quantiles
  • plot_signature: Plot Transcriptomic Signature
  • plot_relative_expression_bar: Plot Mean Relative Expression Levels as Barplot
  • plot_mean_var: Plot Mean-Variance Relationship
  • plot_signature_multiple: Plot Multiple Transcriptomic Signatures
  • relative_expression: Relative Expression Functions
  • quantile_rank: Calculate Quantile Ranks
  • plot_strata_expression: Plot Expression Levels by Phylostratum
  • rel_exp_matrix: Compute Relative Expression Matrix for PhyloExpressionSet
  • reverse_hourglass_score: Reverse Hourglass Score Function
  • remove_genes: Remove Genes from PhyloExpressionSet
  • rename_phyex_set: Rename a PhyloExpressionSet
  • save_gatai_results_pdf: Save GATAI Analysis Results to PDF
  • reductive_hourglass_score: Reductive Hourglass Score Function
  • rev_mid_gene: Reverse Mid Expression Pattern
  • print.TestResult: Print Method for TestResult
  • select_genes: Select Genes from PhyloExpressionSet
  • sTXI: Calculate Stratum-Specific Transcriptomic Index
  • stat_generic_conservation_test: Generic Conservation Test Framework
  • stat_flatline_test: Flat Line Test for Conservation Pattern
  • stat_generate_conservation_txis: Generate Null Conservation TXI Distribution
  • rowVars: Row-wise Variance Calculation
  • stat_late_conservation_test: Late Conservation Test
  • set_expression: Gene Expression Transformation Functions
  • stat_early_conservation_test: Early Conservation Test
  • transform_counts: Transform Expression Counts in PhyloExpressionSet
  • stat_reverse_hourglass_test: Reverse Hourglass Test
  • taxid: Retrieve taxonomy categories from NCBI Taxonomy
  • tf: Short Alias for Transform Counts
  • stat_reductive_hourglass_test: Reductive Hourglass Test
  • strata_enrichment: Calculate Phylostratum Enrichment
  • tf_PS: Transform Phylostratum Values
  • tf_stability: Perform Permutation Tests Under Different Transformations
  • threshold_comparison_plots: Create Threshold Comparison Plots
  • stat_pairwise_test: Pairwise Conservation Test
  • BulkPhyloExpressionSet_from_df: Convert Data to BulkPhyloExpressionSet
  • Distribution: Distribution S7 Class
  • PS_colours: Generate Phylostratum Colors
  • ConservationTestResult: Conservation Test Result S7 Class
  • ScPhyloExpressionSet_from_seurat: Convert Seurat Object to Single-Cell PhyloExpressionSet
  • BulkPhyloExpressionSet: Bulk PhyloExpressionSet Class
  • PhyloExpressionSetBase: PhyloExpressionSet Base Class
  • COUNT_TRANSFORMS: Count Transformation Functions
  • ScPhyloExpressionSet_from_matrix: Create Single-Cell PhyloExpressionSet from Expression Matrix
  • ScPhyloExpressionSet: Single-Cell PhyloExpressionSet Class