The microbiome data contain paired DNA samples from before treatment and 21 months after treatment for helminth infections (Martin et al. 2019).
The data were analysed by Martin et al. (2019) and a further subset was studied by Scealy and Wood (2023).
The data are from a study into the effect of helminth infections on the course of malaria infections (ImmunoSPIN-Malaria) in the Nangapanda subdistrict, Indonesia (Wiria et al. 2010).
As part of the study, some participants were given 400mg of albendazole every three months for 1.5 years, and the remaining participants were given a placebo (Wiria et al. 2010).
microbiome
A data frame with 300 rows (two rows per individual) and 31 columns:
IndividualID: An integer uniquely specifying the individual.
Year: The collection year for the sample. 2008 for before treatment. 2010 for after treatment.
Sex: 1 if female, 0 otherwise.
Treatment: TRUE if the individual was given 400mg of albendazole every three months for 1.5 years, FALSE otherwise.
Age: Age at first sample.
ct_Al: A helminth measurement: the qPCR cycle threshold (CT) for Ascaris lumbricoides (large roundworm). Ascaris lumbricoides can be considered present if the value is 30 or less.
ct_Na: A helminth measurement: the qPCR cycle threshold (CT) for Necator americanus (a hookworm). Necator americanus can be considered present if the value is 30 or less.
ct_Ad: A helminth measurement: the qPCR cycle threshold (CT) for Ancylostoma duodenale (a hookworm). Ancylostoma duodenale can be considered present if the value is 30 or less.
micr_Tt: A helminth measurement: the presence of Trichuris trichiura as determined by microscopy. A value of TRUE means Trichuris trichiura was detected.
Helminth: A helminth measurement: TRUE if any of the above helminths were detected, otherwise FALSE.
Remaining columns: Count prevalence of 18 bacterial phyla and 2 unclassified columns.
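The presence rule for the qPCR columns (a cycle threshold of 30 or less) can be illustrated as follows. This is a minimal sketch on a small made-up data frame, toy_ct, whose values are hypothetical; only the column names ct_Al, ct_Na, ct_Ad and micr_Tt and the threshold of 30 come from the description above. Missing CT values are not handled here.

```r
# Hypothetical values for illustration only; not taken from the real data.
toy_ct <- data.frame(
  ct_Al   = c(30.0, 38.0, 30.5, 36.4),
  ct_Na   = c(40.0, 29.5, 35.2, 33.0),
  ct_Ad   = c(40.0, 40.0, 31.0, 37.7),
  micr_Tt = c(FALSE, FALSE, FALSE, TRUE)
)

ct_threshold <- 30  # a helminth is considered present if the CT is 30 or less

# One logical presence indicator per helminth
present <- data.frame(
  Al = toy_ct$ct_Al <= ct_threshold,
  Na = toy_ct$ct_Na <= ct_threshold,
  Ad = toy_ct$ct_Ad <= ct_threshold,
  Tt = toy_ct$micr_Tt
)

# TRUE if any of the four helminths was detected in the sample
present$any <- present$Al | present$Na | present$Ad | present$Tt
present
```

A CT of exactly 30 counts as present because the rule is "30 or less", which is why the comparison uses <= rather than <.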
The microbiome data were created from the file S1_Table.xlsx hosted on Nematode.net at http://nematode.net/Data/environmental_interaction/S1_Table.xlsx using the code below.
# Read only the metadata and phylum-level columns; this range avoids the genus data
microbiome <- readxl::read_excel("S1_Table.xlsx",
                                 range = "A3:AE303")
# The names of the metadata columns (A to J) are stored in row 2 of the sheet
metacolnames <- readxl::read_excel("S1_Table.xlsx",
                                   range = "A2:J2",
                                   col_names = FALSE)
colnames(microbiome)[1:ncol(metacolnames)] <- unlist(metacolnames[1, ])
colnames(microbiome)[2] <- "Year"
# Column 11: TRUE if any helminth was detected (CT of 30 or less, or by microscopy)
microbiome[, 11] <- (microbiome$ct_Al <= 30) | (microbiome$ct_Na <= 30) |
  (microbiome$ct_Ad <= 30) | (microbiome$ct_St <= 30) |
  (microbiome$micr_Tt == 1)
colnames(microbiome)[11] <- "Helminth"
# Convert columns 1 to 3 and the count columns to integer, and the flags to logical
microbiome <- microbiome |>
  dplyr::mutate(dplyr::across(c(1, 2, 3, 12:31), as.integer)) |>
  dplyr::mutate(micr_Tt = as.logical(micr_Tt),
                Treatment = as.logical(Treatment)) |>
  dplyr::rename(IndividualID = `Individual ID`)
microbiome <- as.data.frame(microbiome)
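Because the data are paired, one quick structural check after creating the data frame is that every individual has exactly one 2008 row and one 2010 row. The following is a sketch only: check_pairs is a hypothetical helper (not part of any package), and it is demonstrated on a small made-up data frame, toy_pairs, rather than on the real data.

```r
# Hypothetical helper: TRUE if every IndividualID has exactly one row for
# each of the given collection years and there are no rows for other years.
check_pairs <- function(df, years = c(2008L, 2010L)) {
  counts <- table(df$IndividualID, factor(df$Year, levels = years))
  all(counts == 1L) && all(df$Year %in% years)
}

# Made-up stand-in data with two individuals
toy_pairs <- data.frame(
  IndividualID = c(1L, 1L, 2L, 2L),
  Year         = c(2008L, 2010L, 2008L, 2010L)
)

check_pairs(toy_pairs)         # both individuals have both years
check_pairs(toy_pairs[-4, ])   # individual 2 is missing the 2010 row
```

With the real data, the same call would be check_pairs(microbiome), assuming the column names IndividualID and Year produced by the creation code.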
The measurements in the data come from stool samples taken before and after treatment. Gut microbiome prevalence was measured using 16S rRNA 454 sequencing (Martin et al. 2019). Helminth infections were detected by PCR or microscopy (Martin et al. 2019).
The subset studied by Scealy and Wood (2023) contained only the measurements from before treatment, and only those individuals with a helminth infection. These measurements can be obtained by running
microbiome[(microbiome$Year == 2008) & microbiome$Helminth, ]
Two further individuals (IndividualID of 2079 and 2280) were deemed outliers by Scealy and Wood (2023).
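Combining the pre-treatment, helminth-positive subset with the removal of the two outliers could look like the following sketch. It assumes a data frame with the Year, Helminth and IndividualID columns described earlier. The data frame toy_mb is a made-up stand-in with hypothetical values (only the IDs 2079 and 2280 come from the text), so that the snippet runs without the real data.

```r
# Made-up stand-in for the microbiome data; values are hypothetical.
toy_mb <- data.frame(
  IndividualID = c(2001L, 2001L, 2079L, 2079L, 2280L, 2280L, 2300L, 2300L),
  Year         = rep(c(2008L, 2010L), times = 4),
  Helminth     = c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

# The two individuals deemed outliers
outlier_ids <- c(2079L, 2280L)

# Keep pre-treatment rows with a helminth infection, excluding the outliers
keep <- (toy_mb$Year == 2008) & toy_mb$Helminth &
  !(toy_mb$IndividualID %in% outlier_ids)
pre_treatment <- toy_mb[keep, ]
pre_treatment
```

Replacing toy_mb with microbiome applies the same filter to the real data.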