The function performs a number of checks on the two main datasets used
as input data, to make sure they are formatted in a way suitable
for the other functions in the package. This should make it easier for
users to construct datasets correctly before starting analyses.
Two datasets are needed to use the full set of analyses included in
the package, and these can be checked for formatting issues.
The first dataset should contain data on the proportions
of different compounds (columns) in different samples (rows).
Note that all calculations of diversity, and most calculations of
dissimilarity, are performed only on relative, rather than absolute,
values. The second dataset should be a data frame with three columns
containing the compound name, SMILES and InChIKey of each compound
present in the first dataset. See
chemodiv for details on obtaining SMILES and InChIKey IDs.
For compound names, avoid starting with a number, avoid using Greek letters
or special characters, and ensure there are no trailing spaces.
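
As an illustration of the layout described above, the following is a minimal sketch in base R of the two datasets and of a few manual checks. The object names (sampleData, compoundData, sampleProp), the column names, the helper flagName and the peak-area values are assumptions made for this example. The sketch does not call the function documented here, and the checks are not necessarily the ones the function performs.

```r
# Hypothetical sample data: rows are samples, columns are compounds.
# The values are made-up absolute amounts (e.g. peak areas).
sampleData <- data.frame(
  limonene = c(120, 80),
  linalool = c(30, 70)
)

# Hypothetical compound data: one row per compound, with the compound
# name, SMILES and InChIKey in three columns (column names are assumed).
compoundData <- data.frame(
  compound = c("limonene", "linalool"),
  smiles   = c("CC1=CCC(CC1)C(=C)C", "CC(=CCCC(C)(C=C)O)C"),
  inchikey = c("XMGQYMWWDOXHJM-UHFFFAOYSA-N", "CDOSHBSSFJOMGT-UHFFFAOYSA-N")
)

# Convert absolute values to proportions within each sample (row).
sampleProp <- sampleData / rowSums(sampleData)

# Every compound column in the sample data should be listed in the
# compound data.
namesMatch <- all(colnames(sampleProp) %in% compoundData$compound)

# Hypothetical helper: flag names that start with a digit, contain
# non-ASCII characters (such as Greek letters), or have leading or
# trailing whitespace. Other special characters are not checked here.
flagName <- function(x) {
  startsWithDigit <- grepl("^[0-9]", x)
  nonAscii <- vapply(
    x,
    function(n) any(utf8ToInt(enc2utf8(n)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
  hasOuterSpace <- x != trimws(x)
  startsWithDigit | nonAscii | hasOuterSpace
}

testNames <- c("1-octanol", "\u03b1-pinene", "linalool ", "limonene")
badName <- flagName(testNames)
```

In this sketch, the first three entries of testNames are flagged and the last is not, and the rows of sampleProp each sum to one.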