A list with two dataframes. The first, topic_weights, has the individual id as its first column; the remaining columns are the patient topic weights mapped to the topic loadings.
The second, incidence_weight_sum, contains eid and the cumulative topic weights across all disease diagnoses.
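The returned structure can be explored as in the sketch below. It does not call the function; it builds a toy stand-in list so the snippet runs on its own. The values, the topic column names (topic1, topic2), and the use of eid as the id column name in topic_weights are assumptions made for illustration only.

```r
# Toy stand-in for the returned list. All numbers and the topic column
# names are made up; only the two element names come from the description.
res <- list(
  topic_weights = data.frame(
    eid = c(1, 2),            # individual id (column name assumed)
    topic1 = c(0.7, 0.2),
    topic2 = c(0.3, 0.8)
  ),
  incidence_weight_sum = data.frame(
    eid = c(1, 2),
    topic1 = c(2.1, 0.4),
    topic2 = c(0.9, 1.6)
  )
)

# Drop the id column to get a numeric matrix of per-individual topic weights.
weights <- as.matrix(res$topic_weights[, -1])
rownames(weights) <- res$topic_weights$eid

# Column index of the highest-weight topic for each individual.
top_topic <- apply(weights, 1, which.max)

# Join the two tables on the id column to view weights next to cumulative sums.
combined <- merge(res$topic_weights, res$incidence_weight_sum,
                  by = "eid", suffixes = c("_weight", "_cumulative"))
```

With real output, the same accessors (res$topic_weights, res$incidence_weight_sum) would apply, provided the id column carries the same name in both dataframes.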
Arguments
data
the set of diseases, formatted in the same way as HES_age_example
ds_list
a list of diseases corresponding to the topic loadings that patients are mapped to,
formatted in the same way as UKB_349_disease; defaults to UKB_349_disease.
topics
The topics used to map patients. Defaults to UKB_HES_10topics,
which are the inferred topics from 349 Phecodes from the UK Biobank HES data.
Details of these topics are available in the paper "Age-dependent topic modelling of
comorbidities in UK Biobank identifies disease subtypes with differential genetic risk".