This function extracts metadata from XML filenames following the UFZ FTMS naming conventions. It parses elements like sample ID, position, date, and retention time, organizing them into a structured data.table.
extract_metadata_from_ufz_files(folder_path = NULL, file_type = NULL)
A data.table containing the metadata fields extracted from each filename. The columns are:
sample_id: Identifier for the sample.
sample_id_ufz: Identifier specific to UFZ's format, if available.
position: Position or condition identifier in the experiment.
date: Experiment date, formatted as Date.
segment: Segment information related to time or experiment phase.
ret_time: Retention time range within the segment.
file_long: Original filename after format adjustments.
file: Filename without the XML extension.
link_rawdata: Original filename as a link to raw data.
ID: Unique row identifier for each entry.
folder_path: (Optional) The path to the directory containing the XML files. If not provided, the user will be prompted to choose a file path interactively.
file_type: (Default: ".xml") The file extension of the files to read.
This function reads XML filenames from a specified folder and splits their components into structured metadata fields. It processes the filenames to ensure a consistent format by replacing an underscore preceding the 4-digit sample number with a hyphen. The function then extracts key information (e.g., sample ID, experiment date, retention time) based on the UFZ FTMS naming conventions and outputs a tidy data.table.
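The normalization step described above could look roughly like the following sketch. The helper name `normalize_ufz_name()` and the regular expression are assumptions made for illustration; the package's internal implementation may differ.

```r
# Hypothetical helper (not part of the package): replace the underscore in
# front of the 4-digit sample number with a hyphen.
normalize_ufz_name <- function(filename) {
  # Assumption: the sample number is the first run of exactly four digits
  # enclosed by underscores; sub() only changes that first match.
  sub("_([0-9]{4})_", "-\\1_", filename)
}

normalize_ufz_name("104B12_9557_RB3_10-12-2023_Segment1_1-2min.xml")
normalize_ufz_name("srfa_mcs_9554_GA2_10-12-2023_Segment1_1-2min.xml")
```

A sample name that itself contains a 4-digit block between underscores would be matched first, so a real implementation may need a stricter pattern.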
The expected filename format is as follows:
Standard: 104B12_9557_RB3_10-12-2023_Segment1_1-2min.xml
Exception with additional underscore in the first part: srfa_mcs_9554_GA2_10-12-2023_Segment1_1-2min.xml
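To illustrate how both filename variants can be split into fields, here is a minimal base-R sketch. It is not the package's implementation: the helper `parse_ufz_filename()` is hypothetical, it splits from the right instead of substituting a hyphen, it returns a plain data.frame rather than a data.table, and it assumes that the fields map to `sample_id`, `sample_id_ufz`, `position`, `date`, `segment`, and `ret_time` in that order, with the date written as day-month-year.

```r
# Hypothetical helper (not part of the package): parse one UFZ-style filename.
parse_ufz_filename <- function(filename) {
  stem  <- sub("\\.xml$", "", filename, ignore.case = TRUE)
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]
  n     <- length(parts)
  if (n < 6) stop("Unexpected filename format: ", filename)

  # The last five fields are fixed; everything before them is treated as the
  # sample name, which tolerates extra underscores such as "srfa_mcs".
  data.frame(
    sample_id     = paste(parts[seq_len(n - 5)], collapse = "_"),
    sample_id_ufz = parts[n - 4],   # assumed: the 4-digit sample number
    position      = parts[n - 3],
    date          = as.Date(parts[n - 2], format = "%d-%m-%Y"),  # assumed d-m-Y
    segment       = parts[n - 1],
    ret_time      = parts[n],
    file          = stem,
    stringsAsFactors = FALSE
  )
}

files <- c(
  "104B12_9557_RB3_10-12-2023_Segment1_1-2min.xml",
  "srfa_mcs_9554_GA2_10-12-2023_Segment1_1-2min.xml"
)
metadata <- do.call(rbind, lapply(files, parse_ufz_filename))
print(metadata)
```

Splitting from the right keeps the sketch independent of how many underscores the sample name contains, at the cost of requiring the five trailing fields to always be present.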
Other internal functions:
create_custom_formula_library(),
extract_aquisition_params(),
extract_aquisition_params_from_folder(),
read_xml_peaklist()