For a list of records, construct a data.frame for insertion into an SQL database.
gb_df_generate(records, min_length = 0, max_length = NULL)
records: character, vector of GenBank records in text format
min_length: Minimum sequence length, default 0.
max_length: Maximum sequence length, default NULL.
data.frame
The resulting data.frame has five columns: accession, organism, raw_definition, raw_sequence, raw_record. The prefix 'raw_' indicates that the data has been converted to the raw format (see ?charToRaw) in order to save on RAM. The raw_record column contains the entire GenBank record in text format.
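The raw conversion described above can be illustrated with base R alone. This is a minimal sketch, not the package's implementation: the accession, organism, definition and sequence values are made up, and holding the raw vectors in list columns via I() is an assumption about the layout.

```r
# Hypothetical toy values for illustration; not real GenBank data.
definition <- "Example definition line"
sequence <- "ATGCATGCAT"

# charToRaw() converts a length-one character string to a raw vector of bytes.
raw_definition <- charToRaw(definition)
raw_sequence <- charToRaw(sequence)

# Raw vectors vary in length between records, so here they are stored as
# list columns; I() stops data.frame() from expanding the list (assumed layout).
df <- data.frame(
  accession = "XX000001",
  organism = "Example organism",
  raw_definition = I(list(raw_definition)),
  raw_sequence = I(list(raw_sequence)),
  stringsAsFactors = FALSE
)

# rawToChar() reverses the conversion and returns the original text.
recovered <- rawToChar(df$raw_sequence[[1]])
```

Reading a value back always goes through rawToChar(), so any code that consumes the 'raw_' columns needs that extra step.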
Use min_length and max_length to minimise the size of the database. Every sequence must be at least min_length long and no longer than max_length, unless max_length is NULL, in which case there is no maximum length.
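The length rule above can be sketched as a small filter. keep_by_length is a hypothetical helper written for this illustration; it is not part of the package, and the sequences are toy values.

```r
# Hypothetical helper for illustration only; not a package function.
# Keeps sequences with min_length <= length <= max_length.
# A NULL max_length means no upper bound is applied.
keep_by_length <- function(sequences, min_length = 0, max_length = NULL) {
  lengths <- nchar(sequences)
  keep <- lengths >= min_length
  if (!is.null(max_length)) {
    keep <- keep & lengths <= max_length
  }
  sequences[keep]
}

# Toy sequences of length 3, 8 and 16.
seqs <- c("ATG", "ATGCATGC", "ATGCATGCATGCATGC")

# Both bounds set: only the 8-character sequence passes.
kept_bounded <- keep_by_length(seqs, min_length = 4, max_length = 10)

# No upper bound: the 8- and 16-character sequences pass.
kept_unbounded <- keep_by_length(seqs, min_length = 4)
```

Both bounds are inclusive in this sketch, matching the wording "at least as long as" and "less than or equal in length to".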
Other private: add_rcrd_log, cat_line, char, check_connection, cleanup, connected, connection_get, custom_download2, custom_download, db_sqlngths_get, db_sqlngths_log, dir_size, dwnld_path_get, dwnld_rcrd_log, entrez_fasta_get, entrez_gb_get, extract_accession, extract_by_patterns, extract_clean_sequence, extract_definition, extract_features, extract_inforecpart, extract_keywords, extract_locus, extract_organism, extract_seqrecpart, extract_sequence, extract_version, file_download, filename_log, flatfile_read, gb_build2, gb_build, gb_df_create, gb_sql_add, gb_sql_query, gbrelease_check, gbrelease_get, gbrelease_log, has_data, identify_downloadable_files, last_add_get, last_dwnld_get, last_entry_get, latest_genbank_release_notes, latest_genbank_release, message_missing, mock_def, mock_gb_df_generate, mock_org, mock_rec, mock_seq, predict_datasizes, print.status, quiet_connect, readme_log, restez_path_check, restez_rl, seshinfo_log, setup, slctn_get, slctn_log, sql_path_get, status_class, stat, testdatadir_get