Amazon S3 is a file storage service. Files are stored in "buckets". S3 is not a
filesystem and does not explicitly support directories, but it allows you to
treat a bucket as though it has a directory structure. For example, if you have
a bucket with an item in it called "foo/bar/baz.txt", there is an implicit
"foo/" directory, with an implicit "bar/" directory under that.

If a bucket has many thousands of objects in it, it can take a while to list
the entire contents, and these functions do not do so. These functions support
listing a subset of the bucket contents by a given prefix.
These listings are not recursive, but they provide the information
you need to interactively query the contents of a bucket.
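
To illustrate how a flat set of keys gives rise to implicit directories, and
what a non-recursive listing by prefix looks like, here is a minimal sketch in
Python. It works on an in-memory list of key names rather than a real bucket,
and it is not how these functions are implemented. The function name
list_level and every key other than "foo/bar/baz.txt" are hypothetical.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Return (subdirectories, files) found directly under `prefix`.

    Only one level is reported: anything deeper is collapsed into the
    name of the implicit subdirectory it lives in.
    """
    subdirs, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the prefix itself, nothing to report
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows the delimiter, so `head` is an implicit directory.
            subdirs.add(prefix + head + delimiter)
        else:
            files.append(key)
    return sorted(subdirs), sorted(files)


# Hypothetical bucket contents; S3 itself stores only these flat key names.
keys = ["foo/bar/baz.txt", "foo/readme.txt", "top.txt"]

root = list_level(keys)                 # top level of the bucket
foo = list_level(keys, prefix="foo/")   # one level down, chosen interactively
```

Each call reports only the level directly under the prefix, so walking the
bucket means calling again with one of the returned subdirectory names.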
getBucketUrls is designed specifically for use with the BiocCloud package and
the files in the "1000genomes" buckets. In addition to the bare listing that
listBucket provides, getBucketUrls returns a named list containing a subset of
bucket contents (those files that contain the pattern "ALL.chr[0-9]{1,2}"),
where the names are the chromosomes (e.g. "chr1"). The values in the list are
fully qualified URLs instead of just S3 key names. See the vignette for a use
case that takes advantage of this.
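
The shape of the result described above can be sketched in Python as follows.
This is only an illustration of the filtering and naming step, not the
package's code. The function name bucket_urls, the example key names, and the
"https://<bucket>.s3.amazonaws.com/<key>" URL form are assumptions; the URLs
the real function returns may be formatted differently.

```python
import re

# The pattern from the text, with a capture group added around the
# chromosome part so it can be used as the name of each entry.
CHROM_PATTERN = re.compile(r"ALL.(chr[0-9]{1,2})")


def bucket_urls(bucket, keys):
    """Map chromosome names to URLs for the keys that match the pattern."""
    urls = {}
    for key in keys:
        match = CHROM_PATTERN.search(key)
        if match:
            # Assumed URL form; if several keys share a chromosome,
            # the last one seen wins in this sketch.
            urls[match.group(1)] = f"https://{bucket}.s3.amazonaws.com/{key}"
    return urls


# Hypothetical key names, shaped to match the pattern.
example_keys = [
    "release/ALL.chr1.example.vcf.gz",
    "release/ALL.chr22.example.vcf.gz",
    "release/README.txt",
]

urls = bucket_urls("1000genomes", example_keys)
```

Keys that do not match the pattern, such as the README above, are left out of
the result.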