The dataset contains over 63,000 reported sightings spanning several
decades and includes information on sighting date, geographic location,
duration, narrative comments, and, most importantly for nomiShape,
the reported shape of the observed object.
The shape variable is nominal, with many categories
(e.g., "light", "circle", "triangle", "sphere"), exhibiting strong
dominance by a few common shapes followed by a gradual decline across
rarer categories. Despite the presence of a highly frequent leading
category ("light"), the overall frequency structure is better described
as triangular or normal-like rather than strictly exponential or Pareto.
This dataset is included as a realistic, large-sample example for
exploring dominance, modality, and shape classification of nominal
distributions using visual and information-theoretic tools.
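
As a rough illustration of the dominance and information-theoretic summaries mentioned above, the following Python sketch ranks categories by frequency, computes the share of the leading category, and computes Shannon entropy. It is not nomiShape's own API. The function name `shape_summary` and the toy labels are hypothetical, and the counts are made up for the example rather than taken from the dataset.

```python
from collections import Counter
from math import log2


def shape_summary(shapes):
    """Summarize a nominal variable: ranked counts, leading share, entropy.

    `shapes` is any iterable of category labels (hypothetical helper,
    not part of nomiShape).
    """
    counts = Counter(shapes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("shapes must contain at least one label")

    # Categories sorted from most to least frequent.
    ranked = counts.most_common()
    probs = [n / total for _, n in ranked]

    # Shannon entropy in bits; lower values indicate stronger dominance.
    entropy = -sum(p * log2(p) for p in probs)

    # Normalize by the maximum entropy log2(k) for k categories.
    max_entropy = log2(len(probs)) if len(probs) > 1 else 1.0

    return {
        "ranked": ranked,
        "top_share": probs[0],
        "entropy_bits": entropy,
        "normalized_entropy": entropy / max_entropy,
    }


# Toy, made-up labels for illustration only (NOT the real sighting counts).
toy = ["light"] * 4 + ["circle"] * 2 + ["triangle"] + ["sphere"]
summary = shape_summary(toy)
print(summary["ranked"])
print(summary["top_share"], summary["normalized_entropy"])
```

A normalized entropy near 1 means the categories are close to evenly spread, while a value near 0 means one category dominates. Plotting the ranked counts as a bar chart gives the visual counterpart of the same summary.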