Automatic image annotation (AIA), a highly popular topic in the field of information retrieval research,
has experienced significant progress within the last decade. Yet, the lack of a standardized evaluation
platform tailored to the needs of AIA has hindered effective evaluation of its methods, especially for
region-based AIA. In this paper, we therefore introduce the segmented and annotated IAPR TC-12 benchmark,
an extended resource for the evaluation of AIA methods as well as the analysis of their impact on
multimedia information retrieval. We describe the methodology adopted for the manual segmentation
and annotation of images, and present statistics for the extended collection. The extended collection is
publicly available and can be used to evaluate a variety of tasks in addition to image annotation. We also
propose a soft measure for the evaluation of annotation performance and identify future research areas in
which this extended test collection is likely to make a contribution.