Abstract
Significant progress has been made on the problem of designing objective blind image quality assessment (IQA) models that are consistent with human subjective quality evaluations. However, it is vital to be able to validate the performance of every algorithm on extensive, highly diverse ground truth data. Existing image datasets are limited by their size and by their reliance on simulated distortions and severities, and their human opinion scores are generally collected on a single device with a fixed display resolution and a fixed viewing distance. These limitations motivated us to design and create a new image quality database that contains realistic distortions and diverse artifacts, with images captured using a wide variety of commercial devices. We also designed and implemented a new online crowdsourcing system using Amazon's Mechanical Turk, which we have used to conduct a very large-scale subjective IQA study in which a wide range of diverse observers record their judgments of image quality. Thus far we have collected over 40,000 human judgments on about 1200 naturally distorted images from over 1000 distinct subjects. The study is ongoing, and we plan to collect more than 300,000 subjective judgments overall, which would make it the largest and most comprehensive study of perceptual image quality conducted to date. Furthermore, to study the impact of viewing device and viewing distance on perceptual quality, we have conducted a statistical analysis of the ratings obtained from users who viewed the images on different devices and from different distances. We have also evaluated several IQA algorithms with regard to their ability to reliably predict the visual quality of the images in our growing database. Thus far we have found that existing blind IQA algorithms have significant room for improvement in accurately predicting the quality of images suffering from the diverse real-world distortions contained in our database (Table 1).
Meeting abstract presented at VSS 2014