Objective video quality metrics are designed to be as reliable as the subjective quality assessments on which they are calibrated and validated. However, existing standard methodologies for subjective video quality assessment yield results of low reliability under some conditions. We investigate whether the quality ruler experimental methodology, originally defined for images and shown to be more reliable than, e.g., standard single stimulus (SS) methods, can be adapted to reliably assess the quality of videos. The video quality ruler methodology allows subjects to assess video quality using a set of reference anchor images (the ruler), which together span a wide range of quality but are closely spaced in quality from one another. Subjects are asked to compare the quality of the displayed test video with that of these anchor images, shown on a tablet, and to indicate which of the reference images best matches the test video in quality. As a result, the video quality assessment task is reduced to a set of visual comparisons between video and reference image quality. We describe how to adapt the original quality ruler methodology to video quality assessment, and we compare the proposed methodology with two other widely used experimental methodologies: the single stimulus (SS) and the double stimulus (DS) method. Our results show that the video quality ruler is a reliable method to assess video quality according to a multitude of criteria.
It is important to understand how humans view images and how their behavior is affected by changes in the properties of the viewed images and the task they are given, particularly the task of scoring the image quality (IQ). This is a complex behavior that holds great importance for the field of image-quality research. This work builds upon four years of research spanning three databases studying image-viewing behavior. Using eye-tracking equipment, it was possible to collect information on human viewing behavior for different kinds of stimuli and under different experimental settings. This work performs a cross-analysis of the results from all these databases using state-of-the-art similarity measures. The results clearly show that asking viewers to score the IQ significantly changes their viewing behavior. Muting the color saturation also appears to affect the saliency of the images. However, a change in IQ was not consistently found to modify visual attention deployment, neither under free looking nor during scoring. These results help provide a better understanding of image-viewing behavior under different conditions. They also have important implications for work that collects subjective image-quality scores from human observers.
KEYWORDS: Video, Visualization, Feature selection, Video processing, Video compression, Multimedia, Image quality, Signal processing, Visual process modeling, Web 2.0 technologies
Recently, a lot of effort has been devoted to estimating the Quality of Visual Experience (QoVE) in order to optimize video delivery to the user. For many decades, existing objective metrics mainly focused on estimating the perceived quality of a video, i.e., the extent to which artifacts due to, e.g., compression disrupt the appearance of the video. Other aspects of the visual experience, such as enjoyment of the video content, were, however, neglected. In addition, Mean Opinion Scores were typically targeted, as the prediction of individual quality preferences was deemed too hard a problem. In this paper, we propose a paradigm shift and evaluate the opportunity of predicting individual QoVE preferences, in terms of video enjoyment as well as perceived quality. To do so, we explore the potential of features of different natures to be predictive of a user's specific experience with a video. We thus consider not only features related to the perceptual characteristics of a video, but also to its affective content. Furthermore, we integrate into our framework information about the user and the use context. The results show that effective feature combinations can be identified to estimate the QoVE from the perspective of both enjoyment and perceived quality.
KEYWORDS: Video, Visualization, Micro unmanned aerial vehicles, Video compression, Image quality, Eye, Image compression, Video processing, Video coding, Data processing
In digital video systems, impairments introduced during capture, coding/decoding, delivery, and display may reduce the perceived quality of the visual content. Recent developments in the area of visual quality have focused on trying to incorporate aspects of gaze patterns into the design of visual quality metrics, mostly under the assumption that visual distortions appearing in less salient areas might be less visible and, therefore, less annoying. Most of these studies, however, have considered the presence of a single artifact (e.g. blockiness or blur) impairing the image. In practice, this is not the case, as multiple artifacts may overlap, and their combined appearance may be strong enough to deviate saliency from its natural pattern. In this work, our focus is on measuring the impact of combinations of artifacts on video saliency. For this purpose, we tracked the eye movements of participants in a subjective quality assessment experiment during a free-viewing task and a quality assessment task. Results show that gaze locations change between pristine and impaired videos. These changes seem to be more related to the quality level and content of the videos than to the specific combination of artifacts.
In the past decades, a lot of effort has been invested in predicting the users’ Quality of Visual Experience (QoVE) in
order to optimize online video delivery. So far, the objective approaches to measure QoVE have been mainly based on
an estimation of the visibility of artifacts generated by signal impairments at the moment of delivery and on a prediction
of how annoying these artifacts are to the end user. Recently, it has been shown that other aspects, such as user interest
or viewing context, also have a crucial influence on QoVE. Social context is one of these aspects, but it has been poorly
investigated in relation to QoVE so far. In this paper, we report the outcomes of an experiment that aims at unveiling the
role that social context, and in particular co-located co-viewing, plays within the visual experience and the annoyance of
coding artifacts. The results show that social context significantly influences the user's QoVE, whereas the appearance of
artifacts does not affect the viewing experience, even though users can still notice them. The results suggest that
quantifying the impact of social context on user experience is of major importance for accurately predicting QoVE
towards video delivery optimization.
Manufacturers of commercial display devices continuously try to improve the perceived image quality of their products. By applying postprocessing techniques on the incoming signal, they aim to enhance the quality level perceived by the viewer. These postprocessing techniques are usually applied globally over the whole image but may cause side effects, the visibility and annoyance of which differ with local content characteristics. To better understand and utilize this, a three-phase experiment was conducted where observers were asked to score images that had different levels of quality in their regions of interest and in the background areas. The results show that the region of interest has a greater effect on the overall quality of the image than the background. This effect increases with the increasing quality difference between the two regions. Based on the subjective data, we propose a model to predict the overall quality of images with different quality levels in different regions. This model, which is constructed on an empirical basis, can help craft weighted objective metrics that better approximate subjective quality scores.
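A region-weighted combination of this kind can be sketched as below. The weighting function and its parameters are hypothetical placeholders chosen only to illustrate the reported effect, not the model fitted in the experiment:

```python
def overall_quality(q_roi, q_bg, base_w=0.7, k=0.05):
    """Hypothetical region-weighted quality model.

    Combines region-of-interest quality (q_roi) and background quality
    (q_bg), both on a 1-5 scale, into an overall score. The weight of
    the ROI grows with the quality difference between the two regions,
    mirroring the effect reported in the abstract: the larger the gap,
    the more the ROI dominates the overall impression.
    """
    w = min(1.0, base_w + k * abs(q_roi - q_bg))  # ROI weight grows with the gap
    return w * q_roi + (1.0 - w) * q_bg
```

When both regions have the same quality, the model reduces to that common value; a large gap pulls the overall score toward the region-of-interest quality, as the subjective data suggest.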
KEYWORDS: Visualization, Visibility, Image quality, Visual process modeling, Video, Systems modeling, Multimedia, Electronic imaging, Imaging systems, Iterated function systems
The electronic imaging community has devoted a lot of effort to the development of technologies that can predict the
visual quality of images and videos, as a basis for the delivery of optimal visual quality to the user. These systems have
been based for the most part on a visibility-centric approach, assuming that the more visible the artifacts, the higher the
annoyance they provoke and the lower the visual quality. Despite the remarkable results achieved with this approach,
recently a number of studies suggested that the visibility-centric approach to visual quality might have limitations, and
that other factors might influence the overall quality impression of an image or video, depending on cognitive and
affective mechanisms that work on top of perception. In particular, interest in the visual content, engagement, and context of usage have been found to affect the overall quality impression of the image/video. In this paper, we review these studies and explore the impact that affective and cognitive processes have on visual quality. In addition, as a case study, we present the results of an experiment investigating the impact of aesthetic appeal on visual quality, and we show that users tend to be more demanding in terms of visual quality when judging beautiful images.
Visual quality is a multifaceted quantity that depends on multiple attributes of the image/video. According to Keelan's
definition, artifactual attributes concern features of the image that, when visible, are annoying and compromise the
integrity of the image. Aesthetic attributes instead depend on the observer's personal taste. Both types of attributes have
been studied in the literature in relation to visual quality, but never in conjunction with each other. In this paper we
perform a psychometric experiment to investigate how artifactual and aesthetic attributes interact, and how they affect
the viewing behavior. In particular, we studied to what extent the appearance of artifacts impacts the aesthetic quality of
images. Our results indicate that image integrity does indeed influence the aesthetic quality scores to some extent. By
means of an eye-tracker, we also recorded and analyzed the viewing behavior of our participants while scoring aesthetic
quality. Results reveal that, when scoring aesthetic quality, viewing behavior significantly departs from natural free
looking, as well as from the viewing behavior observed for integrity scoring.
Research has shown that when viewing still images, people will look at these images in a different manner if instructed
to evaluate their quality. They will tend to focus less on the main features of the image and, instead, scan the entire image
area looking for clues about its quality level. It is questionable, however, whether this finding can be extended to videos,
considering their dynamic nature. One can argue that when watching a video the viewer will always focus on the
dynamically changing features of the video regardless of the given task. To test whether this is true, an experiment was
conducted where half of the participants viewed videos with the task of quality evaluation while the other half were
simply told to watch the videos as if they were watching a movie on TV or a video downloaded from the internet. The
videos contained content which was degraded with compression artifacts over a wide range of quality. An eye tracking
device was used to record the viewing behavior in both conditions. By comparing the behavior during each task, it was
possible to observe a systematic difference in the viewing behavior, which seemed to correlate with the quality of the
videos.
Reliably assessing the overall quality of JPEG/JPEG2000 coded images without having the original image as a reference is still challenging, mainly due to our limited understanding of how humans combine the various perceived artifacts into an overall quality judgment. A known approach to avoid the explicit simulation of human assessment of overall quality is the use of a neural network. Neural network approaches usually start by selecting active features from a set of generic image characteristics, a process that is, to some extent, ad hoc and computationally intensive. This paper shows that the complexity of the feature selection procedure can be considerably reduced by using dedicated features that describe a given artifact. The adaptive neural network is then used to learn the highly nonlinear relationship between the features describing an artifact and the overall quality rating. Experimental results show that the simplified feature selection procedure, in combination with the neural network, is indeed able to accurately predict the perceived image quality of JPEG/JPEG2000 coded images.
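The learning setup described above, a small adaptive network mapping an artifact-describing feature to an overall quality rating, can be sketched with a minimal one-hidden-layer regressor. The feature, training data, and network size below are synthetic placeholders, not the features or architecture used in the paper:

```python
import math
import random

random.seed(0)

# Synthetic training data: one artifact feature (e.g. a blockiness
# measure in [0, 1]) mapped to a quality score in [1, 5] through a
# nonlinear, monotonically decreasing relation. Placeholder data only.
data = [(x / 20.0, 1.0 + 4.0 * (1.0 - x / 20.0) ** 2) for x in range(21)]

H = 4  # hidden units
w1 = [random.uniform(-1, 1) for _ in range(H)]  # input -> hidden weights
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]  # hidden -> output weights
b2 = 0.0

def predict(x):
    """Forward pass: tanh hidden layer, linear output (the quality score)."""
    hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(H)]
    return sum(w2[j] * hidden[j] for j in range(H)) + b2

def train(epochs=2000, lr=0.05):
    """Online gradient descent on the squared prediction error."""
    global b2
    for _ in range(epochs):
        for x, t in data:
            hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(H)]
            y = sum(w2[j] * hidden[j] for j in range(H)) + b2
            err = y - t
            for j in range(H):
                grad_h = err * w2[j] * (1.0 - hidden[j] ** 2)  # backprop through tanh
                w2[j] -= lr * err * hidden[j]
                b1[j] -= lr * grad_h
                w1[j] -= lr * grad_h * x
            b2 -= lr * err

train()
```

After training, the network reproduces the decreasing feature-to-quality relation: a heavily blocky input yields a lower predicted score than a clean one. The paper's point is that starting from a few dedicated artifact features, rather than a large generic feature pool, keeps this learning stage small.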
Several attempts to integrate visual saliency information in quality metrics are described in the literature, albeit with
contradictory results. The way saliency is integrated in quality metrics should reflect the mechanisms underlying the
interaction between image quality assessment and visual attention. This interaction is actually two-fold: (1) image
distortions can attract attention away from the Natural Scene Saliency (NSS), and (2) the quality assessment task in itself
can affect the way people look at an image. A subjective study was performed to analyze the deviation in attention from
NSS as a consequence of being asked to assess the quality of distorted images and, in particular, whether, and if so how,
this deviation depended on the kind and/or amount of distortion. Saliency maps were derived from eye-tracking data
obtained while participants scored distorted images, and they were compared to the corresponding NSS, derived from
eye-tracking data obtained while participants freely looked at high-quality images. The study revealed some structural
differences between the NSS maps and those obtained during quality assessment of the distorted images. These
differences were related to the quality level of the images: the lower the quality, the larger the deviation from the NSS.
The main change was identified as a shrinking of the region of interest, most evident at low quality. No evident role of
the kind of distortion in the change in saliency was found. Especially at low quality, the quality assessment task seemed
to prevail over natural attention, forcing it to deviate in order to better evaluate the impact of artifacts.
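Comparing an NSS map with a saliency map recorded during quality scoring reduces to a similarity measure between two fixation-density maps. The sketch below uses Pearson's linear correlation coefficient, one common choice for saliency-map similarity (not necessarily the measure used in this study); the toy maps are purely illustrative:

```python
import math

def pearson_cc(map_a, map_b):
    """Pearson linear correlation between two flattened saliency maps.

    map_a, map_b: equal-length sequences of fixation-density values.
    Returns a value in [-1, 1]; 1 means identical spatial distributions
    up to an affine rescaling of the density values.
    """
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    return cov / math.sqrt(var_a * var_b)

# Toy 2x3 maps, flattened row by row: a "natural" map and one in which
# attention has concentrated on the same peaks (a shrinking region of
# interest, as the abstract reports for low-quality images).
nss      = [0.1, 0.6, 0.1, 0.1, 0.5, 0.1]
task_map = [0.0, 0.9, 0.0, 0.0, 0.8, 0.0]
print(round(pearson_cc(nss, task_map), 3))
```

A drop of this coefficient from 1 toward 0 for lower-quality stimuli would quantify the deviation from NSS described above.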
This paper presents a novel system that employs an adaptive neural network for the no-reference assessment of perceived
quality of JPEG/JPEG2000 coded images. The adaptive neural network simulates the human visual system as a black
box, avoiding its explicit modeling. It uses image features and the corresponding subjective quality score to learn the
unknown relationship between an image and its perceived quality. Related approaches in the literature extract a considerable
number of features to form the input to the neural network. This potentially increases the system's complexity, and
consequently, may affect its prediction accuracy. Our proposed method optimizes the feature-extraction stage by
selecting only the most relevant features, showing that the number of features needed for the neural network can be
largely reduced when gradient-based information is used. Additionally, the proposed method demonstrates that a common
adaptive framework can be used to support the quality estimation for both compression methods. The performance of the
method is evaluated with a publicly available database of images and their quality scores. The results show that our
proposed no-reference method for the quality prediction of JPEG and JPEG2000 coded images performs comparably to
the leading metrics available in the literature, but at considerably lower complexity.
The Single Stimulus (SS) method is often chosen to collect subjective data for testing no-reference objective metrics, as it is
straightforward to implement and well standardized. At the same time, it exhibits some drawbacks: the spread between
different assessors is relatively large, and the measured ratings depend on the quality range spanned by the test samples;
hence, the results from different experiments cannot easily be merged. The Quality Ruler (QR) method has been
proposed to overcome these inconveniences. This paper compares the performance of the SS and QR method for
pictures impaired by Gaussian blur. The research goal is, on one hand, to analyze the advantages and disadvantages of
both methods for quality assessment and, on the other, to make quality data of blur impaired images publicly available.
The obtained results show that the confidence intervals of the QR scores are narrower than those of the SS scores. This
indicates that the QR method enhances consistency across assessors. Moreover, QR scores exhibit a higher linear
correlation with the distortion applied. In summary, for the purpose of building datasets of subjective quality, the QR
approach seems promising from the viewpoint of both consistency and repeatability.
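The consistency comparison above rests on computing, per stimulus, the 95% confidence interval of the mean opinion score. A minimal sketch with synthetic ratings, using the normal-approximation interval 1.96·s/√n (a t-based interval is also common for small subject panels):

```python
import math
import statistics

def mos_ci95(scores):
    """Mean opinion score and the half-width of its 95% confidence
    interval, using the normal approximation 1.96 * s / sqrt(n)."""
    n = len(scores)
    mos = statistics.mean(scores)
    half = 1.96 * statistics.stdev(scores) / math.sqrt(n)
    return mos, half

# Synthetic ratings for one blur level. The tighter spread of the
# QR-like panel directly yields a narrower confidence interval than
# the wider SS-like spread, illustrating the effect reported above.
ss_like = [2, 4, 3, 5, 2, 4, 3, 1, 4, 3]
qr_like = [3, 3, 4, 3, 3, 4, 3, 3, 3, 4]
_, ss_half = mos_ci95(ss_like)
_, qr_half = mos_ci95(qr_like)
```

Narrower intervals at the same panel size mean that fewer assessors are needed to resolve a given quality difference, which is the practical advantage claimed for the QR method.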
Manufacturers of commercial display devices continuously try to improve the perceived image quality of their products.
By applying post-processing techniques on the incoming image signal, they aim to enhance the quality level
perceived by the viewer. Applying such techniques may cause side effects on different portions of the processed image.
In order to apply these techniques effectively to improve the overall quality, it is vital to understand how important
quality is for different parts of the image. To study this effect, a three-phase experiment was conducted where observers
were asked to score images whose salient regions had a different quality level than their background areas.
The results show that the salient region has a greater effect on the overall quality of the image than the background. This
effect increases with the increasing quality difference between the two regions. It is, therefore, important to take this
effect into consideration when trying to enhance the appearance of specific image regions.