20 April 1995 Video and audio data integration for conferencing
Thrasyvoulos N. Pappas, Raynard O. Hinds
Proceedings Volume 2411, Human Vision, Visual Processing, and Digital Display VI; (1995) https://doi.org/10.1117/12.207533
Event: IS&T/SPIE's Symposium on Electronic Imaging: Science and Technology, 1995, San Jose, CA, United States
Abstract
In videoconferencing applications, the perceived quality of the video signal is affected by the presence of an audio signal (speech). To achieve high compression rates, video coders must compromise image quality in terms of spatial resolution, grayscale resolution, and frame rate, and may introduce various kinds of artifacts. We consider tradeoffs in grayscale resolution and frame rate, and use subjective evaluations to assess the perceived quality of the video signal in the presence of speech. In particular, we explore the importance of lip synchronization. In our experiment, we used an original grayscale sequence at QCIF resolution, 30 frames/sec, and 256 gray levels. We compared the 256-level sequence at different frame rates with a two-level version of the sequence at 30 frames/sec. The viewing distance was 20 image heights, or roughly two feet from an SGI workstation. We used uncoded speech. To obtain the two-level sequence, we used an adaptive clustering algorithm for segmentation of video sequences. The binary sketches it creates move smoothly and preserve the main characteristics of the face, so that the face is easily recognizable. More importantly, the rendering of lip and eye movements is very accurate. The test results indicate that when the frame rate of the full grayscale sequence is low (less than 5 frames/sec), most observers prefer the two-level sequence.
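The tradeoff between grayscale resolution and frame rate can be made concrete with a rough calculation of raw data rates. The sketch below is illustrative only and is not taken from the paper: it assumes the standard QCIF luminance size of 176 x 144 pixels, assumes 8 bits/pixel for 256 gray levels and 1 bit/pixel for two levels, and ignores compression entirely. The function names are hypothetical.

```python
# Rough raw (uncompressed) data-rate comparison for the sequences in the abstract.
# Assumptions, not stated in the source: QCIF luminance is 176 x 144 pixels,
# 256 gray levels cost 8 bits/pixel, two levels cost 1 bit/pixel, no coding applied.
QCIF_WIDTH, QCIF_HEIGHT = 176, 144


def raw_bitrate(bits_per_pixel, frames_per_sec):
    """Uncompressed bit rate in bits/sec for one QCIF sequence."""
    return QCIF_WIDTH * QCIF_HEIGHT * bits_per_pixel * frames_per_sec


def equivalent_frame_rate(bits_per_pixel, target_bitrate):
    """Frame rate at which a sequence with the given depth reaches target_bitrate."""
    return target_bitrate / (QCIF_WIDTH * QCIF_HEIGHT * bits_per_pixel)


two_level_30 = raw_bitrate(1, 30)   # two-level sequence at 30 frames/sec
gray_30 = raw_bitrate(8, 30)        # 256-level sequence at 30 frames/sec
gray_5 = raw_bitrate(8, 5)          # 256-level sequence at 5 frames/sec

# Frame rate at which the 256-level sequence matches the two-level raw rate.
matching_fps = equivalent_frame_rate(8, two_level_30)

print(f"two-level @ 30 frames/sec: {two_level_30} bits/sec")
print(f"256-level @ 30 frames/sec: {gray_30} bits/sec")
print(f"256-level @  5 frames/sec: {gray_5} bits/sec")
print(f"256-level frame rate matching two-level @ 30: {matching_fps} frames/sec")
```

Under these assumptions, the two-level sequence at 30 frames/sec carries the same raw data as the 256-level sequence at 3.75 frames/sec. This is only a back-of-the-envelope comparison; actual coded bit rates depend on the coder and are not given in the abstract.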
© (1995) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Thrasyvoulos N. Pappas and Raynard O. Hinds "Video and audio data integration for conferencing", Proc. SPIE 2411, Human Vision, Visual Processing, and Digital Display VI, (20 April 1995); https://doi.org/10.1117/12.207533
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Video
Image segmentation
Laser induced plasma spectroscopy
Video compression
Binary data
Spatial resolution
Temporal resolution