The coding objectives for images and video targeted at machine consumption may differ from those for human consumption. For example, a machine may use only the part of an image or video that an application requests or requires, whereas human consumption requires the whole captured area. In addition, a machine may require only grayscale or a certain light spectrum, whereas human consumption requires the full visible spectrum. To identify an object of interest, a neural-network-based image or video analysis task may be performed; the output of such a task is an identified feature (latent) and an associated descriptor (inference). Depending on the usage, multiple tasks can be performed in parallel or in series, and as the number of identified features increases, the chance that feature areas overlap increases as well. We propose a pipeline for descriptor-based video coding for machines with multiple tasks. The proposed method is expected to increase coding efficiency when multiple tasks are performed, by minimizing redundant encoding of overlapping areas of objects of interest, and to increase utilization and reuse of features by transmitting the inference separately.
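The redundancy argument above can be illustrated with a small sketch. It assumes each task reports its object of interest as an axis-aligned pixel box, which the abstract does not specify. It compares encoding every box separately against encoding the union of the boxes once. The names `Box`, `union_area`, and `redundant_pixels` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed representation: (x0, y0, x1, y1) with half-open pixel ranges, x0 < x1, y0 < y1.
Box = Tuple[int, int, int, int]


def box_area(b: Box) -> int:
    """Pixel count of a single box."""
    x0, y0, x1, y1 = b
    return max(0, x1 - x0) * max(0, y1 - y0)


def union_area(boxes: List[Box]) -> int:
    """Pixel count covered by at least one box (coordinate-compression sweep)."""
    xs = sorted({x for b in boxes for x in (b[0], b[2])})
    ys = sorted({y for b in boxes for y in (b[1], b[3])})
    total = 0
    for xa, xb in zip(xs, xs[1:]):
        for ya, yb in zip(ys, ys[1:]):
            # A grid cell counts once if any box fully contains it.
            if any(b[0] <= xa and xb <= b[2] and b[1] <= ya and yb <= b[3]
                   for b in boxes):
                total += (xb - xa) * (yb - ya)
    return total


def redundant_pixels(boxes: List[Box]) -> int:
    """Pixels that would be encoded more than once if each box were coded separately."""
    return sum(box_area(b) for b in boxes) - union_area(boxes)


# Two tasks whose regions overlap in a 5 x 5 corner.
task_boxes = [(0, 0, 10, 10), (5, 5, 15, 15)]
saved = redundant_pixels(task_boxes)
```

In this toy case the two boxes cover 200 pixels when coded separately but only 175 as a union, so 25 pixels would otherwise be encoded twice. How the actual pipeline represents and merges overlapping feature areas is not stated in the abstract.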
The sub-hologram-based holographic display method is one of the most practical approaches for realizing a large holographic display. However, this method needs a highly accurate, real-time face and eye tracking function to enable precise steering of the backlight and generation of the corresponding sub-hologram for each video frame. We theoretically estimated several parameters required of the eye tracking function, such as accuracy, speed, and distance from the observer, and developed an eye tracking system whose objective is accurate and fast 3D positioning of the observer's left and right pupils. Experimental results show that the system obtains 3D pupil positions with an error of less than 3 mm at 30 frames per second under challenging conditions, such as a distance of more than 2 m and an observer wearing glasses. Our implementation could therefore be applied to a sub-hologram-based display system.
We propose a content-based summary generation method using MPEG-7 metadata. The important events of a video are first defined, and shot boundary detection is then carried out. Next, we analyze the video content within each shot using multiple content features drawn from multiple MPEG-7 descriptors. In experiments with a golf video, we combined the motion activity, edge histogram, and homogeneous texture descriptors for event detection. The extracted segments and key frames of each event are described in an XML document. Experimental results show that the proposed method generates reliable summaries with robust event detection.
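The abstract does not say how shot boundaries are detected. As a minimal sketch, one common approach is to flag a boundary wherever consecutive frames differ by more than a threshold in some per-frame feature vector. The feature vectors, the threshold, and the name `shot_boundaries` below are assumptions for illustration and not the authors' method.

```python
from typing import List, Sequence


def shot_boundaries(frame_features: Sequence[Sequence[float]],
                    threshold: float) -> List[int]:
    """Return indices i where frame i starts a new shot.

    A boundary is declared when the L1 distance between the feature
    vectors of frame i-1 and frame i exceeds `threshold` (assumed rule).
    """
    boundaries = []
    for i in range(1, len(frame_features)):
        prev, cur = frame_features[i - 1], frame_features[i]
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            boundaries.append(i)
    return boundaries


# Toy input: two-bin normalized histograms; the content changes at frame 2.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
cuts = shot_boundaries(frames, threshold=1.0)
```

With these toy histograms the only jump is between frames 1 and 2, so a single boundary is reported at index 2. Gradual transitions such as fades would need a more elaborate rule than this hard threshold.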
We propose a content-based image filtering technique for an information filtering agent system. The technique adopts MPEG-7 content descriptions for image filtering. To verify the usefulness of the proposed method, an image-filtering agent system was developed on the network layer. MPEG-7 texture and color descriptors were employed as the content description, and MPEG-7 encoding of the descriptors was performed immediately after all packets of the image data had been received. Experimental results show that the similarity-filtering ratio of the proposed method is much higher than that of the conventional method, with no cost in network speed.
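Descriptor-based similarity filtering of this kind can be sketched as follows. Plain normalized histograms stand in for the MPEG-7 color and texture descriptors, and an L1 distance with a fixed threshold stands in for the matching rule; neither is taken from the paper. The names `l1_distance` and `filter_similar` are hypothetical.

```python
from typing import Dict, List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two equal-length descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def filter_similar(candidates: Dict[str, Sequence[float]],
                   reference: Sequence[float],
                   threshold: float) -> List[str]:
    """Names of candidate images whose descriptor lies within
    `threshold` of the reference descriptor (assumed filtering rule)."""
    return [name for name, desc in candidates.items()
            if l1_distance(desc, reference) <= threshold]


# Toy descriptors: three-bin normalized histograms.
reference = [0.5, 0.3, 0.2]
incoming = {
    "img_a": [0.5, 0.25, 0.25],  # close to the reference
    "img_b": [0.1, 0.1, 0.8],    # far from the reference
}
matched = filter_similar(incoming, reference, threshold=0.2)
```

Here only `img_a` passes the filter. In an agent running on the network path, the same comparison would run once the descriptors have been computed from the fully received image data, as the abstract describes.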
In this paper, a multimedia database system using MPEG-7 metadata is proposed. A content-based multimedia retrieval system is implemented with the MPEG-7 metadata by means of a data hiding technique. MPEG-7 descriptors and description schemes are hidden in the original data using data hiding and watermarking techniques. The hidden data is used as a query for the multimedia indexing/retrieval system. Color and texture descriptors and their description schemes are used for the MPEG-7 multimedia database. To verify the usefulness of the proposed descriptor for characterizing texture content, computer simulations and experiments with an MPEG-7 image database were performed.
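The idea of carrying descriptor bytes inside the image data itself can be sketched with least-significant-bit (LSB) embedding. This is a deliberately simple stand-in: the abstract does not name the data hiding or watermarking technique used, and LSB embedding is not robust to compression. The names `embed` and `extract` and the flat list of 8-bit pixel values are assumptions for illustration.

```python
from typing import List


def embed(pixels: List[int], payload: bytes) -> List[int]:
    """Hide `payload` in the least significant bits of 8-bit pixel values."""
    # Unpack the payload into bits, most significant bit first.
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("payload does not fit in the cover image")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # overwrite only the lowest bit
    return out


def extract(pixels: List[int], n_bytes: int) -> bytes:
    """Recover `n_bytes` of hidden payload from the pixel LSBs."""
    out = bytearray()
    for k in range(n_bytes):
        byte = 0
        for p in pixels[8 * k: 8 * k + 8]:
            byte = (byte << 1) | (p & 1)
        out.append(byte)
    return bytes(out)


# Toy cover image and a two-byte stand-in for an encoded descriptor.
cover = list(range(64))
descriptor_bytes = b"\x12\xab"
stego = embed(cover, descriptor_bytes)
recovered = extract(stego, len(descriptor_bytes))
```

Each pixel changes by at most one gray level, and the recovered bytes equal the embedded ones. A retrieval system could then read the hidden descriptor from an image and use it directly as the query, which is the role the abstract assigns to the hidden data.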