Video data now pervades people’s daily professional and entertainment activities and places great pressure on Internet bandwidth. It is therefore important to develop effective video coding techniques that compress video data as much as possible and save transmission bandwidth, while still providing visually pleasing decoded videos. In conventional video coding standards such as High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC), signal processing and information theory-based techniques are the mainstream. In recent years, thanks to advances in deep learning, many deep learning-based approaches have emerged for image and video compression. In particular, generative adversarial networks (GANs) have shown superior performance for image compression: the decoded images are usually sharper, present more details than those from purely convolutional neural network (CNN)-based image compression, and are more consistent with the human visual system (HVS). Nevertheless, most existing GAN-based methods target still image compression, and little research has investigated the potential of GANs for video compression. In this work, we propose a novel inter-frame video coding scheme that compresses both reference frames and target (residue) frames with GANs. Since residue signals contain less energy, the proposed method effectively reduces the bit rate; meanwhile, the adversarial learning preserves the perceptual quality of the decoded target frames. The effectiveness of the proposed algorithm is demonstrated by experimental studies on common test video sequences.
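The bit-rate argument above rests on residue signals carrying less energy than raw frames. The following is an illustrative sketch only (not the paper's GAN codec): it shows how a good inter-frame prediction shrinks the energy that remains to be coded. The function name and toy frames are assumptions for demonstration.

```python
import numpy as np

def residue_energy_ratio(target, prediction):
    """Ratio of residue energy to raw-frame energy (lower => more compressible)."""
    residue = target.astype(np.float64) - prediction.astype(np.float64)
    return float(np.sum(residue ** 2) / np.sum(target.astype(np.float64) ** 2))

# Toy frames: the prediction closely tracks the target, so the residue is small
# and carries only a tiny fraction of the original signal energy.
rng = np.random.default_rng(0)
target = rng.integers(0, 256, size=(8, 8))
prediction = target + rng.integers(-2, 3, size=(8, 8))  # near-perfect prediction
ratio = residue_energy_ratio(target, prediction)
```

Coding the residue rather than the raw frame is what makes inter-frame schemes efficient; the GAN in the proposed method is then responsible for reconstructing perceptually pleasing detail from these low-energy residues.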
KEYWORDS: Video compression, Video coding, Video, Visualization, Visual compression, Video processing, Motion estimation, Electronic components, Data storage
Video coding is the process of reducing the huge volume of video data to a small number of bits. High coding efficiency reduces the bandwidth required for video streaming and the space required to store video data on electronic devices, while maintaining the fidelity of the decompressed video signal. In recent years, deep learning has been extensively applied to video coding; however, it remains challenging to exploit intra- and inter-frame correlations in deep learning-based video coding systems to improve coding efficiency. In this work, we propose a hierarchical motion estimation and compensation network for video compression. The video frames are tagged as intra-frames and inter-frames. While intra-frames are compressed independently, inter-frames are hierarchically predicted from adjacent frames by a bi-directional motion prediction network, which yields highly sparse and compressible residues. The residue frames are then compressed by separately trained residue coding networks. Experimental results demonstrate that the proposed hierarchical deep video compression network offers significantly higher coding efficiency and superior visual quality compared to prior art.
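A minimal sketch of the hierarchical structure described above, under assumptions not stated in the abstract: a dyadic coding order within a group of pictures, and frame averaging as a stand-in for the paper's learned bi-directional motion prediction network (which the averaging below does not model).

```python
import numpy as np

def bidirectional_predict(prev_frame, next_frame):
    """Toy bi-directional prediction: average the two already-coded neighbors."""
    return (prev_frame.astype(np.float64) + next_frame.astype(np.float64)) / 2.0

def hierarchical_order(first, last):
    """Dyadic coding order for frames first..last: intra-frames at the ends
    are coded first, then the midpoint, then the quarter points, and so on,
    so every inter-frame has two already-coded references."""
    order = [first, last]
    def recurse(lo, hi):
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        order.append(mid)
        recurse(lo, mid)
        recurse(mid, hi)
    recurse(first, last)
    return order

order = hierarchical_order(0, 8)   # e.g. a 9-frame group of pictures
pred = bidirectional_predict(np.zeros((2, 2)), np.full((2, 2), 2.0))
```

Because each inter-frame is predicted from its two nearest coded neighbors, the prediction error (residue) tends to be sparse, which is what the separately trained residue coding networks exploit.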
This paper presents an innovative method for aligning two handwritten Chinese characters modeled as three-dimensional point sets. A piecewise cubic Bézier curve is first defined for each character stroke, interpolating the 2D sample points. The density of sample points is then adjusted along the curve for an even distribution, and all 2D points are extended into 3D space to integrate additional geometric information. Registration is then performed between the two 3D point sets using a Gaussian mixture model (GMM): the registration is established by minimizing the Euclidean (L2) distance between the two GMMs, and the resulting transformation is applied accordingly. The key idea of this process is to include extra geometric information about the calligraphy to facilitate the registration with simple algorithms. Through the alignment and transformation process, the global layout of the character is adjusted while local features are retained. This technique can be applied to the morphing and beautification of Chinese calligraphy.
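The L2 distance between two GMMs built from point sets has a closed form when the components are isotropic Gaussians, since the integral of a product of two Gaussians is itself a Gaussian evaluated at the difference of the means. The sketch below illustrates that quantity for equal-weight, shared-bandwidth components; the specific model choices (one Gaussian per point, a single bandwidth `sigma`) are assumptions for illustration, not necessarily the paper's exact formulation.

```python
import numpy as np

def gauss_overlap(a, b, var, dim=3):
    """Pairwise Gaussian product integrals: N(a_i - b_j; 0, var * I) in `dim`-D."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    coef = (2.0 * np.pi * var) ** (-dim / 2.0)
    return coef * np.exp(-d2 / (2.0 * var))

def gmm_l2_distance(points_a, points_b, sigma=1.0):
    """L2 distance between equal-weight isotropic GMMs centered at the points."""
    var = 2.0 * sigma ** 2  # sum of the two components' variances
    na, nb = len(points_a), len(points_b)
    aa = gauss_overlap(points_a, points_a, var).sum() / (na * na)
    bb = gauss_overlap(points_b, points_b, var).sum() / (nb * nb)
    ab = gauss_overlap(points_a, points_b, var).sum() / (na * nb)
    return aa + bb - 2.0 * ab  # integral of (f - g)^2 expanded term by term

pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
d_same = gmm_l2_distance(pts, pts)          # identical sets => distance 0
d_shift = gmm_l2_distance(pts, pts + 0.5)   # shifted copy => positive distance
```

A registration method would minimize this distance over a family of transformations (e.g. affine) applied to one point set; the closed form makes the objective smooth and cheap to evaluate.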